| 1 | [[PageOutline]] |
| 2 | |
| 3 | ◢ <[wiki:TREND120929/Lab2 Lab 2]> | <[wiki:TREND120929 Back to course outline]> ▲ | <[wiki:TREND120929/Lab4 Lab 4]> ◣ |
| 4 | |
| 5 | = Lab 3 = |
| 6 | |
| 7 | {{{ |
| 8 | #!html |
| 9 | <div style="text-align: center;"><big style="font-weight: bold;"><big>HDFS Local Mode in Practice</big></big></div> |
| 10 | }}} |
| 11 | |
| 12 | == 0. Starting Hadoop4Win == |
| 13 | |
| 14 | * STEP 1: From the Start menu, open the following shortcuts in order. |
| 15 | * [[BR]][[Image(Hadoop4Win:hadoop4win-installer_11.jpg)]] |
| 16 | * STEP 2: First, click start-hadoop to start the Hadoop services (they run in a separate CMD window). |
| 17 | * '''Note''': Startup is complete only once you see "Safe Mode is OFF". |
| 18 | * [[BR]][[Image(Hadoop4Win:hadoop4win_29.jpg,width=800)]] |
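| 19 | * STEP 3: Next, click NameNode Web UI to open http://localhost:50070 in a browser and confirm that the NameNode is up. The page should look like this: |
| 20 | * '''Note''': There must be one Live Node for the setup to count as healthy. |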
| 21 | * [[BR]][[Image(Hadoop4Win:hadoop4win_10.jpg,width=800)]] |
| 22 | * STEP 4: Then click JobTracker Web UI to open http://localhost:50030 in a browser and confirm that the JobTracker is up. The page should look like this: |
| 23 | * '''Note''': The state must be RUNNING. |
| 24 | * [[BR]][[Image(Hadoop4Win:hadoop4win_11.jpg,width=800)]] |
| 25 | * STEP 5: Finally, click hadoop4win to open the hadoop4win Cygwin window, where you will type the commands that follow. |
| 26 | * [[BR]][[Image(Hadoop4Win:hadoop4win_20.jpg,width=800)]] |
| 27 | |
| 28 | == 1. HDFS Command Practice == |
| 29 | |
| 30 | === 1.1 Browsing your HDFS directory === |
| 31 | |
| 32 | * First, use hadoop fs -ls to browse your HDFS directory. |
| 33 | {{{ |
| 34 | Jazz@human ~ |
| 35 | $ hadoop fs -ls |
| 36 | Found 1 items |
| 37 | drwxr-xr-x - Jazz supergroup 0 2011-09-14 12:50 /user/Jazz/tmp |
| 38 | }}} |
| 39 | |
| 40 | === 1.2 Uploading data to HDFS === |
| 41 | |
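| 42 | * Next, let's practice uploading data to HDFS. Here /opt/hadoop/conf is the source directory and /user/${username}/input is the destination directory. |
| 43 | * '''Note''': The Windows build of Hadoop runs inside Cygwin, but Cygwin paths are virtual and the JRE (Java Runtime Environment) only understands Windows paths. If you see an error like the one below, use cygpath -w to convert the Cygwin path to a Windows path. |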
| 44 | {{{ |
| 45 | Jazz@human ~ |
| 46 | $ hadoop fs -put /opt/hadoop/conf input |
| 47 | put: File /opt/hadoop/conf does not exist. |
| 48 | Jazz@human ~ |
| 49 | $ hadoop fs -put $(cygpath -w /opt/hadoop/conf) input |
| 50 | }}} |
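The conversion can be wrapped in a small helper so that the same command line works both inside and outside Cygwin. This is only a sketch: `to_java_path` is a hypothetical name, not part of Hadoop or Hadoop4Win, and it assumes that `cygpath` is found on the PATH only when running under Cygwin.

```shell
# Hypothetical helper (not part of Hadoop): print a path that the JRE can open.
# Under Cygwin, cygpath -w rewrites the virtual path as a Windows path;
# where cygpath is not installed, the path is printed unchanged.
to_java_path() {
  if command -v cygpath >/dev/null 2>&1; then
    cygpath -w "$1"
  else
    printf '%s\n' "$1"
  fi
}

# Intended use (needs a running Hadoop, so it is left as a comment):
#   hadoop fs -put "$(to_java_path /opt/hadoop/conf)" input
```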
| 51 | |
| 52 | * Use hadoop fs -ls to check the files you just uploaded. |
| 53 | {{{ |
| 54 | Jazz@human ~ |
| 55 | $ hadoop fs -ls |
| 56 | Found 2 items |
| 57 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 11:45 /user/Jazz/input |
| 58 | drwxr-xr-x - Jazz supergroup 0 2011-09-14 12:50 /user/Jazz/tmp |
| 59 | |
| 60 | Jazz@human ~ |
| 61 | $ hadoop fs -ls input |
| 62 | Found 13 items |
| 63 | -rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 11:45 /user/Jazz/input/capacity-scheduler.xml |
| 64 | -rw-r--r-- 1 Jazz supergroup 535 2011-10-21 11:45 /user/Jazz/input/configuration.xsl |
| 65 | -rw-r--r-- 1 Jazz supergroup 326 2011-10-21 11:45 /user/Jazz/input/core-site.xml |
| 66 | -rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 11:45 /user/Jazz/input/hadoop-env.sh |
| 67 | -rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 11:45 /user/Jazz/input/hadoop-metrics.properties |
| 68 | -rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 11:45 /user/Jazz/input/hadoop-policy.xml |
| 69 | -rw-r--r-- 1 Jazz supergroup 196 2011-10-21 11:45 /user/Jazz/input/hdfs-site.xml |
| 70 | -rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 11:45 /user/Jazz/input/log4j.properties |
| 71 | -rw-r--r-- 1 Jazz supergroup 212 2011-10-21 11:45 /user/Jazz/input/mapred-site.xml |
| 72 | -rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/masters |
| 73 | -rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/slaves |
| 74 | -rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 11:45 /user/Jazz/input/ssl-client.xml.example |
| 75 | -rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 11:45 /user/Jazz/input/ssl-server.xml.example |
| 76 | }}} |
| 77 | |
| 78 | === 1.3 Downloading data from HDFS to a local directory === |
| 79 | |
| 80 | * Next, let's practice downloading data from HDFS to a local directory from the command line. |
| 81 | {{{ |
| 82 | Jazz@human ~ |
| 83 | $ hadoop fs -get input fromHDFS |
| 84 | }}} |
| 85 | |
| 86 | * You can use diff to check that the downloaded content matches what you uploaded; no output means the two directories are identical. |
| 87 | {{{ |
| 88 | Jazz@human ~ |
| 89 | $ diff -Naur fromHDFS/ /opt/hadoop/conf |
| 90 | }}} |
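Because diff prints nothing when the two trees match, its exit status is the easier thing to check in a script. A minimal sketch, in which `same_tree` is a hypothetical helper name:

```shell
# Hypothetical helper: succeed (exit status 0) when two local directory
# trees have identical contents, fail otherwise. diff exits with 0 when
# it finds no differences; -r recurses and -q suppresses the details.
same_tree() {
  diff -r -q "$1" "$2" >/dev/null 2>&1
}

# Intended use after the download step (left as a comment):
#   same_tree fromHDFS /opt/hadoop/conf && echo "upload and download match"
```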
| 91 | |
| 92 | === 1.4 Deleting files on HDFS === |
| 93 | |
| 94 | * Use hadoop fs -rm to delete a single file on HDFS. |
| 95 | {{{ |
| 96 | Jazz@human ~ |
| 97 | $ hadoop fs -rm input/masters |
| 98 | Deleted hdfs://localhost:9000/user/Jazz/input/masters |
| 99 | }}} |
| 100 | * To delete a directory, use hadoop fs -rmr instead. |
| 101 | {{{ |
| 102 | Jazz@human ~ |
| 103 | $ hadoop fs -rmr tmp |
| 104 | Deleted hdfs://localhost:9000/user/Jazz/tmp |
| 105 | }}} |
| 106 | |
| 107 | === 1.5 Dumping the contents of a file on HDFS === |
| 108 | |
| 109 | * If you only want to look at the contents of a file on HDFS, use hadoop fs -cat to dump it. |
| 110 | {{{ |
| 111 | Jazz@human ~ |
| 112 | $ hadoop fs -cat input/slaves |
| 113 | localhost |
| 114 | }}} |
| 115 | |
| 116 | === 1.6 More HDFS commands === |
| 117 | |
| 118 | * To list every command the HDFS shell supports, run hadoop fs with no arguments: |
| 119 | {{{ |
| 120 | Jazz@human ~ |
| 121 | $ hadoop fs |
| 122 | Usage: java FsShell |
| 123 | [-ls <path>] |
| 124 | [-lsr <path>] |
| 125 | [-du <path>] |
| 126 | [-dus <path>] |
| 127 | [-count[-q] <path>] |
| 128 | [-mv <src> <dst>] |
| 129 | [-cp <src> <dst>] |
| 130 | [-rm [-skipTrash] <path>] |
| 131 | [-rmr [-skipTrash] <path>] |
| 132 | [-expunge] |
| 133 | [-put <localsrc> ... <dst>] |
| 134 | [-copyFromLocal <localsrc> ... <dst>] |
| 135 | [-moveFromLocal <localsrc> ... <dst>] |
| 136 | [-get [-ignoreCrc] [-crc] <src> <localdst>] |
| 137 | [-getmerge <src> <localdst> [addnl]] |
| 138 | [-cat <src>] |
| 139 | [-text <src>] |
| 140 | [-copyToLocal [-ignoreCrc] [-crc] <src> <localdst>] |
| 141 | [-moveToLocal [-crc] <src> <localdst>] |
| 142 | [-mkdir <path>] |
| 143 | [-setrep [-R] [-w] <rep> <path/file>] |
| 144 | [-touchz <path>] |
| 145 | [-test -[ezd] <path>] |
| 146 | [-stat [format] <path>] |
| 147 | [-tail [-f] <file>] |
| 148 | [-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...] |
| 149 | [-chown [-R] [OWNER][:[GROUP]] PATH...] |
| 150 | [-chgrp [-R] GROUP PATH...] |
| 151 | [-help [cmd]] |
| 152 | |
| 153 | Generic options supported are |
| 154 | -conf <configuration file> specify an application configuration file |
| 155 | -D <property=value> use value for given property |
| 156 | -fs <local|namenode:port> specify a namenode |
| 157 | -jt <local|jobtracker:port> specify a job tracker |
| 158 | -files <comma separated list of files> specify comma separated files to be copied to the map reduce cluster |
| 159 | -libjars <comma separated list of jars> specify comma separated jar files to include in the classpath. |
| 160 | -archives <comma separated list of archives> specify comma separated archives to be unarchived on the compute machines. |
| 161 | |
| 162 | The general command line syntax is |
| 163 | bin/hadoop command [genericOptions] [commandOptions] |
| 164 | }}} |
| 165 | |
| 166 | == 2. Browsing HDFS through the web interface == |
| 167 | |
| 168 | * You can also open the [http://localhost:50070 NameNode] page to inspect the files you just uploaded, along with their Block Size, File Size, Block Location, and Rack Location. |
| 169 | * [[BR]][[Image(Hadoop4Win:hadoop4win_30.jpg,width=800)]] |
| 170 | * [[BR]][[Image(Hadoop4Win:hadoop4win_31.jpg,width=800)]] |
| 171 | * [[BR]][[Image(Hadoop4Win:hadoop4win_32.jpg,width=800)]] |
| 172 | |
| 173 | == 3. More HDFS shell usage == |
| 174 | |
| 175 | === -ls === |
| 176 | |
| 177 | * By default, -ls operates under /user/${username}/, so a path without a leading slash is a "relative path" resolved against /user/${username}. |
| 178 | {{{ |
| 179 | Jazz@human ~ |
| 180 | $ hadoop fs -ls input |
| 181 | Found 12 items |
| 182 | -rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 11:45 /user/Jazz/input/capacity-scheduler.xml |
| 183 | -rw-r--r-- 1 Jazz supergroup 535 2011-10-21 11:45 /user/Jazz/input/configuration.xsl |
| 184 | -rw-r--r-- 1 Jazz supergroup 326 2011-10-21 11:45 /user/Jazz/input/core-site.xml |
| 185 | -rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 11:45 /user/Jazz/input/hadoop-env.sh |
| 186 | -rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 11:45 /user/Jazz/input/hadoop-metrics.properties |
| 187 | -rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 11:45 /user/Jazz/input/hadoop-policy.xml |
| 188 | -rw-r--r-- 1 Jazz supergroup 196 2011-10-21 11:45 /user/Jazz/input/hdfs-site.xml |
| 189 | -rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 11:45 /user/Jazz/input/log4j.properties |
| 190 | -rw-r--r-- 1 Jazz supergroup 212 2011-10-21 11:45 /user/Jazz/input/mapred-site.xml |
| 191 | -rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/slaves |
| 192 | -rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 11:45 /user/Jazz/input/ssl-client.xml.example |
| 193 | -rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 11:45 /user/Jazz/input/ssl-server.xml.example |
| 194 | }}} |
| 195 | * You can also give a "full path" in the form '''hdfs://node:port/path'''. |
| 196 | {{{ |
| 197 | Jazz@human ~ |
| 198 | $ hadoop fs -ls hdfs://localhost:9000/user/${USER}/input |
| 199 | Found 12 items |
| 200 | -rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 11:45 /user/Jazz/input/capacity-scheduler.xml |
| 201 | -rw-r--r-- 1 Jazz supergroup 535 2011-10-21 11:45 /user/Jazz/input/configuration.xsl |
| 202 | -rw-r--r-- 1 Jazz supergroup 326 2011-10-21 11:45 /user/Jazz/input/core-site.xml |
| 203 | -rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 11:45 /user/Jazz/input/hadoop-env.sh |
| 204 | -rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 11:45 /user/Jazz/input/hadoop-metrics.properties |
| 205 | -rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 11:45 /user/Jazz/input/hadoop-policy.xml |
| 206 | -rw-r--r-- 1 Jazz supergroup 196 2011-10-21 11:45 /user/Jazz/input/hdfs-site.xml |
| 207 | -rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 11:45 /user/Jazz/input/log4j.properties |
| 208 | -rw-r--r-- 1 Jazz supergroup 212 2011-10-21 11:45 /user/Jazz/input/mapred-site.xml |
| 209 | -rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/slaves |
| 210 | -rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 11:45 /user/Jazz/input/ssl-client.xml.example |
| 211 | -rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 11:45 /user/Jazz/input/ssl-server.xml.example |
| 212 | }}} |
| 213 | |
| 214 | === -cat === |
| 215 | |
| 216 | * Writes the contents of the file at the given path to standard output (STDOUT). |
| 217 | {{{ |
| 218 | Jazz@human ~ |
| 219 | $ hadoop fs -cat input/slaves |
| 220 | localhost |
| 221 | }}} |
| 222 | |
| 223 | === -chgrp === |
| 224 | |
| 225 | * Changes the group that a file belongs to. |
| 226 | {{{ |
| 227 | Jazz@human ~ |
| 228 | $ hadoop fs -ls input/slaves |
| 229 | Found 1 items |
| 230 | -rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/slaves |
| 231 | |
| 232 | Jazz@human ~ |
| 233 | $ hadoop fs -chgrp ${USERNAME} input/slaves |
| 234 | |
| 235 | Jazz@human ~ |
| 236 | $ hadoop fs -ls input/slaves |
| 237 | Found 1 items |
| 238 | -rw-r--r-- 1 Jazz Jazz 10 2011-10-21 11:45 /user/Jazz/input/slaves |
| 239 | }}} |
| 240 | |
| 241 | === -chmod === |
| 242 | |
| 243 | * Changes the permissions of a file. |
| 244 | {{{ |
| 245 | Jazz@human ~ |
| 246 | $ hadoop fs -ls input/slaves |
| 247 | Found 1 items |
| 248 | -rw-r--r-- 1 Jazz Jazz 10 2011-10-21 11:45 /user/Jazz/input/slaves |
| 249 | |
| 250 | Jazz@human ~ |
| 251 | $ hadoop fs -chmod 700 input/slaves |
| 252 | |
| 253 | Jazz@human ~ |
| 254 | $ hadoop fs -ls input/slaves |
| 255 | Found 1 items |
| 256 | -rw------- 1 Jazz Jazz 10 2011-10-21 11:45 /user/Jazz/input/slaves |
| 257 | }}} |
| 258 | |
| 259 | === -chown === |
| 260 | |
| 261 | * Changes the owner of a file. |
| 262 | {{{ |
| 263 | Jazz@human ~ |
| 264 | $ hadoop fs -chown hadoop input/slaves |
| 265 | |
| 266 | Jazz@human ~ |
| 267 | $ hadoop fs -ls input/slaves |
| 268 | Found 1 items |
| 269 | -rw------- 1 hadoop Jazz 10 2011-10-21 11:45 /user/Jazz/input/slaves |
| 270 | }}} |
| 271 | |
| 272 | === -copyFromLocal, -put === |
| 273 | |
| 274 | * Uploads files from the local machine to HDFS. |
| 275 | {{{ |
| 276 | Jazz@human ~ |
| 277 | $ hadoop fs -put fromHDFS dfs_input |
| 278 | |
| 279 | Jazz@human ~ |
| 280 | $ hadoop fs -ls |
| 281 | Found 2 items |
| 282 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input |
| 283 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input |
| 284 | }}} |
| 285 | |
| 286 | === -copyToLocal, -get === |
| 287 | |
| 288 | * Downloads files from HDFS to the local machine. |
| 289 | {{{ |
| 290 | Jazz@human ~ |
| 291 | $ hadoop fs -get dfs_input input1 |
| 292 | }}} |
| 293 | |
| 294 | === -cp === |
| 295 | |
| 296 | * Copies a file from a source path on HDFS to a destination path on HDFS. |
| 297 | {{{ |
| 298 | Jazz@human ~ |
| 299 | $ hadoop fs -cp dfs_input input1 |
| 300 | |
| 301 | Jazz@human ~ |
| 302 | $ hadoop fs -ls |
| 303 | Found 3 items |
| 304 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input |
| 305 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input |
| 306 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1 |
| 307 | }}} |
| 308 | |
| 309 | === -du === |
| 310 | |
| 311 | * Shows the size of every file in a directory. |
| 312 | {{{ |
| 313 | Jazz@human ~ |
| 314 | $ hadoop fs -du input |
| 315 | Found 12 items |
| 316 | 3936 hdfs://localhost:9000/user/Jazz/input/capacity-scheduler.xml |
| 317 | 535 hdfs://localhost:9000/user/Jazz/input/configuration.xsl |
| 318 | 326 hdfs://localhost:9000/user/Jazz/input/core-site.xml |
| 319 | 2409 hdfs://localhost:9000/user/Jazz/input/hadoop-env.sh |
| 320 | 1245 hdfs://localhost:9000/user/Jazz/input/hadoop-metrics.properties |
| 321 | 4190 hdfs://localhost:9000/user/Jazz/input/hadoop-policy.xml |
| 322 | 196 hdfs://localhost:9000/user/Jazz/input/hdfs-site.xml |
| 323 | 2815 hdfs://localhost:9000/user/Jazz/input/log4j.properties |
| 324 | 212 hdfs://localhost:9000/user/Jazz/input/mapred-site.xml |
| 325 | 10 hdfs://localhost:9000/user/Jazz/input/slaves |
| 326 | 1243 hdfs://localhost:9000/user/Jazz/input/ssl-client.xml.example |
| 327 | 1195 hdfs://localhost:9000/user/Jazz/input/ssl-server.xml.example |
| 328 | }}} |
| 329 | |
| 330 | === -dus === |
| 331 | |
| 332 | * Shows the total size of a directory or file. |
| 333 | {{{ |
| 334 | Jazz@human ~ |
| 335 | $ hadoop fs -dus input |
| 336 | hdfs://localhost:9000/user/Jazz/input 18312 |
| 337 | }}} |
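The total reported by -dus should equal the sum of the per-file sizes reported by -du. As a sanity check, the first column of the -du output can be added up with awk. This is a sketch: `sum_du_sizes` is a hypothetical helper, and it assumes the two-column layout shown in the -du section (size first, then path).

```shell
# Hypothetical helper: read `hadoop fs -du` output on stdin and print the
# sum of the size column. Lines whose first field is not a number, such as
# the "Found 12 items" header, are skipped.
sum_du_sizes() {
  awk '$1 ~ /^[0-9]+$/ { total += $1 } END { printf "%d\n", total }'
}

# Intended use (needs a running Hadoop, so it is left as a comment):
#   hadoop fs -du input | sum_du_sizes
```

For the -du listing in the previous section, the twelve sizes add up to 18312, the same figure that -dus prints.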
| 338 | |
| 339 | === -expunge === |
| 340 | |
| 341 | * Empties the trash. |
| 342 | {{{ |
| 343 | Jazz@human ~ |
| 344 | $ hadoop fs -expunge |
| 345 | }}} |
| 346 | |
| 347 | === -getmerge === |
| 348 | |
| 349 | * Concatenates every file under the source directory <src> into a single local file <localdst>. |
| 350 | * Syntax: hadoop fs -getmerge <src> <localdst> |
| 351 | {{{ |
| 352 | Jazz@human ~ |
| 353 | $ mkdir -p in1 |
| 354 | |
| 355 | Jazz@human ~ |
| 356 | $ echo "this is one; " > in1/input |
| 357 | |
| 358 | Jazz@human ~ |
| 359 | $ echo "this is two; " > in1/input2 |
| 360 | |
| 361 | Jazz@human ~ |
| 362 | $ hadoop fs -put in1 in1 |
| 363 | |
| 364 | Jazz@human ~ |
| 365 | $ hadoop fs -getmerge in1 merge.txt |
| 366 | |
| 367 | Jazz@human ~ |
| 368 | $ cat ./merge.txt |
| 369 | this is one; |
| 370 | this is two; |
| 371 | }}} |
| 372 | |
| 373 | === -ls === |
| 374 | |
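| 375 | * Lists information about a file or directory. |
| 376 | * For a file, the columns are: permissions, replication factor, owner, group, file size, modification date, modification time, path |
| 377 | * For a directory, the columns are the same, with - in place of the replication factor and 0 as the size |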
| 378 | {{{ |
| 379 | Jazz@human ~ |
| 380 | $ hadoop fs -ls |
| 381 | Found 3 items |
| 382 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input |
| 383 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input |
| 384 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1 |
| 385 | }}} |
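When a script only needs the paths, the last column of the listing can be extracted with awk. A minimal sketch, assuming the eight-column layout shown in the output above and paths that contain no spaces; `ls_paths` is a hypothetical helper name.

```shell
# Hypothetical helper: read `hadoop fs -ls` output on stdin and print only
# the path column. Entry lines have 8 whitespace-separated fields, so the
# "Found N items" header (3 fields) is skipped.
ls_paths() {
  awk 'NF == 8 { print $8 }'
}

# Intended use (needs a running Hadoop, so it is left as a comment):
#   hadoop fs -ls | ls_paths
```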
| 386 | |
| 387 | === -lsr === |
| 388 | |
| 389 | * The recursive version of ls. |
| 390 | {{{ |
| 391 | Jazz@human ~ |
| 392 | $ hadoop fs -lsr |
| 393 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input |
| 394 | -rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 12:33 /user/Jazz/dfs_input/capacity-scheduler.xml |
| 395 | -rw-r--r-- 1 Jazz supergroup 535 2011-10-21 12:33 /user/Jazz/dfs_input/configuration.xsl |
| 396 | -rw-r--r-- 1 Jazz supergroup 326 2011-10-21 12:33 /user/Jazz/dfs_input/core-site.xml |
| 397 | -rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 12:33 /user/Jazz/dfs_input/hadoop-env.sh |
| 398 | -rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 12:33 /user/Jazz/dfs_input/hadoop-metrics.properties |
| 399 | -rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 12:33 /user/Jazz/dfs_input/hadoop-policy.xml |
| 400 | -rw-r--r-- 1 Jazz supergroup 196 2011-10-21 12:33 /user/Jazz/dfs_input/hdfs-site.xml |
| 401 | -rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 12:33 /user/Jazz/dfs_input/log4j.properties |
| 402 | -rw-r--r-- 1 Jazz supergroup 212 2011-10-21 12:33 /user/Jazz/dfs_input/mapred-site.xml |
| 403 | -rw-r--r-- 1 Jazz supergroup 10 2011-10-21 12:33 /user/Jazz/dfs_input/masters |
| 404 | -rw-r--r-- 1 Jazz supergroup 10 2011-10-21 12:33 /user/Jazz/dfs_input/slaves |
| 405 | -rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 12:33 /user/Jazz/dfs_input/ssl-client.xml.example |
| 406 | -rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 12:33 /user/Jazz/dfs_input/ssl-server.xml.example |
| 407 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:40 /user/Jazz/in1 |
| 408 | -rw-r--r-- 1 Jazz supergroup 14 2011-10-21 12:40 /user/Jazz/in1/input |
| 409 | -rw-r--r-- 1 Jazz supergroup 14 2011-10-21 12:40 /user/Jazz/in1/input2 |
| 410 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input |
| 411 | -rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 11:45 /user/Jazz/input/capacity-scheduler.xml |
| 412 | -rw-r--r-- 1 Jazz supergroup 535 2011-10-21 11:45 /user/Jazz/input/configuration.xsl |
| 413 | -rw-r--r-- 1 Jazz supergroup 326 2011-10-21 11:45 /user/Jazz/input/core-site.xml |
| 414 | -rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 11:45 /user/Jazz/input/hadoop-env.sh |
| 415 | -rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 11:45 /user/Jazz/input/hadoop-metrics.properties |
| 416 | -rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 11:45 /user/Jazz/input/hadoop-policy.xml |
| 417 | -rw-r--r-- 1 Jazz supergroup 196 2011-10-21 11:45 /user/Jazz/input/hdfs-site.xml |
| 418 | -rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 11:45 /user/Jazz/input/log4j.properties |
| 419 | -rw-r--r-- 1 Jazz supergroup 212 2011-10-21 11:45 /user/Jazz/input/mapred-site.xml |
| 420 | -rw------- 1 hadoop Jazz 10 2011-10-21 11:45 /user/Jazz/input/slaves |
| 421 | -rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 11:45 /user/Jazz/input/ssl-client.xml.example |
| 422 | -rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 11:45 /user/Jazz/input/ssl-server.xml.example |
| 423 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1 |
| 424 | -rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 12:34 /user/Jazz/input1/capacity-scheduler.xml |
| 425 | -rw-r--r-- 1 Jazz supergroup 535 2011-10-21 12:34 /user/Jazz/input1/configuration.xsl |
| 426 | -rw-r--r-- 1 Jazz supergroup 326 2011-10-21 12:34 /user/Jazz/input1/core-site.xml |
| 427 | -rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 12:34 /user/Jazz/input1/hadoop-env.sh |
| 428 | -rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 12:34 /user/Jazz/input1/hadoop-metrics.properties |
| 429 | -rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 12:34 /user/Jazz/input1/hadoop-policy.xml |
| 430 | -rw-r--r-- 1 Jazz supergroup 196 2011-10-21 12:34 /user/Jazz/input1/hdfs-site.xml |
| 431 | -rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 12:34 /user/Jazz/input1/log4j.properties |
| 432 | -rw-r--r-- 1 Jazz supergroup 212 2011-10-21 12:34 /user/Jazz/input1/mapred-site.xml |
| 433 | -rw-r--r-- 1 Jazz supergroup 10 2011-10-21 12:34 /user/Jazz/input1/masters |
| 434 | -rw-r--r-- 1 Jazz supergroup 10 2011-10-21 12:34 /user/Jazz/input1/slaves |
| 435 | -rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 12:34 /user/Jazz/input1/ssl-client.xml.example |
| 436 | -rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 12:34 /user/Jazz/input1/ssl-server.xml.example |
| 437 | }}} |
| 438 | === -mkdir === |
| 439 | |
| 440 | * Creates a directory. |
| 441 | {{{ |
| 442 | Jazz@human ~ |
| 443 | $ hadoop fs -mkdir tmp |
| 444 | Jazz@human ~ |
| 445 | $ hadoop fs -ls |
| 446 | Found 5 items |
| 447 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input |
| 448 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:40 /user/Jazz/in1 |
| 449 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input |
| 450 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1 |
| 451 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:43 /user/Jazz/tmp |
| 452 | }}} |
| 453 | |
| 454 | === -moveFromLocal === |
| 455 | |
| 456 | * Moves a local directory to HDFS (the local copy is removed). |
| 457 | {{{ |
| 458 | Jazz@human ~ |
| 459 | $ hadoop fs -moveFromLocal in1 in2 |
| 460 | Jazz@human ~ |
| 461 | $ hadoop fs -ls |
| 462 | Found 6 items |
| 463 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input |
| 464 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:40 /user/Jazz/in1 |
| 465 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:44 /user/Jazz/in2 |
| 466 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input |
| 467 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1 |
| 468 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:43 /user/Jazz/tmp |
| 469 | }}} |
| 470 | |
| 471 | === -mv === |
| 472 | |
| 473 | * Renames a file or directory. |
| 474 | {{{ |
| 475 | Jazz@human ~ |
| 476 | $ hadoop fs -mv in2 in3 |
| 477 | |
| 478 | Jazz@human ~ |
| 479 | $ hadoop fs -ls |
| 480 | Found 6 items |
| 481 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input |
| 482 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:40 /user/Jazz/in1 |
| 483 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:44 /user/Jazz/in3 |
| 484 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input |
| 485 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1 |
| 486 | drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:43 /user/Jazz/tmp |
| 487 | }}} |
| 488 | |
| 489 | === -rm === |
| 490 | |
| 491 | * Deletes the given file (it cannot be a directory). |
| 492 | {{{ |
| 493 | Jazz@human ~ |
| 494 | $ hadoop fs -rm in1/input |
| 495 | Deleted hdfs://localhost:9000/user/Jazz/in1/input |
| 496 | }}} |
| 497 | |
| 498 | === -rmr === |
| 499 | |
| 500 | * Recursively deletes a directory and everything in it; several directories can be given at once. |
| 501 | {{{ |
| 502 | Jazz@human ~ |
| 503 | $ hadoop fs -rmr dfs_input in1 in3 input1 |
| 504 | Deleted hdfs://localhost:9000/user/Jazz/dfs_input |
| 505 | Deleted hdfs://localhost:9000/user/Jazz/in1 |
| 506 | Deleted hdfs://localhost:9000/user/Jazz/in3 |
| 507 | Deleted hdfs://localhost:9000/user/Jazz/input1 |
| 508 | }}} |
| 509 | |
| 510 | === -setrep === |
| 511 | |
| 512 | * Sets the replication factor. |
| 513 | * Syntax: hadoop fs -setrep [-R] [-w] <rep> <path/file> |
| 514 | {{{ |
| 515 | Jazz@human ~ |
| 516 | $ hadoop fs -setrep -w 1 -R input |
| 517 | Replication 1 set: hdfs://localhost:9000/user/Jazz/input/capacity-scheduler.xml |
| 518 | Replication 1 set: hdfs://localhost:9000/user/Jazz/input/configuration.xsl |
| 519 | Replication 1 set: hdfs://localhost:9000/user/Jazz/input/core-site.xml |
| 520 | Replication 1 set: hdfs://localhost:9000/user/Jazz/input/hadoop-env.sh |
| 521 | Replication 1 set: hdfs://localhost:9000/user/Jazz/input/hadoop-metrics.properties |
| 522 | Replication 1 set: hdfs://localhost:9000/user/Jazz/input/hadoop-policy.xml |
| 523 | Replication 1 set: hdfs://localhost:9000/user/Jazz/input/hdfs-site.xml |
| 524 | Replication 1 set: hdfs://localhost:9000/user/Jazz/input/log4j.properties |
| 525 | Replication 1 set: hdfs://localhost:9000/user/Jazz/input/mapred-site.xml |
| 526 | Replication 1 set: hdfs://localhost:9000/user/Jazz/input/slaves |
| 527 | Replication 1 set: hdfs://localhost:9000/user/Jazz/input/ssl-client.xml.example |
| 528 | Replication 1 set: hdfs://localhost:9000/user/Jazz/input/ssl-server.xml.example |
| 529 | Waiting for hdfs://localhost:9000/user/Jazz/input/capacity-scheduler.xml ... done |
| 530 | Waiting for hdfs://localhost:9000/user/Jazz/input/configuration.xsl ... done |
| 531 | Waiting for hdfs://localhost:9000/user/Jazz/input/core-site.xml ... done |
| 532 | Waiting for hdfs://localhost:9000/user/Jazz/input/hadoop-env.sh ... done |
| 533 | Waiting for hdfs://localhost:9000/user/Jazz/input/hadoop-metrics.properties ...done |
| 534 | Waiting for hdfs://localhost:9000/user/Jazz/input/hadoop-policy.xml ... done |
| 535 | Waiting for hdfs://localhost:9000/user/Jazz/input/hdfs-site.xml ... done |
| 536 | Waiting for hdfs://localhost:9000/user/Jazz/input/log4j.properties ... done |
| 537 | Waiting for hdfs://localhost:9000/user/Jazz/input/mapred-site.xml ... done |
| 538 | Waiting for hdfs://localhost:9000/user/Jazz/input/slaves ... done |
| 539 | Waiting for hdfs://localhost:9000/user/Jazz/input/ssl-client.xml.example ... done |
| 540 | Waiting for hdfs://localhost:9000/user/Jazz/input/ssl-server.xml.example ... done |
| 541 | $ bin/hadoop fs -setrep -w 2 -R input |
| 542 | Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/1.txt |
| 543 | Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/2.txt |
| 544 | Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/3.txt |
| 545 | Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/4.txt |
| 546 | Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/1.txt ... done |
| 547 | Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/2.txt ... done |
| 548 | Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/3.txt ... done |
| 549 | Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/4.txt ... done |
| 550 | }}} |
| 551 | |
| 552 | === -stat === |
| 553 | |
| 554 | * Prints time information for the given path. |
| 555 | {{{ |
| 556 | Jazz@human ~ |
| 557 | $ hadoop fs -stat input |
| 558 | 2011-10-21 04:00:44 |
| 559 | }}} |
| 560 | |
| 561 | === -tail === |
| 562 | |
| 563 | * Prints the last 1 KB of a file. |
| 564 | * Usage: hadoop fs -tail [-f] <file> (with -f, content appended to the file as it grows is shown as well) |
| 565 | {{{ |
| 566 | Jazz@human ~ |
| 567 | $ hadoop fs -tail input/log4j.properties |
| 568 | g4j.RollingFileAppender |
| 569 | #log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file} |
| 570 | |
| 571 | # Logfile size and and 30-day backups |
| 572 | #log4j.appender.RFA.MaxFileSize=1MB |
| 573 | #log4j.appender.RFA.MaxBackupIndex=30 |
| 574 | |
| 575 | #log4j.appender.RFA.layout=org.apache.log4j.PatternLayout |
| 576 | #log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} - %m%n |
| 577 | #log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) |
| 578 | - %m%n |
| 579 | |
| 580 | # |
| 581 | # FSNamesystem Audit logging |
| 582 | # All audit events are logged at INFO level |
| 583 | # |
| 584 | log4j.logger.org.apache.hadoop.fs.FSNamesystem.audit=WARN |
| 585 | |
| 586 | # Custom Logging levels |
| 587 | |
| 588 | #log4j.logger.org.apache.hadoop.mapred.JobTracker=DEBUG |
| 589 | #log4j.logger.org.apache.hadoop.mapred.TaskTracker=DEBUG |
| 590 | #log4j.logger.org.apache.hadoop.fs.FSNamesystem=DEBUG |
| 591 | |
| 592 | # Jets3t library |
| 593 | log4j.logger.org.jets3t.service.impl.rest.httpclient.RestS3Service=ERROR |
| 594 | |
| 595 | # |
| 596 | # Event Counter Appender |
| 597 | # Sends counts of logging messages at different severity levels to Hadoop Metric |
| 598 | s. |
| 599 | # |
| 600 | log4j.appender.EventCounter=org.apache.hadoop.metrics.jvm.EventCounter |
| 601 | }}} |
| 602 | |
| 603 | === -test === |
| 604 | |
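| 605 | * Tests a path: -e checks whether it exists, -z checks whether the file has zero length, -d checks whether it is a directory. In each case an exit status of 0 means true and 1 means false. |
| 606 | * Use echo $? to see whether the exit status is 0 or 1. |
| 607 | * Usage: hadoop fs -test -[ezd] URI |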
| 608 | {{{ |
| 609 | ########## -e tests whether the file exists: exit status 0 means true, 1 means false ########## |
| 610 | |
| 611 | Jazz@human ~ |
| 612 | $ hadoop fs -test -e input/slaves |
| 613 | |
| 614 | Jazz@human ~ |
| 615 | $ echo $? |
| 616 | 0 |
| 617 | |
| 618 | Jazz@human ~ |
| 619 | $ hadoop fs -test -e input/masters |
| 620 | |
| 621 | Jazz@human ~ |
| 622 | $ echo $? |
| 623 | 1 |
| 624 | |
| 625 | ########## -z tests whether the file size is zero: exit status 0 means true, 1 means false ########## |
| 626 | |
| 627 | Jazz@human ~ |
| 628 | $ hadoop fs -test -z input/slaves |
| 629 | |
| 630 | Jazz@human ~ |
| 631 | $ echo $? |
| 632 | 1 |
| 633 | |
| 634 | Jazz@human ~ |
| 635 | $ hadoop fs -test -z input/masters |
| 636 | test: File does not exist: input/masters |
| 637 | |
| 638 | ########## -d tests whether the path is a directory: exit status 0 means true, 1 means false ########## |
| 639 | |
| 640 | Jazz@human ~ |
| 641 | $ hadoop fs -test -d input/slaves |
| 642 | |
| 643 | Jazz@human ~ |
| 644 | $ echo $? |
| 645 | 1 |
| 646 | |
| 647 | Jazz@human ~ |
| 648 | $ hadoop fs -test -d input |
| 649 | |
| 650 | Jazz@human ~ |
| 651 | $ echo $? |
| 652 | 0 |
| 653 | |
| 654 | }}} |
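Because -test reports its answer only through the exit status, it fits naturally into shell conditionals. A minimal sketch: `hdfs_check` is a hypothetical wrapper name, and it assumes the hadoop command is on the PATH as in the session above.

```shell
# Hypothetical wrapper: run `hadoop fs -test` with one of the flags e, z, d
# and print "true" when the exit status is 0, "false" otherwise.
hdfs_check() {
  if hadoop fs -test "-$1" "$2"; then
    echo "true"
  else
    echo "false"
  fi
}

# Intended use (needs a running Hadoop, so it is left as a comment):
#   hdfs_check e input/slaves    # does the file exist?
#   hdfs_check d input           # is the path a directory?
```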
| 655 | |
| 656 | === -text === |
| 657 | |
| 658 | * Prints a file (for example a compressed file or a TextRecordInputStream) as plain text. |
| 659 | * Syntax: hadoop fs -text <src> |
| 660 | {{{ |
| 661 | Jazz@human ~ |
| 662 | $ tar zcvf input.tar.gz input1 |
| 663 | input1/ |
| 664 | input1/capacity-scheduler.xml |
| 665 | input1/configuration.xsl |
| 666 | input1/core-site.xml |
| 667 | input1/hadoop-env.sh |
| 668 | input1/hadoop-metrics.properties |
| 669 | input1/hadoop-policy.xml |
| 670 | input1/hdfs-site.xml |
| 671 | input1/log4j.properties |
| 672 | input1/mapred-site.xml |
| 673 | input1/masters |
| 674 | input1/slaves |
| 675 | input1/ssl-client.xml.example |
| 676 | input1/ssl-server.xml.example |
| 677 | Jazz@human ~ |
| 678 | $ hadoop fs -put input.tar.gz . |
| 679 | Jazz@human ~ |
| 680 | $ hadoop fs -text input.tar.gz |
| 681 | <output omitted> |
| 682 | }}} |
| 683 | * Note: there is currently no library support for zip. |
| 684 | {{{ |
| 685 | Jazz@human ~ |
| 686 | $ zip -r input1.zip input1/ |
| 687 | updating: input1/ (stored 0%) |
| 688 | adding: input1/capacity-scheduler.xml (deflated 71%) |
| 689 | adding: input1/configuration.xsl (deflated 50%) |
| 690 | adding: input1/core-site.xml (deflated 46%) |
| 691 | adding: input1/hadoop-env.sh (deflated 58%) |
| 692 | adding: input1/hadoop-metrics.properties (deflated 78%) |
| 693 | adding: input1/hadoop-policy.xml (deflated 83%) |
| 694 | adding: input1/hdfs-site.xml (deflated 35%) |
| 695 | adding: input1/log4j.properties (deflated 67%) |
| 696 | adding: input1/mapred-site.xml (deflated 34%) |
| 697 | adding: input1/masters (stored 0%) |
| 698 | adding: input1/slaves (stored 0%) |
| 699 | adding: input1/ssl-client.xml.example (deflated 79%) |
| 700 | adding: input1/ssl-server.xml.example (deflated 78%) |
| 701 | Jazz@human ~ |
| 702 | $ hadoop fs -put input1.zip . |
| 703 | Jazz@human ~ |
| 704 | $ hadoop fs -text input1.zip |
| 705 | PK |
| 706 | <output omitted> |
| 707 | }}} |
| 708 | |
| 709 | === -touchz === |
| 710 | |
| 711 | * Creates an empty file. |
| 712 | {{{ |
| 713 | Jazz@human ~ |
| 714 | $ hadoop fs -touchz empty |
| 715 | |
| 716 | Jazz@human ~ |
| 717 | $ hadoop fs -test -z empty ; echo $? |
| 718 | 0 |
| 719 | }}} |