Lab 3
HDFS Single-Node Practice
HDFS local mode in Practice
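0. Start Hadoop4Win
- STEP 1: In the Start menu, click the following shortcuts in order.
- STEP 2: First, click start-hadoop to start the Hadoop services (they run in a separate CMD window).
- STEP 3: Next, click NameNode Web UI to open http://localhost:50070 in a browser, and confirm that the NameNode has started and the page loads correctly.
- STEP 4: Then click JobTracker Web UI to open http://localhost:50030 in a browser, and confirm that the JobTracker has started and the page loads correctly.
- STEP 5: Finally, click hadoop4win to open the hadoop4win Cygwin window, where you will type the commands that follow.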
1. HDFS Command Practice
1.1 Browse your HDFS directory
- First, use the hadoop fs -ls command to browse your HDFS directory.
Jazz@human ~
$ hadoop fs -ls
Found 1 items
drwxr-xr-x - Jazz supergroup 0 2011-09-14 12:50 /user/Jazz/tmp
1.2 Upload data to an HDFS directory
- Next, let's practice uploading data to an HDFS directory. Here we use /opt/hadoop/conf as the source directory and /user/${username}/input as the destination directory.
- Note: the Windows version of Hadoop runs inside Cygwin, but Cygwin paths are virtual paths, and the JRE (Java Runtime Environment) only understands Windows directory paths. If you see an error message like the one below, use cygpath -w to convert the Cygwin path to a Windows path.
Jazz@human ~
$ hadoop fs -put /opt/hadoop/conf input
put: File /opt/hadoop/conf does not exist.

Jazz@human ~
$ hadoop fs -put $(cygpath -w /opt/hadoop/conf) input
- Use hadoop fs -ls to check the files you just uploaded.
Jazz@human ~
$ hadoop fs -ls
Found 2 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 11:45 /user/Jazz/input
drwxr-xr-x - Jazz supergroup 0 2011-09-14 12:50 /user/Jazz/tmp

Jazz@human ~
$ hadoop fs -ls input
Found 13 items
-rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 11:45 /user/Jazz/input/capacity-scheduler.xml
-rw-r--r-- 1 Jazz supergroup 535 2011-10-21 11:45 /user/Jazz/input/configuration.xsl
-rw-r--r-- 1 Jazz supergroup 326 2011-10-21 11:45 /user/Jazz/input/core-site.xml
-rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 11:45 /user/Jazz/input/hadoop-env.sh
-rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 11:45 /user/Jazz/input/hadoop-metrics.properties
-rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 11:45 /user/Jazz/input/hadoop-policy.xml
-rw-r--r-- 1 Jazz supergroup 196 2011-10-21 11:45 /user/Jazz/input/hdfs-site.xml
-rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 11:45 /user/Jazz/input/log4j.properties
-rw-r--r-- 1 Jazz supergroup 212 2011-10-21 11:45 /user/Jazz/input/mapred-site.xml
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/masters
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/slaves
-rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 11:45 /user/Jazz/input/ssl-client.xml.example
-rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 11:45 /user/Jazz/input/ssl-server.xml.example
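The path conversion described in the note above can be wrapped in a small helper, so that the same upload command works whether or not cygpath is present. This is only a minimal sketch: the function names to_java_path and hdfs_put are made up for illustration and are not part of Hadoop or Hadoop4Win.

```shell
# Convert a Cygwin path to a Windows path when cygpath is available;
# otherwise print the path unchanged (for example on Linux).
to_java_path() {
    if command -v cygpath >/dev/null 2>&1; then
        cygpath -w "$1"
    else
        printf '%s\n' "$1"
    fi
}

# Hypothetical wrapper: convert the local source path, then upload it.
# Usage: hdfs_put /opt/hadoop/conf input
hdfs_put() {
    hadoop fs -put "$(to_java_path "$1")" "$2"
}
```

Quoting the command substitution keeps a converted path that contains spaces as a single argument.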
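1.3 Download HDFS data to a local directory
- Next, let's practice downloading data from HDFS to a local directory from the command line.
Jazz@human ~
$ hadoop fs -get input fromHDFS
- Use the diff command to check that the downloaded content matches what you uploaded.
Jazz@human ~
$ diff -Naur fromHDFS/ /opt/hadoop/conf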
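The upload, download, and diff steps above can be combined into one check. The sketch below is an illustration only: hdfs_roundtrip_check is a hypothetical name, all three arguments are placeholders chosen by the caller, and on Cygwin the local paths may still need the cygpath -w conversion described in 1.2.

```shell
# Upload a local directory to HDFS, download it again, and compare both copies.
# Usage: hdfs_roundtrip_check <local_src> <hdfs_dir> <local_copy>
hdfs_roundtrip_check() {
    hadoop fs -put "$1" "$2" || return 1
    hadoop fs -get "$2" "$3" || return 1
    # diff exits with 0 only when the two directory trees are identical
    if diff -Naur "$3" "$1" >/dev/null; then
        echo "identical"
    else
        echo "different"
        return 1
    fi
}
```

For example, hdfs_roundtrip_check /opt/hadoop/conf input fromHDFS would repeat the steps of 1.2 and 1.3, provided that neither the HDFS directory input nor the local directory fromHDFS exists yet.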
1.4 Delete files on HDFS
- Use hadoop fs -rm to delete a single file on HDFS.
Jazz@human ~
$ hadoop fs -rm input/masters
Deleted hdfs://localhost:9000/user/Jazz/input/masters
- To delete a directory, use hadoop fs -rmr instead.
Jazz@human ~
$ hadoop fs -rmr tmp
Deleted hdfs://localhost:9000/user/Jazz/tmp
1.5 Dump the contents of a file on HDFS
- If you only want to view the contents of a file on HDFS, use hadoop fs -cat to dump them.
Jazz@human ~
$ hadoop fs -cat input/slaves
localhost
1.6 More HDFS commands
- To list every command that HDFS supports, run hadoop fs without arguments:
Jazz@human ~
$ hadoop fs
Usage: java FsShell
           [-ls <path>]
           [-lsr <path>]
           [-du <path>]
           [-dus <path>]
           [-count[-q] <path>]
           [-mv <src> <dst>]
           [-cp <src> <dst>]
           [-rm [-skipTrash] <path>]
           [-rmr [-skipTrash] <path>]
           [-expunge]
           [-put <localsrc> ... <dst>]
           [-copyFromLocal <localsrc> ... <dst>]
           [-moveFromLocal <localsrc> ... <dst>]
           [-get [-ignoreCrc] [-crc] <src> <localdst>]
           [-getmerge <src> <localdst> [addnl]]
           [-cat <src>]
           [-text <src>]
           [-copyToLocal [-ignoreCrc] [-crc] <src> <localdst>]
           [-moveToLocal [-crc] <src> <localdst>]
           [-mkdir <path>]
           [-setrep [-R] [-w] <rep> <path/file>]
           [-touchz <path>]
           [-test -[ezd] <path>]
           [-stat [format] <path>]
           [-tail [-f] <file>]
           [-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
           [-chown [-R] [OWNER][:[GROUP]] PATH...]
           [-chgrp [-R] GROUP PATH...]
           [-help [cmd]]

Generic options supported are
-conf <configuration file>     specify an application configuration file
-D <property=value>            use value for given property
-fs <local|namenode:port>      specify a namenode
-jt <local|jobtracker:port>    specify a job tracker
-files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.

The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
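2. Browse HDFS content through the web interface
- You can also open the NameNode web page to look up the files you just uploaded, along with their Block Size, File Size, Block Location, Rack Location, and similar information.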
3. More HDFS shell usage
-ls
- By default, -ls operates under /user/${username}/. In other words, the path you give is a relative path, resolved against /user/${username}.
Jazz@human ~
$ hadoop fs -ls input
Found 12 items
-rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 11:45 /user/Jazz/input/capacity-scheduler.xml
-rw-r--r-- 1 Jazz supergroup 535 2011-10-21 11:45 /user/Jazz/input/configuration.xsl
-rw-r--r-- 1 Jazz supergroup 326 2011-10-21 11:45 /user/Jazz/input/core-site.xml
-rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 11:45 /user/Jazz/input/hadoop-env.sh
-rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 11:45 /user/Jazz/input/hadoop-metrics.properties
-rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 11:45 /user/Jazz/input/hadoop-policy.xml
-rw-r--r-- 1 Jazz supergroup 196 2011-10-21 11:45 /user/Jazz/input/hdfs-site.xml
-rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 11:45 /user/Jazz/input/log4j.properties
-rw-r--r-- 1 Jazz supergroup 212 2011-10-21 11:45 /user/Jazz/input/mapred-site.xml
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/slaves
-rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 11:45 /user/Jazz/input/ssl-client.xml.example
-rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 11:45 /user/Jazz/input/ssl-server.xml.example
- You can also give a full path, in the form hdfs://node:port/path.
Jazz@human ~
$ hadoop fs -ls hdfs://localhost:9000/user/${USER}/input
Found 12 items
-rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 11:45 /user/Jazz/input/capacity-scheduler.xml
-rw-r--r-- 1 Jazz supergroup 535 2011-10-21 11:45 /user/Jazz/input/configuration.xsl
-rw-r--r-- 1 Jazz supergroup 326 2011-10-21 11:45 /user/Jazz/input/core-site.xml
-rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 11:45 /user/Jazz/input/hadoop-env.sh
-rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 11:45 /user/Jazz/input/hadoop-metrics.properties
-rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 11:45 /user/Jazz/input/hadoop-policy.xml
-rw-r--r-- 1 Jazz supergroup 196 2011-10-21 11:45 /user/Jazz/input/hdfs-site.xml
-rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 11:45 /user/Jazz/input/log4j.properties
-rw-r--r-- 1 Jazz supergroup 212 2011-10-21 11:45 /user/Jazz/input/mapred-site.xml
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/slaves
-rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 11:45 /user/Jazz/input/ssl-client.xml.example
-rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 11:45 /user/Jazz/input/ssl-server.xml.example
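How a relative path maps to a full one can be sketched in plain shell. The function name hdfs_full_uri is hypothetical, the NameNode address is taken from the examples in this lab, and the home directory is assumed to be /user/ followed by the login name; none of this queries a running cluster.

```shell
# Assumed NameNode address, as used in the examples of this lab.
NAMENODE="hdfs://localhost:9000"

# Print the full URI for an HDFS path.
# Relative paths are resolved against /user/<login name>.
hdfs_full_uri() {
    case "$1" in
        hdfs://*) printf '%s\n' "$1" ;;                 # already a full URI
        /*)       printf '%s%s\n' "$NAMENODE" "$1" ;;   # absolute HDFS path
        *)        printf '%s/user/%s/%s\n' "$NAMENODE" "${USER:-$(id -un)}" "$1" ;;
    esac
}

hdfs_full_uri input
```

For the user Jazz, the last line prints hdfs://localhost:9000/user/Jazz/input, the same location that both -ls commands above list.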
-cat
- Writes the contents of the file at the given path to standard output (STDOUT).
Jazz@human ~
$ hadoop fs -cat input/slaves
localhost
-chgrp
- Changes the group that a file belongs to.
Jazz@human ~
$ hadoop fs -ls input/slaves
Found 1 items
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/slaves

Jazz@human ~
$ hadoop fs -chgrp ${USERNAME} input/slaves

Jazz@human ~
$ hadoop fs -ls input/slaves
Found 1 items
-rw-r--r-- 1 Jazz Jazz 10 2011-10-21 11:45 /user/Jazz/input/slaves
-chmod
- Changes the permissions of a file.
Jazz@human ~
$ hadoop fs -ls input/slaves
Found 1 items
-rw-r--r-- 1 Jazz Jazz 10 2011-10-21 11:45 /user/Jazz/input/slaves

Jazz@human ~
$ hadoop fs -chmod 600 input/slaves

Jazz@human ~
$ hadoop fs -ls input/slaves
Found 1 items
-rw------- 1 Jazz Jazz 10 2011-10-21 11:45 /user/Jazz/input/slaves
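In an octal mode passed to -chmod, each digit encodes read (4), write (2), and execute (1) for the owner, the group, and others, in that order. The sketch below shows which permission string a three-digit mode corresponds to. The function names are hypothetical, and the output omits the leading file-type character that -ls prints.

```shell
# Translate one octal digit (0-7) into its rwx triplet.
digit_to_rwx() {
    case "$1" in
        0) echo "---" ;; 1) echo "--x" ;; 2) echo "-w-" ;; 3) echo "-wx" ;;
        4) echo "r--" ;; 5) echo "r-x" ;; 6) echo "rw-" ;; 7) echo "rwx" ;;
        *) return 1 ;;
    esac
}

# Translate a three-digit octal mode such as 600 into a nine-character string.
# The body runs in a subshell so the helper variables stay private.
mode_to_string() (
    owner=${1%??}      # first digit
    rest=${1#?}
    group=${rest%?}    # second digit
    others=${1#??}     # third digit
    printf '%s%s%s\n' "$(digit_to_rwx "$owner")" "$(digit_to_rwx "$group")" "$(digit_to_rwx "$others")"
)

mode_to_string 600   # prints rw-------
mode_to_string 700   # prints rwx------
mode_to_string 644   # prints rw-r--r--
```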
-chown
- Changes the owner of a file.
Jazz@human ~
$ hadoop fs -chown hadoop input/slaves

Jazz@human ~
$ hadoop fs -ls input/slaves
Found 1 items
-rw------- 1 hadoop Jazz 10 2011-10-21 11:45 /user/Jazz/input/slaves
-copyFromLocal, -put
- Uploads files from the local machine to HDFS.
Jazz@human ~
$ hadoop fs -put fromHDFS dfs_input

Jazz@human ~
$ hadoop fs -ls
Found 2 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
-copyToLocal, -get
- Downloads files from HDFS to the local machine.
Jazz@human ~
$ hadoop fs -get dfs_input input1
-cp
- Copies files from a source path on HDFS to a destination path on HDFS.
Jazz@human ~
$ hadoop fs -cp dfs_input input1

Jazz@human ~
$ hadoop fs -ls
Found 3 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1
-du
- Shows the size of every file in a directory.
Jazz@human ~
$ hadoop fs -du input
Found 12 items
3936 hdfs://localhost:9000/user/Jazz/input/capacity-scheduler.xml
535 hdfs://localhost:9000/user/Jazz/input/configuration.xsl
326 hdfs://localhost:9000/user/Jazz/input/core-site.xml
2409 hdfs://localhost:9000/user/Jazz/input/hadoop-env.sh
1245 hdfs://localhost:9000/user/Jazz/input/hadoop-metrics.properties
4190 hdfs://localhost:9000/user/Jazz/input/hadoop-policy.xml
196 hdfs://localhost:9000/user/Jazz/input/hdfs-site.xml
2815 hdfs://localhost:9000/user/Jazz/input/log4j.properties
212 hdfs://localhost:9000/user/Jazz/input/mapred-site.xml
10 hdfs://localhost:9000/user/Jazz/input/slaves
1243 hdfs://localhost:9000/user/Jazz/input/ssl-client.xml.example
1195 hdfs://localhost:9000/user/Jazz/input/ssl-server.xml.example
-dus
- Shows the total size of a directory or file.
Jazz@human ~
$ hadoop fs -dus input
hdfs://localhost:9000/user/Jazz/input 18312
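The total reported by -dus should equal the sum of the per-file sizes reported by -du. A minimal sketch of that check follows, using a hypothetical helper named sum_du and fed with the -du lines shown in this lab so that it runs without a cluster.

```shell
# Sum the first column of "hadoop fs -du" output read from stdin.
# Lines whose first field is not a number (such as "Found 12 items") are skipped.
sum_du() {
    awk '$1 ~ /^[0-9]+$/ { total += $1 } END { print total + 0 }'
}

# Against a live cluster: hadoop fs -du input | sum_du
# Offline demonstration with the sizes listed by -du in this lab:
sum_du <<'EOF'
Found 12 items
3936 hdfs://localhost:9000/user/Jazz/input/capacity-scheduler.xml
535 hdfs://localhost:9000/user/Jazz/input/configuration.xsl
326 hdfs://localhost:9000/user/Jazz/input/core-site.xml
2409 hdfs://localhost:9000/user/Jazz/input/hadoop-env.sh
1245 hdfs://localhost:9000/user/Jazz/input/hadoop-metrics.properties
4190 hdfs://localhost:9000/user/Jazz/input/hadoop-policy.xml
196 hdfs://localhost:9000/user/Jazz/input/hdfs-site.xml
2815 hdfs://localhost:9000/user/Jazz/input/log4j.properties
212 hdfs://localhost:9000/user/Jazz/input/mapred-site.xml
10 hdfs://localhost:9000/user/Jazz/input/slaves
1243 hdfs://localhost:9000/user/Jazz/input/ssl-client.xml.example
1195 hdfs://localhost:9000/user/Jazz/input/ssl-server.xml.example
EOF
```

The demonstration prints 18312, the same value that -dus reports for the input directory.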
-expunge
- Empties the trash.
Jazz@human ~
$ hadoop fs -expunge
-getmerge
- Merges all files under the source directory <src> into a single local file <localdst>.
- Syntax: hadoop fs -getmerge <src> <localdst>
Jazz@human ~
$ mkdir -p in1

Jazz@human ~
$ echo "this is one; " > in1/input

Jazz@human ~
$ echo "this is two; " > in1/input2

Jazz@human ~
$ hadoop fs -put in1 in1

Jazz@human ~
$ hadoop fs -getmerge in1 merge.txt

Jazz@human ~
$ cat ./merge.txt
this is one;
this is two;
-ls
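- Lists information about files or directories.
- For a file, the columns are: permissions, number of replicas, owner, group, file size, modification date, modification time, path
- For a directory, the columns are the same, except that the replica column shows - and the size is 0
Jazz@human ~
$ hadoop fs -ls
Found 3 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1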
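When the listing is used in a script, the columns can be picked apart with awk. The sketch below assumes the eight-column layout visible in the output above (permissions, replicas, owner, group, size, date, time, path) and paths without spaces. The name parse_ls is hypothetical.

```shell
# Print "kind replicas size path" for every entry of "hadoop fs -ls" output on stdin.
# Header lines such as "Found 3 items" do not have eight fields and are skipped.
parse_ls() {
    awk 'NF == 8 {
        kind = (substr($1, 1, 1) == "d") ? "dir" : "file"
        print kind, $2, $5, $8
    }'
}

# Against a live cluster: hadoop fs -ls input | parse_ls
# Offline demonstration with two lines taken from this lab:
parse_ls <<'EOF'
Found 2 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/slaves
EOF
```

The demonstration prints "dir - 0 /user/Jazz/input" followed by "file 1 10 /user/Jazz/input/slaves".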
-lsr
- The recursive version of the ls command.
Jazz@human ~
$ hadoop fs -lsr
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
-rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 12:33 /user/Jazz/dfs_input/capacity-scheduler.xml
-rw-r--r-- 1 Jazz supergroup 535 2011-10-21 12:33 /user/Jazz/dfs_input/configuration.xsl
-rw-r--r-- 1 Jazz supergroup 326 2011-10-21 12:33 /user/Jazz/dfs_input/core-site.xml
-rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 12:33 /user/Jazz/dfs_input/hadoop-env.sh
-rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 12:33 /user/Jazz/dfs_input/hadoop-metrics.properties
-rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 12:33 /user/Jazz/dfs_input/hadoop-policy.xml
-rw-r--r-- 1 Jazz supergroup 196 2011-10-21 12:33 /user/Jazz/dfs_input/hdfs-site.xml
-rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 12:33 /user/Jazz/dfs_input/log4j.properties
-rw-r--r-- 1 Jazz supergroup 212 2011-10-21 12:33 /user/Jazz/dfs_input/mapred-site.xml
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 12:33 /user/Jazz/dfs_input/masters
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 12:33 /user/Jazz/dfs_input/slaves
-rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 12:33 /user/Jazz/dfs_input/ssl-client.xml.example
-rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 12:33 /user/Jazz/dfs_input/ssl-server.xml.example
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:40 /user/Jazz/in1
-rw-r--r-- 1 Jazz supergroup 14 2011-10-21 12:40 /user/Jazz/in1/input
-rw-r--r-- 1 Jazz supergroup 14 2011-10-21 12:40 /user/Jazz/in1/input2
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
-rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 11:45 /user/Jazz/input/capacity-scheduler.xml
-rw-r--r-- 1 Jazz supergroup 535 2011-10-21 11:45 /user/Jazz/input/configuration.xsl
-rw-r--r-- 1 Jazz supergroup 326 2011-10-21 11:45 /user/Jazz/input/core-site.xml
-rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 11:45 /user/Jazz/input/hadoop-env.sh
-rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 11:45 /user/Jazz/input/hadoop-metrics.properties
-rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 11:45 /user/Jazz/input/hadoop-policy.xml
-rw-r--r-- 1 Jazz supergroup 196 2011-10-21 11:45 /user/Jazz/input/hdfs-site.xml
-rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 11:45 /user/Jazz/input/log4j.properties
-rw-r--r-- 1 Jazz supergroup 212 2011-10-21 11:45 /user/Jazz/input/mapred-site.xml
-rw------- 1 hadoop Jazz 10 2011-10-21 11:45 /user/Jazz/input/slaves
-rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 11:45 /user/Jazz/input/ssl-client.xml.example
-rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 11:45 /user/Jazz/input/ssl-server.xml.example
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1
-rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 12:34 /user/Jazz/input1/capacity-scheduler.xml
-rw-r--r-- 1 Jazz supergroup 535 2011-10-21 12:34 /user/Jazz/input1/configuration.xsl
-rw-r--r-- 1 Jazz supergroup 326 2011-10-21 12:34 /user/Jazz/input1/core-site.xml
-rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 12:34 /user/Jazz/input1/hadoop-env.sh
-rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 12:34 /user/Jazz/input1/hadoop-metrics.properties
-rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 12:34 /user/Jazz/input1/hadoop-policy.xml
-rw-r--r-- 1 Jazz supergroup 196 2011-10-21 12:34 /user/Jazz/input1/hdfs-site.xml
-rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 12:34 /user/Jazz/input1/log4j.properties
-rw-r--r-- 1 Jazz supergroup 212 2011-10-21 12:34 /user/Jazz/input1/mapred-site.xml
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 12:34 /user/Jazz/input1/masters
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 12:34 /user/Jazz/input1/slaves
-rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 12:34 /user/Jazz/input1/ssl-client.xml.example
-rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 12:34 /user/Jazz/input1/ssl-server.xml.example
-mkdir
- Creates a directory.
Jazz@human ~
$ hadoop fs -mkdir tmp

Jazz@human ~
$ hadoop fs -ls
Found 5 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:40 /user/Jazz/in1
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:43 /user/Jazz/tmp
-moveFromLocal
- Moves a local directory to HDFS (cut and paste: the local copy is removed).
Jazz@human ~
$ hadoop fs -moveFromLocal in1 in2

Jazz@human ~
$ hadoop fs -ls
Found 6 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:40 /user/Jazz/in1
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:44 /user/Jazz/in2
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:43 /user/Jazz/tmp
-mv
- Renames a file or directory.
Jazz@human ~
$ hadoop fs -mv in2 in3

Jazz@human ~
$ hadoop fs -ls
Found 6 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:40 /user/Jazz/in1
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:44 /user/Jazz/in3
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:43 /user/Jazz/tmp
-rm
- Deletes the specified file (it cannot be a directory).
Jazz@human ~
$ hadoop fs -rm in1/input
Deleted hdfs://localhost:9000/user/Jazz/in1/input
-rmr
- Recursively deletes a directory and everything inside it. More than one directory may be given.
Jazz@human ~
$ hadoop fs -rmr dfs_input in1 in3 input1
Deleted hdfs://localhost:9000/user/Jazz/dfs_input
Deleted hdfs://localhost:9000/user/Jazz/in1
Deleted hdfs://localhost:9000/user/Jazz/in3
Deleted hdfs://localhost:9000/user/Jazz/input1
-setrep
- Sets the replication factor.
- Syntax: hadoop fs -setrep [-R] [-w] <rep> <path/file>
Jazz@human ~
$ hadoop fs -setrep -w 1 -R input
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/capacity-scheduler.xml
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/configuration.xsl
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/core-site.xml
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/hadoop-env.sh
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/hadoop-metrics.properties
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/hadoop-policy.xml
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/hdfs-site.xml
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/log4j.properties
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/mapred-site.xml
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/slaves
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/ssl-client.xml.example
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/ssl-server.xml.example
Waiting for hdfs://localhost:9000/user/Jazz/input/capacity-scheduler.xml ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/configuration.xsl ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/core-site.xml ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/hadoop-env.sh ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/hadoop-metrics.properties ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/hadoop-policy.xml ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/hdfs-site.xml ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/log4j.properties ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/mapred-site.xml ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/slaves ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/ssl-client.xml.example ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/ssl-server.xml.example ... done

$ bin/hadoop fs -setrep -w 2 -R input
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/1.txt
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/2.txt
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/3.txt
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/4.txt
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/1.txt ... done
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/2.txt ... done
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/3.txt ... done
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/4.txt ... done
-stat
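- Prints the modification time of a path. The value appears to be reported in UTC, which would explain why it differs by eight hours from the local time shown by -ls.
Jazz@human ~
$ hadoop fs -stat input
2011-10-21 04:00:44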
-tail
- Writes the last 1 KB of a file to standard output.
- Usage: hadoop fs -tail [-f] <file> (with -f, content appended to the file as it grows is also shown)
Jazz@human ~
$ hadoop fs -tail input/log4j.properties
g4j.RollingFileAppender
#log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file}
# Logfile size and and 30-day backups
#log4j.appender.RFA.MaxFileSize=1MB
#log4j.appender.RFA.MaxBackupIndex=30
#log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
#log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} - %m%n
#log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n
#
# FSNamesystem Audit logging
# All audit events are logged at INFO level
#
log4j.logger.org.apache.hadoop.fs.FSNamesystem.audit=WARN
# Custom Logging levels
#log4j.logger.org.apache.hadoop.mapred.JobTracker=DEBUG
#log4j.logger.org.apache.hadoop.mapred.TaskTracker=DEBUG
#log4j.logger.org.apache.hadoop.fs.FSNamesystem=DEBUG
# Jets3t library
log4j.logger.org.jets3t.service.impl.rest.httpclient.RestS3Service=ERROR
#
# Event Counter Appender
# Sends counts of logging messages at different severity levels to Hadoop Metrics.
#
log4j.appender.EventCounter=org.apache.hadoop.metrics.jvm.EventCounter
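The output above begins in the middle of a word because the cut is made by byte count rather than by line. For a local copy of the same file, roughly the same result can be obtained with the standard tail command. This is a sketch for comparison only, with last_kb as a hypothetical helper name and 1 KB taken to mean 1024 bytes.

```shell
# Print the last 1024 bytes of a local file, similar to what
# "hadoop fs -tail" shows for a file on HDFS.
last_kb() {
    tail -c 1024 "$1"
}

# Example, assuming the directory downloaded in 1.3 is still present:
#   last_kb fromHDFS/log4j.properties
```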
-test
- Tests a path: -e checks whether the path exists, -z checks whether the file is empty (zero length), and -d checks whether the path is a directory. In each case the exit status is 0 when the test is true and 1 when it is false.
- Use echo $? to see whether the exit status is 0 or 1.
- Usage: bin/hadoop fs -test -[ezd] URI
########## -e tests whether the file exists; exit status 0 means true, 1 means false ##########
Jazz@human ~
$ hadoop fs -test -e input/slaves

Jazz@human ~
$ echo $?
0

Jazz@human ~
$ hadoop fs -test -e input/masters

Jazz@human ~
$ echo $?
1
########## -z tests whether the file size is zero; exit status 0 means true, 1 means false ##########
Jazz@human ~
$ hadoop fs -test -z input/slaves

Jazz@human ~
$ echo $?
1

Jazz@human ~
$ hadoop fs -test -z input/masters
test: File does not exist: input/masters
########## -d tests whether the path is a directory; exit status 0 means true, 1 means false ##########
Jazz@human ~
$ hadoop fs -test -d input/slaves

Jazz@human ~
$ echo $?
1

Jazz@human ~
$ hadoop fs -test -d input

Jazz@human ~
$ echo $?
0
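Because -test reports its result through the exit status, it can be used directly in shell conditionals instead of inspecting $? by hand. The sketch below picks -rm or -rmr depending on what the path is. The function name hdfs_remove is hypothetical, and the code relies only on the exit-status behaviour shown in the transcript above.

```shell
# Remove an HDFS path, choosing -rm for files and -rmr for directories.
# An exit status of 0 from "hadoop fs -test" means the test is true.
hdfs_remove() {
    if ! hadoop fs -test -e "$1"; then
        echo "skip: $1 does not exist"
    elif hadoop fs -test -d "$1"; then
        hadoop fs -rmr "$1"
    else
        hadoop fs -rm "$1"
    fi
}

# Usage: hdfs_remove input/slaves
```

Checking -e first avoids running -d or -rm on a path that does not exist.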
-text
- Outputs a file (for example a compressed file or a TextRecordInputStream) as plain text.
- Syntax: hadoop fs -text <src>
Jazz@human ~
$ tar zcvf input.tar.gz input1
input1/
input1/capacity-scheduler.xml
input1/configuration.xsl
input1/core-site.xml
input1/hadoop-env.sh
input1/hadoop-metrics.properties
input1/hadoop-policy.xml
input1/hdfs-site.xml
input1/log4j.properties
input1/mapred-site.xml
input1/masters
input1/slaves
input1/ssl-client.xml.example
input1/ssl-server.xml.example

Jazz@human ~
$ hadoop fs -put input.tar.gz .

Jazz@human ~
$ hadoop fs -text input.tar.gz
<omitted>
- Note: there is currently no library support for zip.
Jazz@human ~
$ zip -r input1.zip input1/
updating: input1/ (stored 0%)
adding: input1/capacity-scheduler.xml (deflated 71%)
adding: input1/configuration.xsl (deflated 50%)
adding: input1/core-site.xml (deflated 46%)
adding: input1/hadoop-env.sh (deflated 58%)
adding: input1/hadoop-metrics.properties (deflated 78%)
adding: input1/hadoop-policy.xml (deflated 83%)
adding: input1/hdfs-site.xml (deflated 35%)
adding: input1/log4j.properties (deflated 67%)
adding: input1/mapred-site.xml (deflated 34%)
adding: input1/masters (stored 0%)
adding: input1/slaves (stored 0%)
adding: input1/ssl-client.xml.example (deflated 79%)
adding: input1/ssl-server.xml.example (deflated 78%)

Jazz@human ~
$ hadoop fs -put input1.zip .

Jazz@human ~
$ hadoop fs -text input1.zip
PK
<omitted>
-touchz
- Creates an empty file.
Jazz@human ~
$ hadoop fs -touchz empty

Jazz@human ~
$ hadoop fs -test -z empty ; echo $?
0
Last modified on Jan 26, 2013, 12:17:16 AM