[[PageOutline]]
◢ <[wiki:III110813/Lab2 Lab 2]> | <[wiki:III110813 Back to course outline]> ▲ | <[wiki:III110813/Lab4 Lab 4]> ◣
= Lab 3 =
{{{
#!html
HDFS Single-Node Hands-On Exercises
HDFS in Practice
}}}
== 0. Start Hadoop4Win ==
* STEP 1: In the Start menu, click the following shortcuts in order.
* [[BR]][[Image(Hadoop4Win:hadoop4win-installer_11.jpg)]]
* STEP 2: First, click start-hadoop to start the Hadoop services (they run in a separate CMD window).
* '''Note''': Startup is complete only once the message Safe Mode is OFF appears.
* [[BR]][[Image(Hadoop4Win:hadoop4win_29.jpg,width=800)]]
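* STEP 3: Next, click NameNode Web UI to open http://localhost:50070 in a browser and confirm that the NameNode is running. The page should look like this:
* '''Note''': There must be one Live Node for the setup to be considered healthy.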
* [[BR]][[Image(Hadoop4Win:hadoop4win_10.jpg,width=800)]]
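* STEP 4: Then click JobTracker Web UI to open http://localhost:50030 in a browser and confirm that the JobTracker is running. The page should look like this:
* '''Note''': The state must be RUNNING for the setup to be considered healthy.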
* [[BR]][[Image(Hadoop4Win:hadoop4win_11.jpg,width=800)]]
* STEP 5: Finally, click hadoop4win to open the hadoop4win Cygwin window, where you will type the commands that follow.
* [[BR]][[Image(Hadoop4Win:hadoop4win_20.jpg,width=800)]]
== 1. HDFS Command Practice ==
=== 1.1 Browse Your HDFS Directory ===
* First, use the hadoop fs -ls command to browse your HDFS directory.
{{{
Jazz@human ~
$ hadoop fs -ls
Found 1 items
drwxr-xr-x - Jazz supergroup 0 2011-09-14 12:50 /user/Jazz/tmp
}}}
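=== 1.2 Upload Data to HDFS ===
* Next, let's practice uploading data to HDFS. Here /opt/hadoop/conf is the source directory and /user/${username}/input is the destination directory.
* '''Note''': The Windows build of Hadoop runs inside Cygwin, but Cygwin paths are virtual and the JRE (Java Runtime Environment) only understands Windows paths. If you see an error like the one below, wrap the path in cygpath -w to convert it from a Cygwin path to a Windows path.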
{{{
Jazz@human ~
$ hadoop fs -put /opt/hadoop/conf input
put: File /opt/hadoop/conf does not exist.
Jazz@human ~
$ hadoop fs -put $(cygpath -w /opt/hadoop/conf) input
}}}
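The path conversion above can be wrapped in a small helper so the same command line works both inside and outside Cygwin. This is only a sketch: to_java_path is a hypothetical name, not part of Hadoop or Hadoop4Win, and it assumes cygpath is on the PATH whenever the shell is running under Cygwin.

```shell
# to_java_path is a hypothetical helper (not part of Hadoop).
# It converts a Cygwin path to a Windows path when cygpath is available
# and returns the path unchanged otherwise (for example on Linux).
to_java_path() {
  if command -v cygpath >/dev/null 2>&1; then
    cygpath -w "$1"
  else
    printf '%s\n' "$1"
  fi
}

# Usage (needs a running Hadoop4Win, so it is left commented out):
#   hadoop fs -put "$(to_java_path /opt/hadoop/conf)" input
```

Quoting the command substitution keeps a Windows path that contains spaces as a single argument.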
* Use hadoop fs -ls to check the files you just uploaded.
{{{
Jazz@human ~
$ hadoop fs -ls
Found 2 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 11:45 /user/Jazz/input
drwxr-xr-x - Jazz supergroup 0 2011-09-14 12:50 /user/Jazz/tmp
Jazz@human ~
$ hadoop fs -ls input
Found 13 items
-rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 11:45 /user/Jazz/input/capacity-scheduler.xml
-rw-r--r-- 1 Jazz supergroup 535 2011-10-21 11:45 /user/Jazz/input/configuration.xsl
-rw-r--r-- 1 Jazz supergroup 326 2011-10-21 11:45 /user/Jazz/input/core-site.xml
-rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 11:45 /user/Jazz/input/hadoop-env.sh
-rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 11:45 /user/Jazz/input/hadoop-metrics.properties
-rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 11:45 /user/Jazz/input/hadoop-policy.xml
-rw-r--r-- 1 Jazz supergroup 196 2011-10-21 11:45 /user/Jazz/input/hdfs-site.xml
-rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 11:45 /user/Jazz/input/log4j.properties
-rw-r--r-- 1 Jazz supergroup 212 2011-10-21 11:45 /user/Jazz/input/mapred-site.xml
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/masters
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/slaves
-rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 11:45 /user/Jazz/input/ssl-client.xml.example
-rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 11:45 /user/Jazz/input/ssl-server.xml.example
}}}
=== 1.3 Download Data from HDFS to a Local Directory ===
* Next, let's practice downloading data from HDFS to a local directory from the command line.
{{{
Jazz@human ~
$ hadoop fs -get input fromHDFS
}}}
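* Use the diff command to check that the downloaded contents match what you uploaded. diff prints nothing when the two directories are identical.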
{{{
Jazz@human ~
$ diff -Naur fromHDFS/ /opt/hadoop/conf
}}}
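For scripted checks, the same comparison can be reduced to an exit status. The sketch below assumes both directories are on the local file system; same_tree is a hypothetical helper name introduced only for illustration.

```shell
# same_tree is a hypothetical helper: it succeeds (exit status 0) when two
# local directories have identical contents and fails otherwise.
same_tree() {
  diff -r "$1" "$2" >/dev/null
}

# Usage after the -put / -get round trip (commented out because it needs
# the directories created in this lab):
#   same_tree fromHDFS /opt/hadoop/conf && echo "round trip OK"
```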
=== 1.4 Delete Files on HDFS ===
* Use hadoop fs -rm to delete a single file on HDFS.
{{{
Jazz@human ~
$ hadoop fs -rm input/masters
Deleted hdfs://localhost:9000/user/Jazz/input/masters
}}}
* To delete a directory, use hadoop fs -rmr instead.
{{{
Jazz@human ~
$ hadoop fs -rmr tmp
Deleted hdfs://localhost:9000/user/Jazz/tmp
}}}
=== 1.5 Dump the Contents of a File on HDFS ===
* When you only want to look at the contents of a file on HDFS, use hadoop fs -cat to dump them.
{{{
Jazz@human ~
$ hadoop fs -cat input/slaves
localhost
}}}
=== 1.6 More HDFS Commands ===
* To list every command the HDFS shell supports, run hadoop fs with no arguments:
{{{
Jazz@human ~
$ hadoop fs
Usage: java FsShell
[-ls <path>]
[-lsr <path>]
[-du <path>]
[-dus <path>]
[-count[-q] <path>]
[-mv <src> <dst>]
[-cp <src> <dst>]
[-rm [-skipTrash] <path>]
[-rmr [-skipTrash] <path>]
[-expunge]
[-put <localsrc> ... <dst>]
[-copyFromLocal <localsrc> ... <dst>]
[-moveFromLocal <localsrc> ... <dst>]
[-get [-ignoreCrc] [-crc] <src> <localdst>]
[-getmerge <src> <localdst> [addnl]]
[-cat <src>]
[-text <src>]
[-copyToLocal [-ignoreCrc] [-crc] <src> <localdst>]
[-moveToLocal [-crc] <src> <localdst>]
[-mkdir <path>]
[-setrep [-R] [-w] <rep> <path/file>]
[-touchz <path>]
[-test -[ezd] <path>]
[-stat [format] <path>]
[-tail [-f] <file>]
[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
[-chown [-R] [OWNER][:[GROUP]] PATH...]
[-chgrp [-R] GROUP PATH...]
[-help [cmd]]
Generic options supported are
-conf <configuration file> specify an application configuration file
-D <property=value> use value for given property
-fs <local|namenode:port> specify a namenode
-jt <local|jobtracker:port> specify a job tracker
-files <comma separated list of files> specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars> specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives> specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
}}}
== 2. Browse HDFS Contents Through the Web Interface ==
* You can also open the [http://localhost:50070 NameNode] page to inspect the files you just uploaded, along with their Block Size, File Size, Block Location, and Rack Location.
* [[BR]][[Image(Hadoop4Win:hadoop4win_30.jpg,width=800)]]
* [[BR]][[Image(Hadoop4Win:hadoop4win_31.jpg,width=800)]]
* [[BR]][[Image(Hadoop4Win:hadoop4win_32.jpg,width=800)]]
== 3. More HDFS Shell Usage ==
=== -ls ===
* -ls operates on /user/${username}/ by default, so a path given without a leading slash is a relative path resolved against /user/${username}.
{{{
Jazz@human ~
$ hadoop fs -ls input
Found 12 items
-rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 11:45 /user/Jazz/input/capacity-scheduler.xml
-rw-r--r-- 1 Jazz supergroup 535 2011-10-21 11:45 /user/Jazz/input/configuration.xsl
-rw-r--r-- 1 Jazz supergroup 326 2011-10-21 11:45 /user/Jazz/input/core-site.xml
-rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 11:45 /user/Jazz/input/hadoop-env.sh
-rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 11:45 /user/Jazz/input/hadoop-metrics.properties
-rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 11:45 /user/Jazz/input/hadoop-policy.xml
-rw-r--r-- 1 Jazz supergroup 196 2011-10-21 11:45 /user/Jazz/input/hdfs-site.xml
-rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 11:45 /user/Jazz/input/log4j.properties
-rw-r--r-- 1 Jazz supergroup 212 2011-10-21 11:45 /user/Jazz/input/mapred-site.xml
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/slaves
-rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 11:45 /user/Jazz/input/ssl-client.xml.example
-rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 11:45 /user/Jazz/input/ssl-server.xml.example
}}}
* You can also give a full path in the form '''hdfs://node:port/path'''.
{{{
Jazz@human ~
$ hadoop fs -ls hdfs://localhost:9000/user/${USER}/input
Found 12 items
-rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 11:45 /user/Jazz/input/capacity-scheduler.xml
-rw-r--r-- 1 Jazz supergroup 535 2011-10-21 11:45 /user/Jazz/input/configuration.xsl
-rw-r--r-- 1 Jazz supergroup 326 2011-10-21 11:45 /user/Jazz/input/core-site.xml
-rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 11:45 /user/Jazz/input/hadoop-env.sh
-rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 11:45 /user/Jazz/input/hadoop-metrics.properties
-rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 11:45 /user/Jazz/input/hadoop-policy.xml
-rw-r--r-- 1 Jazz supergroup 196 2011-10-21 11:45 /user/Jazz/input/hdfs-site.xml
-rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 11:45 /user/Jazz/input/log4j.properties
-rw-r--r-- 1 Jazz supergroup 212 2011-10-21 11:45 /user/Jazz/input/mapred-site.xml
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/slaves
-rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 11:45 /user/Jazz/input/ssl-client.xml.example
-rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 11:45 /user/Jazz/input/ssl-server.xml.example
}}}
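The relationship between the two forms can be sketched as a small function. hdfs_uri is a hypothetical helper written for illustration only; the default NameNode address localhost:9000 is taken from the output in this lab and will differ on other clusters.

```shell
# hdfs_uri is a hypothetical helper that mirrors how a relative path is
# resolved: absolute paths are kept as they are, relative paths are placed
# under /user/<username>.
# Arguments: path, user name (defaults to $USER), NameNode host:port.
hdfs_uri() {
  hu_path=$1
  hu_user=${2:-$USER}
  hu_namenode=${3:-localhost:9000}
  case $hu_path in
    /*) printf 'hdfs://%s%s\n' "$hu_namenode" "$hu_path" ;;
    *)  printf 'hdfs://%s/user/%s/%s\n' "$hu_namenode" "$hu_user" "$hu_path" ;;
  esac
}

hdfs_uri input Jazz   # prints hdfs://localhost:9000/user/Jazz/input
```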
=== -cat ===
* Writes the contents of the file at the given path to standard output (STDOUT).
{{{
Jazz@human ~
$ hadoop fs -cat input/slaves
localhost
}}}
=== -chgrp ===
* Changes the group that a file belongs to.
{{{
Jazz@human ~
$ hadoop fs -ls input/slaves
Found 1 items
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/slaves
Jazz@human ~
$ hadoop fs -chgrp ${USERNAME} input/slaves
Jazz@human ~
$ hadoop fs -ls input/slaves
Found 1 items
-rw-r--r-- 1 Jazz Jazz 10 2011-10-21 11:45 /user/Jazz/input/slaves
}}}
=== -chmod ===
* Changes the permissions of a file.
{{{
Jazz@human ~
$ hadoop fs -ls input/slaves
Found 1 items
-rw-r--r-- 1 Jazz Jazz 10 2011-10-21 11:45 /user/Jazz/input/slaves
Jazz@human ~
$ hadoop fs -chmod 600 input/slaves
Jazz@human ~
$ hadoop fs -ls input/slaves
Found 1 items
-rw------- 1 Jazz Jazz 10 2011-10-21 11:45 /user/Jazz/input/slaves
}}}
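To read an octal mode against the permission column printed by -ls, each octal digit expands to one rwx triple (owner, group, others). The function below is a sketch of that mapping; mode_to_rwx is a hypothetical helper and is not part of the Hadoop shell.

```shell
# mode_to_rwx is a hypothetical helper that expands an octal mode such as
# 600 into the nine-character permission string printed by "hadoop fs -ls".
mode_to_rwx() {
  m_out=''
  m_rest=$1
  while [ -n "$m_rest" ]; do
    m_digit=${m_rest%"${m_rest#?}"}   # first character of what is left
    m_rest=${m_rest#?}                # drop that character
    case $m_digit in
      0) m_out="${m_out}---" ;;
      1) m_out="${m_out}--x" ;;
      2) m_out="${m_out}-w-" ;;
      3) m_out="${m_out}-wx" ;;
      4) m_out="${m_out}r--" ;;
      5) m_out="${m_out}r-x" ;;
      6) m_out="${m_out}rw-" ;;
      7) m_out="${m_out}rwx" ;;
      *) echo "not an octal digit: $m_digit" >&2; return 1 ;;
    esac
  done
  printf '%s\n' "$m_out"
}

mode_to_rwx 600   # prints rw-------
mode_to_rwx 700   # prints rwx------
```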
=== -chown ===
* Changes the owner of a file.
{{{
Jazz@human ~
$ hadoop fs -chown hadoop input/slaves
Jazz@human ~
$ hadoop fs -ls input/slaves
Found 1 items
-rw------- 1 hadoop Jazz 10 2011-10-21 11:45 /user/Jazz/input/slaves
}}}
=== -copyFromLocal, -put ===
* Uploads files from the local machine to HDFS.
{{{
Jazz@human ~
$ hadoop fs -put fromHDFS dfs_input
Jazz@human ~
$ hadoop fs -ls
Found 2 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
}}}
=== -copyToLocal, -get ===
* Downloads files from HDFS to the local machine.
{{{
Jazz@human ~
$ hadoop fs -get dfs_input input1
}}}
=== -cp ===
* Copies files from a source path on HDFS to a destination path on HDFS.
{{{
Jazz@human ~
$ hadoop fs -cp dfs_input input1
Jazz@human ~
$ hadoop fs -ls
Found 3 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1
}}}
=== -du ===
* Shows the size of every file in a directory.
{{{
Jazz@human ~
$ hadoop fs -du input
Found 12 items
3936 hdfs://localhost:9000/user/Jazz/input/capacity-scheduler.xml
535 hdfs://localhost:9000/user/Jazz/input/configuration.xsl
326 hdfs://localhost:9000/user/Jazz/input/core-site.xml
2409 hdfs://localhost:9000/user/Jazz/input/hadoop-env.sh
1245 hdfs://localhost:9000/user/Jazz/input/hadoop-metrics.properties
4190 hdfs://localhost:9000/user/Jazz/input/hadoop-policy.xml
196 hdfs://localhost:9000/user/Jazz/input/hdfs-site.xml
2815 hdfs://localhost:9000/user/Jazz/input/log4j.properties
212 hdfs://localhost:9000/user/Jazz/input/mapred-site.xml
10 hdfs://localhost:9000/user/Jazz/input/slaves
1243 hdfs://localhost:9000/user/Jazz/input/ssl-client.xml.example
1195 hdfs://localhost:9000/user/Jazz/input/ssl-server.xml.example
}}}
=== -dus ===
* Shows the total size of a directory or file.
{{{
Jazz@human ~
$ hadoop fs -dus input
hdfs://localhost:9000/user/Jazz/input 18312
}}}
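The -dus total is simply the sum of the per-file sizes that -du reports. As a quick check, the sketch below adds up the first column of -du style output with awk; sum_sizes is a hypothetical helper name, and the sizes are copied from the -du listing for input in this lab.

```shell
# sum_sizes is a hypothetical helper: it adds up the first column (size in
# bytes) of "hadoop fs -du"-style lines read from standard input. Lines
# whose first field is not a number, such as "Found 12 items", are skipped.
sum_sizes() {
  awk '$1 ~ /^[0-9]+$/ { total += $1 } END { print total + 0 }'
}

# The twelve file sizes from the -du listing of input.
printf '%s\n' 3936 535 326 2409 1245 4190 196 2815 212 10 1243 1195 | sum_sizes
# prints 18312, the same total that -dus reports

# With Hadoop4Win running, the listing can be piped in directly:
#   hadoop fs -du input | sum_sizes
```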
=== -expunge ===
* Empties the trash.
{{{
Jazz@human ~
$ hadoop fs -expunge
}}}
=== -getmerge ===
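* Merges all files under the source directory <src> into a single file <localdst> on the local machine.
* Syntax: hadoop fs -getmerge <src> <localdst> [addnl]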
{{{
Jazz@human ~
$ mkdir -p in1
Jazz@human ~
$ echo "this is one; " > in1/input
Jazz@human ~
$ echo "this is two; " > in1/input2
Jazz@human ~
$ hadoop fs -put in1 in1
Jazz@human ~
$ hadoop fs -getmerge in1 merge.txt
Jazz@human ~
$ cat ./merge.txt
this is one;
this is two;
}}}
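As a purely local analogy (not how HDFS implements the command), the result of -getmerge resembles concatenating the files of a directory into one file. merge_dir below is a hypothetical stand-in that works on local directories only and takes files in the order the shell glob sorts them.

```shell
# merge_dir is a hypothetical local stand-in for -getmerge: it concatenates
# every regular file in a local directory into a single destination file.
merge_dir() {
  md_src=$1
  md_dst=$2
  : > "$md_dst"                      # start with an empty destination file
  for md_file in "$md_src"/*; do
    if [ -f "$md_file" ]; then
      cat "$md_file" >> "$md_dst"
    fi
  done
}

# Usage with the local directory created above (commented out because it
# needs the in1 directory from this lab):
#   merge_dir in1 merge_local.txt && cat merge_local.txt
```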
=== -ls ===
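* Lists information about files or directories.
* For a file, the columns are: permissions, number of replicas, owner, group, file size, modification date, modification time, path.
* For a directory, the columns are the same, except that the replica count is shown as - and the size as 0.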
{{{
Jazz@human ~
$ hadoop fs -ls
Found 3 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1
}}}
=== -lsr ===
* The recursive version of -ls.
{{{
Jazz@human ~
$ hadoop fs -lsr
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
-rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 12:33 /user/Jazz/dfs_input/capacity-scheduler.xml
-rw-r--r-- 1 Jazz supergroup 535 2011-10-21 12:33 /user/Jazz/dfs_input/configuration.xsl
-rw-r--r-- 1 Jazz supergroup 326 2011-10-21 12:33 /user/Jazz/dfs_input/core-site.xml
-rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 12:33 /user/Jazz/dfs_input/hadoop-env.sh
-rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 12:33 /user/Jazz/dfs_input/hadoop-metrics.properties
-rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 12:33 /user/Jazz/dfs_input/hadoop-policy.xml
-rw-r--r-- 1 Jazz supergroup 196 2011-10-21 12:33 /user/Jazz/dfs_input/hdfs-site.xml
-rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 12:33 /user/Jazz/dfs_input/log4j.properties
-rw-r--r-- 1 Jazz supergroup 212 2011-10-21 12:33 /user/Jazz/dfs_input/mapred-site.xml
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 12:33 /user/Jazz/dfs_input/masters
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 12:33 /user/Jazz/dfs_input/slaves
-rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 12:33 /user/Jazz/dfs_input/ssl-client.xml.example
-rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 12:33 /user/Jazz/dfs_input/ssl-server.xml.example
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:40 /user/Jazz/in1
-rw-r--r-- 1 Jazz supergroup 14 2011-10-21 12:40 /user/Jazz/in1/input
-rw-r--r-- 1 Jazz supergroup 14 2011-10-21 12:40 /user/Jazz/in1/input2
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
-rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 11:45 /user/Jazz/input/capacity-scheduler.xml
-rw-r--r-- 1 Jazz supergroup 535 2011-10-21 11:45 /user/Jazz/input/configuration.xsl
-rw-r--r-- 1 Jazz supergroup 326 2011-10-21 11:45 /user/Jazz/input/core-site.xml
-rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 11:45 /user/Jazz/input/hadoop-env.sh
-rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 11:45 /user/Jazz/input/hadoop-metrics.properties
-rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 11:45 /user/Jazz/input/hadoop-policy.xml
-rw-r--r-- 1 Jazz supergroup 196 2011-10-21 11:45 /user/Jazz/input/hdfs-site.xml
-rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 11:45 /user/Jazz/input/log4j.properties
-rw-r--r-- 1 Jazz supergroup 212 2011-10-21 11:45 /user/Jazz/input/mapred-site.xml
-rw------- 1 hadoop Jazz 10 2011-10-21 11:45 /user/Jazz/input/slaves
-rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 11:45 /user/Jazz/input/ssl-client.xml.example
-rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 11:45 /user/Jazz/input/ssl-server.xml.example
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1
-rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 12:34 /user/Jazz/input1/capacity-scheduler.xml
-rw-r--r-- 1 Jazz supergroup 535 2011-10-21 12:34 /user/Jazz/input1/configuration.xsl
-rw-r--r-- 1 Jazz supergroup 326 2011-10-21 12:34 /user/Jazz/input1/core-site.xml
-rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 12:34 /user/Jazz/input1/hadoop-env.sh
-rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 12:34 /user/Jazz/input1/hadoop-metrics.properties
-rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 12:34 /user/Jazz/input1/hadoop-policy.xml
-rw-r--r-- 1 Jazz supergroup 196 2011-10-21 12:34 /user/Jazz/input1/hdfs-site.xml
-rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 12:34 /user/Jazz/input1/log4j.properties
-rw-r--r-- 1 Jazz supergroup 212 2011-10-21 12:34 /user/Jazz/input1/mapred-site.xml
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 12:34 /user/Jazz/input1/masters
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 12:34 /user/Jazz/input1/slaves
-rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 12:34 /user/Jazz/input1/ssl-client.xml.example
-rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 12:34 /user/Jazz/input1/ssl-server.xml.example
}}}
=== -mkdir ===
* Creates a directory.
{{{
Jazz@human ~
$ hadoop fs -mkdir tmp
Jazz@human ~
$ hadoop fs -ls
Found 5 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:40 /user/Jazz/in1
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:43 /user/Jazz/tmp
}}}
=== -moveFromLocal ===
* Moves a local directory to HDFS; unlike -put, the local copy is removed after the upload.
{{{
Jazz@human ~
$ hadoop fs -moveFromLocal in1 in2
Jazz@human ~
$ hadoop fs -ls
Found 6 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:40 /user/Jazz/in1
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:44 /user/Jazz/in2
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:43 /user/Jazz/tmp
}}}
=== -mv ===
* Renames (moves) a file or directory on HDFS.
{{{
Jazz@human ~
$ hadoop fs -mv in2 in3
Jazz@human ~
$ hadoop fs -ls
Found 6 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:40 /user/Jazz/in1
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:44 /user/Jazz/in3
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:43 /user/Jazz/tmp
}}}
=== -rm ===
* Deletes the specified file (it cannot be a directory).
{{{
Jazz@human ~
$ hadoop fs -rm in1/input
Deleted hdfs://localhost:9000/user/Jazz/in1/input
}}}
=== -rmr ===
* Recursively deletes directories, including every file inside them. Several directories can be given at once.
{{{
Jazz@human ~
$ hadoop fs -rmr dfs_input in1 in3 input1
Deleted hdfs://localhost:9000/user/Jazz/dfs_input
Deleted hdfs://localhost:9000/user/Jazz/in1
Deleted hdfs://localhost:9000/user/Jazz/in3
Deleted hdfs://localhost:9000/user/Jazz/input1
}}}
=== -setrep ===
* Sets the replication factor.
* Syntax: hadoop fs -setrep [-R] [-w] <rep> <path/file>
{{{
Jazz@human ~
$ hadoop fs -setrep -w 1 -R input
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/capacity-scheduler.xml
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/configuration.xsl
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/core-site.xml
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/hadoop-env.sh
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/hadoop-metrics.properties
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/hadoop-policy.xml
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/hdfs-site.xml
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/log4j.properties
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/mapred-site.xml
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/slaves
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/ssl-client.xml.example
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/ssl-server.xml.example
Waiting for hdfs://localhost:9000/user/Jazz/input/capacity-scheduler.xml ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/configuration.xsl ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/core-site.xml ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/hadoop-env.sh ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/hadoop-metrics.properties ...done
Waiting for hdfs://localhost:9000/user/Jazz/input/hadoop-policy.xml ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/hdfs-site.xml ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/log4j.properties ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/mapred-site.xml ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/slaves ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/ssl-client.xml.example ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/ssl-server.xml.example ... done
$ bin/hadoop fs -setrep -w 2 -R input
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/1.txt
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/2.txt
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/3.txt
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/4.txt
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/1.txt ... done
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/2.txt ... done
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/3.txt ... done
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/4.txt ... done
}}}
=== -stat ===
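* Prints the modification time of a path. The time is reported in UTC, which is why it differs from the local time shown by -ls.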
{{{
Jazz@human ~
$ hadoop fs -stat input
2011-10-21 04:00:44
}}}
=== -tail ===
* Prints the last 1 KB of a file.
* Usage: hadoop fs -tail [-f] <file> (with -f, new content is printed as it is appended to the file)
{{{
Jazz@human ~
$ hadoop fs -tail input/log4j.properties
g4j.RollingFileAppender
#log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file}
# Logfile size and and 30-day backups
#log4j.appender.RFA.MaxFileSize=1MB
#log4j.appender.RFA.MaxBackupIndex=30
#log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
#log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} - %m%n
#log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L))
- %m%n
#
# FSNamesystem Audit logging
# All audit events are logged at INFO level
#
log4j.logger.org.apache.hadoop.fs.FSNamesystem.audit=WARN
# Custom Logging levels
#log4j.logger.org.apache.hadoop.mapred.JobTracker=DEBUG
#log4j.logger.org.apache.hadoop.mapred.TaskTracker=DEBUG
#log4j.logger.org.apache.hadoop.fs.FSNamesystem=DEBUG
# Jets3t library
log4j.logger.org.jets3t.service.impl.rest.httpclient.RestS3Service=ERROR
#
# Event Counter Appender
# Sends counts of logging messages at different severity levels to Hadoop Metric
s.
#
log4j.appender.EventCounter=org.apache.hadoop.metrics.jvm.EventCounter
}}}
=== -test ===
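* Tests a path: -e checks whether it exists, -z checks whether the file is zero length, and -d checks whether it is a directory. In each case the exit status is 0 when the check is true and 1 when it is false.
* Use echo $? to see whether the exit status was 0 or 1.
* Usage: bin/hadoop fs -test -[ezd] URI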
{{{
########## -e tests whether the file exists: returns 0 if true, 1 if false ##########
Jazz@human ~
$ hadoop fs -test -e input/slaves
Jazz@human ~
$ echo $?
0
Jazz@human ~
$ hadoop fs -test -e input/masters
Jazz@human ~
$ echo $?
1
########## -z tests whether the file size is zero: returns 0 if true, 1 if false ##########
Jazz@human ~
$ hadoop fs -test -z input/slaves
Jazz@human ~
$ echo $?
1
Jazz@human ~
$ hadoop fs -test -z input/masters
test: File does not exist: input/masters
########## -d tests whether the path is a directory: returns 0 if true, 1 if false ##########
Jazz@human ~
$ hadoop fs -test -d input/slaves
Jazz@human ~
$ echo $?
1
Jazz@human ~
$ hadoop fs -test -d input
Jazz@human ~
$ echo $?
0
}}}
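Because -test follows the usual shell convention (exit status 0 means true), it can be used directly in an if statement instead of inspecting $? by hand. The wrapper below is a sketch; status_to_word is a hypothetical helper, and the two live calls use the shell's own test command as a stand-in so that the example runs without Hadoop.

```shell
# status_to_word is a hypothetical helper: it runs any command and prints
# "true" when the exit status is 0 and "false" otherwise.
status_to_word() {
  if "$@"; then
    echo true
  else
    echo false
  fi
}

# Stand-ins that need no Hadoop installation:
status_to_word test -d /               # prints true
status_to_word test -e /no/such/path   # prints false

# With Hadoop4Win running, the same wrapper applies to fs -test:
#   status_to_word hadoop fs -test -e input/slaves
```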
=== -text ===
* Prints a file (such as a compressed file or a TextRecordInputStream) as plain text.
* Syntax: hadoop fs -text <src>
{{{
Jazz@human ~
$ tar zcvf input.tar.gz input1
input1/
input1/capacity-scheduler.xml
input1/configuration.xsl
input1/core-site.xml
input1/hadoop-env.sh
input1/hadoop-metrics.properties
input1/hadoop-policy.xml
input1/hdfs-site.xml
input1/log4j.properties
input1/mapred-site.xml
input1/masters
input1/slaves
input1/ssl-client.xml.example
input1/ssl-server.xml.example
Jazz@human ~
$ hadoop fs -put input.tar.gz .
Jazz@human ~
$ hadoop fs -text input.tar.gz
<omitted>
}}}
* Note: there is currently no library support for zip, so -text prints the raw archive bytes:
{{{
Jazz@human ~
$ zip -r input1.zip input1/
updating: input1/ (stored 0%)
adding: input1/capacity-scheduler.xml (deflated 71%)
adding: input1/configuration.xsl (deflated 50%)
adding: input1/core-site.xml (deflated 46%)
adding: input1/hadoop-env.sh (deflated 58%)
adding: input1/hadoop-metrics.properties (deflated 78%)
adding: input1/hadoop-policy.xml (deflated 83%)
adding: input1/hdfs-site.xml (deflated 35%)
adding: input1/log4j.properties (deflated 67%)
adding: input1/mapred-site.xml (deflated 34%)
adding: input1/masters (stored 0%)
adding: input1/slaves (stored 0%)
adding: input1/ssl-client.xml.example (deflated 79%)
adding: input1/ssl-server.xml.example (deflated 78%)
Jazz@human ~
$ hadoop fs -put input1.zip .
Jazz@human ~
$ hadoop fs -text input1.zip
PK
<omitted>
}}}
=== -touchz ===
* Creates an empty (zero-length) file.
{{{
Jazz@human ~
$ hadoop fs -touchz empty
Jazz@human ~
$ hadoop fs -test -z empty ; echo $?
0
}}}