{{{ #!html
實作二: HDFS Shell操作練習
Lab2: HDFS Shell in practice
}}}
[[PageOutline]]

== 前言 Preface ==

 * 此部份接續[wiki:NCHCCloudCourse100802/Lab1 實作一][[BR]]Please refer to [wiki:NCHCCloudCourse100802/Lab1 Lab1]

== Content 1: HDFS Shell 基本操作 ==
== Content 1: Basic HDFS Shell Commands ==

=== 1.1 瀏覽你的 HDFS 目錄 ===
=== 1.1 Browsing Your HDFS Folder ===

{{{
/opt/hadoop$ bin/hadoop fs -ls
/opt/hadoop$ bin/hadoop fs -lsr
}}}

=== 1.2 上傳資料到 HDFS 目錄 ===
=== 1.2 Upload Files or Folders to HDFS ===

 * 上傳 Upload
{{{
/opt/hadoop$ bin/hadoop fs -put conf input
}}}
 * 檢查 Check
{{{
/opt/hadoop$ bin/hadoop fs -ls
/opt/hadoop$ bin/hadoop fs -ls input
}}}

=== 1.3 下載 HDFS 的資料到本地目錄 ===
=== 1.3 Download HDFS Files or Folders to Local ===

 * 下載 Download
{{{
/opt/hadoop$ bin/hadoop fs -get input fromHDFS
}}}
 * 檢查 Check
{{{
/opt/hadoop$ ls -al | grep fromHDFS
/opt/hadoop$ ls -al fromHDFS
}}}

=== 1.4 刪除檔案 ===
=== 1.4 Remove Files or Folders ===

{{{
/opt/hadoop$ bin/hadoop fs -ls input
/opt/hadoop$ bin/hadoop fs -rm input/masters
}}}

=== 1.5 直接看檔案 ===
=== 1.5 Browse Files Directly ===

{{{
/opt/hadoop$ bin/hadoop fs -ls input
/opt/hadoop$ bin/hadoop fs -cat input/slaves
}}}

=== 1.6 更多指令操作 ===
=== 1.6 More Commands -- Help Message ===

{{{
hadooper@vPro:/opt/hadoop$ bin/hadoop fs
Usage: java FsShell
           [-ls <path>]
           [-lsr <path>]
           [-du <path>]
           [-dus <path>]
           [-count[-q] <path>]
           [-mv <src> <dst>]
           [-cp <src> <dst>]
           [-rm <path>]
           [-rmr <path>]
           [-expunge]
           [-put <localsrc> ... <dst>]
           [-copyFromLocal <localsrc> ... <dst>]
           [-moveFromLocal <localsrc> ... <dst>]
           [-get [-ignoreCrc] [-crc] <src> <localdst>]
           [-getmerge <src> <localdst> [addnl]]
           [-cat <src>]
           [-text <src>]
           [-copyToLocal [-ignoreCrc] [-crc] <src> <localdst>]
           [-moveToLocal [-crc] <src> <localdst>]
           [-mkdir <path>]
           [-setrep [-R] [-w] <rep> <path/file>]
           [-touchz <path>]
           [-test -[ezd] <path>]
           [-stat [format] <path>]
           [-tail [-f] <file>]
           [-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
           [-chown [-R] [OWNER][:[GROUP]] PATH...]
           [-chgrp [-R] GROUP PATH...]
           [-help [cmd]]

Generic options supported are
-conf <configuration file>     specify an application configuration file
-D <property=value>            use value for given property
-fs <local|namenode:port>      specify a namenode
-jt <local|jobtracker:port>    specify a job tracker
-files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.

The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
}}}

== Content 2: 使用網頁 GUI 瀏覽資訊 ==
== Content 2: Use Web GUI to Browse HDFS ==

 * [http://localhost:50030 JobTracker Web Interface]
 * [http://localhost:50070 NameNode Web Interface]
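 * 若無法開啟瀏覽器,也可以從指令列取得類似的資訊。以下為一個參考用的小示範(非本實作的必要步驟):[[BR]]If a browser is not available, similar information can also be obtained from the command line. A minimal sketch for reference (not a required step of this lab):
{{{
/opt/hadoop$ bin/hadoop dfsadmin -report    # HDFS 容量與各 DataNode 狀態 / HDFS capacity and per-DataNode status (cf. NameNode web page)
/opt/hadoop$ bin/hadoop job -list           # 列出執行中的工作 / list currently running jobs (cf. JobTracker web page)
}}}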
== Content 3: 更多 HDFS Shell 的用法 ==
== Content 3: More about HDFS Shell ==

 * bin/hadoop fs <args>,下面則列出 <args> 的用法[[BR]]Following are examples of the hadoop fs <args> commands.
 * 以下操作預設的目錄在 /user/<$username>/ 下[[BR]]By default, your working directory will be at /user/<$username>/.
{{{
$ bin/hadoop fs -ls input
Found 4 items
-rw-r--r--   2 hadooper supergroup  115045564 2009-04-02 11:51 /user/hadooper/input/1.txt
-rw-r--r--   2 hadooper supergroup     987864 2009-04-02 11:51 /user/hadooper/input/2.txt
-rw-r--r--   2 hadooper supergroup    1573048 2009-04-02 11:51 /user/hadooper/input/3.txt
-rw-r--r--   2 hadooper supergroup   25844527 2009-04-02 11:51 /user/hadooper/input/4.txt
}}}
 * 否則要寫完整的路徑 '''hdfs://node:port/path''',如:[[BR]]Otherwise, you have to give an __''absolute path''__ such as '''hdfs://node:port/path''', e.g.:
{{{
$ bin/hadoop fs -ls hdfs://gm1.nchc.org.tw:9000/user/hadooper/input
Found 4 items
-rw-r--r--   2 hadooper supergroup  115045564 2009-04-02 11:51 /user/hadooper/input/1.txt
-rw-r--r--   2 hadooper supergroup     987864 2009-04-02 11:51 /user/hadooper/input/2.txt
-rw-r--r--   2 hadooper supergroup    1573048 2009-04-02 11:51 /user/hadooper/input/3.txt
-rw-r--r--   2 hadooper supergroup   25844527 2009-04-02 11:51 /user/hadooper/input/4.txt
}}}

=== -cat ===

 * 將路徑指定文件的內容輸出到 STDOUT [[BR]] Print the content of the given file to STDOUT
{{{
$ bin/hadoop fs -cat quota/hadoop-env.sh
}}}

=== -chgrp ===

 * 改變文件所屬的群組 [[BR]] Change the '''owner group''' of the given file or folder
{{{
$ bin/hadoop fs -chgrp -R hadooper own
}}}

=== -chmod ===

 * 改變文件的權限 [[BR]] Change the '''read and write permissions''' of the given file or folder
{{{
$ bin/hadoop fs -chmod -R 755 own
}}}

=== -chown ===

 * 改變文件的擁有者 [[BR]] Change the '''owner''' of the given file or folder
{{{
$ bin/hadoop fs -chown -R hadooper own
}}}

=== -copyFromLocal, -put ===

 * 從 local 放檔案到 HDFS [[BR]] Both commands copy the given file or folder from local to HDFS
{{{
$ bin/hadoop fs -put input dfs_input
}}}

=== -copyToLocal, -get ===

 * 把 HDFS 上的檔案下載到 local [[BR]] Both commands copy the given file or folder from HDFS to local
{{{
$ bin/hadoop fs -get dfs_input input1
}}}

=== -cp ===

 * 將文件從 HDFS 原本路徑複製到 HDFS 目標路徑 [[BR]] Copy the given file or folder from an HDFS source path to an HDFS target path
{{{
$ bin/hadoop fs -cp own hadooper
}}}

=== -du ===

 * 顯示目錄中所有文件的大小 [[BR]] Display the size of each file in the given folder
{{{
$ bin/hadoop fs -du input
Found 4 items
115045564   hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/1.txt
987864      hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/2.txt
1573048     hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/3.txt
25844527    hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/4.txt
}}}

=== -dus ===

 * 顯示該目錄/文件的總大小 [[BR]] Display the total size of the given folder
{{{
$ bin/hadoop fs -dus input
hdfs://gm1.nchc.org.tw:9000/user/hadooper/input 143451003
}}}

=== -expunge ===

 * 清空垃圾桶 [[BR]] Empty the trash
{{{
$ bin/hadoop fs -expunge
}}}

=== -getmerge ===

 * 將來源目錄下所有的文件都集合到本地端一個檔案內 [[BR]] Merge all files in the HDFS source folder into one local file
{{{
$ bin/hadoop fs -getmerge <src> <localdst>
}}}
{{{
$ mkdir in1
$ echo "this is one; " >> in1/input
$ echo "this is two; " >> in1/input2
$ bin/hadoop fs -put in1 in1
$ bin/hadoop fs -getmerge in1 merge.txt
$ cat ./merge.txt
}}}

=== -ls ===

 * 列出文件或目錄的資訊 [[BR]] List files and folders
 * 文件:文件名 <副本數> 文件大小 修改日期 修改時間 權限 用戶ID 組ID [[BR]] Files: filename <replicas> size modification_date modification_time permission userid groupid
 * 目錄:目錄名 修改日期 修改時間 權限 用戶ID 組ID [[BR]] Folders: dirname modification_date modification_time permission userid groupid
{{{
$ bin/hadoop fs -ls
}}}

=== -lsr ===

 * ls 命令的遞迴版本 [[BR]] The recursive version of ls
{{{
$ bin/hadoop fs -lsr /
}}}

=== -mkdir ===

 * 建立資料夾 [[BR]] Create directories
{{{
$ bin/hadoop fs -mkdir a b c
}}}

=== -moveFromLocal ===

 * 將 local 端的資料夾剪下移動到 HDFS 上 [[BR]] Move local files or folders to HDFS (the local copy will be deleted)
{{{
$ bin/hadoop fs -moveFromLocal in1 in2
}}}

=== -mv ===

 * 更改檔案或目錄的名稱 [[BR]] Rename the given file or folder
{{{
$ bin/hadoop fs -mv in2 in3
}}}
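 * 以下為一個綜合練習的小示範,串接前面介紹過的 -mkdir、-put、-mv 與 -lsr 指令(其中 lab2、slaves.bak 為假設的範例名稱,並假設本地端有 conf/slaves 這個檔案):[[BR]]A minimal sketch chaining the -mkdir, -put, -mv and -lsr commands above (the names lab2 and slaves.bak are hypothetical, and the local file conf/slaves is assumed to exist):
{{{
$ bin/hadoop fs -mkdir lab2
$ bin/hadoop fs -put conf/slaves lab2
$ bin/hadoop fs -mv lab2/slaves lab2/slaves.bak
$ bin/hadoop fs -lsr lab2
}}}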
=== -rm ===

 * 刪除指定的檔案(不可刪資料夾)[[BR]] Remove the given files (not folders)
{{{
$ bin/hadoop fs -rm in1/input
}}}

=== -rmr ===

 * 遞迴刪除資料夾(包含在內的所有檔案)[[BR]] Remove the given files and folders recursively
{{{
$ bin/hadoop fs -rmr in1
}}}

=== -setrep ===

 * 設定副本係數 [[BR]] Set the replication factor of the given files or folder
{{{
$ bin/hadoop fs -setrep [-R] [-w] <rep> <path/file>
}}}
{{{
$ bin/hadoop fs -setrep -w 2 -R input
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/1.txt
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/2.txt
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/3.txt
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/4.txt
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/1.txt ... done
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/2.txt ... done
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/3.txt ... done
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/4.txt ... done
}}}

=== -stat ===

 * 印出該路徑的時間資訊 [[BR]] Print the modification timestamp of the given file or folder
{{{
$ bin/hadoop fs -stat input
2009-04-02 03:51:29
}}}

=== -tail ===

 * 將文件的最後 1K 內容輸出 [[BR]] Display the last 1KB of the given file
 * 用法 Usage
{{{
bin/hadoop fs -tail [-f] 檔案 (-f 參數用來持續顯示被 append 上的新內容)
bin/hadoop fs -tail [-f] <file> (-f keeps printing data appended to the file as it grows)
}}}
{{{
$ bin/hadoop fs -tail input/1.txt
}}}

=== -test ===

 * 測試檔案,-e 檢查文件是否存在(1=存在, 0=否),-z 檢查文件是否為空(1=空, 0=不為空),-d 檢查是否為目錄(1=是, 0=否)[[BR]] Test files or folders [[BR]] -e : check whether the file or folder exists ( 1 = exists, 0 = not ) [[BR]] -z : check whether the file is empty ( 1 = empty, 0 = not ) [[BR]] -d : check whether the given path is a folder ( 1 = folder, 0 = not )
 * 要用 echo $? 來看回傳值為 0 或 1 [[BR]] You have to use '''echo $?''' to see the return value
 * 用法 Usage
{{{
$ bin/hadoop fs -test -[ezd] URI
}}}
{{{
$ bin/hadoop fs -test -e /user/hadooper/input/5.txt
$ bin/hadoop fs -test -z /user/hadooper/input/5.txt
test: File does not exist: /user/hadooper/input/5.txt
$ bin/hadoop fs -test -d /user/hadooper/input/5.txt
test: File does not exist: /user/hadooper/input/5.txt
}}}

=== -text ===

 * 將檔案(如壓縮檔、TextRecordInputStream)輸出為純文字格式 [[BR]] Output archive files (e.g. gzip, TextRecordInputStream) as plain text to STDOUT
{{{
$ bin/hadoop fs -text <src>
}}}
{{{
$ bin/hadoop fs -text macadr-eth1.txt.gz
00:1b:fc:61:75:b1
00:1b:fc:58:9c:23
}}}
 * ps:目前沒支援 zip 的函式庫 [[BR]] PS. It does not support zip files yet.
{{{
$ bin/hadoop fs -text b/a.txt.zip
PK ���:��H{ a.txtUT b��Ib��IUx��sssss test PK ���:��H{ ��a.txtUTb��IUxPK@C
}}}

=== -touchz ===

 * 建立一個空文件 [[BR]] Create an empty file
{{{
$ bin/hadoop fs -touchz b/kk
$ bin/hadoop fs -test -z b/kk
$ echo $?
1
$ bin/hadoop fs -test -z b/a.txt.zip
$ echo $?
0
}}}
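 * 實作結束後,可以用類似以下的小示範清掉本實作產生的測試資料(以下路徑皆出自前面的範例,請依實際狀況調整):[[BR]]When you finish the lab, a minimal sketch like the following can clean up the test data created above (all paths come from the earlier examples; adjust them to your own environment):
{{{
$ bin/hadoop fs -rmr in3
$ bin/hadoop fs -rmr own
$ bin/hadoop fs -rmr dfs_input
$ bin/hadoop fs -lsr /user/hadooper
}}}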