Version 1 (modified by jazz, 16 years ago) (diff)
--

Lab 2: HDFS Shell Exercises

1. Introduction
Content 1. Basic HDFS Shell Operations
Content 2. Browsing Information with the Web GUI
Content 3. More HDFS Shell Usage
1. -cat
2. -chgrp
3. -chmod
4. -chown
5. -copyFromLocal, -put
6. -copyToLocal, -get
7. -cp
8. -du
9. -dus
10. -expunge
11. -getmerge
12. -ls
13. -lsr
14. -mkdir
15. -moveFromLocal
16. -mv
17. -rm
18. -rmr
19. -setrep
20. -stat
21. -tail
22. -test
23. -text
24. -touchz

Introduction

This lab continues from Lab 1.

Content 1. Basic HDFS Shell Operations

1.1 Browse your HDFS directory

/opt/hadoop$ bin/hadoop fs -ls

1.2 Upload data to an HDFS directory

Upload

/opt/hadoop$ bin/hadoop fs -put conf input

Check

/opt/hadoop$ bin/hadoop fs -ls
/opt/hadoop$ bin/hadoop fs -ls input

1.3 Download data from HDFS to a local directory

Download

/opt/hadoop$ bin/hadoop fs -get input fromHDFS

Check

/opt/hadoop$ ls -al | grep fromHDFS
/opt/hadoop$ ls -al fromHDFS

1.4 Delete a file

/opt/hadoop$ bin/hadoop fs -ls input
/opt/hadoop$ bin/hadoop fs -rm input/masters

1.5 View a file directly

/opt/hadoop$ bin/hadoop fs -ls input
/opt/hadoop$ bin/hadoop fs -cat input/slaves

1.6 More commands

hadooper@vPro:/opt/hadoop$ bin/hadoop fs 

Usage: java FsShell
           [-ls <path>]
           [-lsr <path>]
           [-du <path>]
           [-dus <path>]
           [-count[-q] <path>]
           [-mv <src> <dst>]
           [-cp <src> <dst>]
           [-rm <path>]
           [-rmr <path>]
           [-expunge]
           [-put <localsrc> ... <dst>]
           [-copyFromLocal <localsrc> ... <dst>]
           [-moveFromLocal <localsrc> ... <dst>]
           [-get [-ignoreCrc] [-crc] <src> <localdst>]
           [-getmerge <src> <localdst> [addnl]]
           [-cat <src>]
           [-text <src>]
           [-copyToLocal [-ignoreCrc] [-crc] <src> <localdst>]
           [-moveToLocal [-crc] <src> <localdst>]
           [-mkdir <path>]
           [-setrep [-R] [-w] <rep> <path/file>]
           [-touchz <path>]
           [-test -[ezd] <path>]
           [-stat [format] <path>]
           [-tail [-f] <file>]
           [-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
           [-chown [-R] [OWNER][:[GROUP]] PATH...]
           [-chgrp [-R] GROUP PATH...]
           [-help [cmd]]

Generic options supported are
-conf <configuration file>     specify an application configuration file
-D <property=value>            use value for given property
-fs <local|namenode:port>      specify a namenode
-jt <local|jobtracker:port>    specify a job tracker
-files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]

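Content 2. Browsing Information with the Web GUI

Content 3. More HDFS Shell Usage

The general form is bin/hadoop fs <args>; the usage of each <args> is listed below.

Relative paths in the operations below resolve under the default directory /user/<$username>/.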

$ bin/hadoop fs -ls input
Found 4 items
-rw-r--r--   2 hadooper supergroup  115045564 2009-04-02 11:51 /user/hadooper/input/1.txt
-rw-r--r--   2 hadooper supergroup     987864 2009-04-02 11:51 /user/hadooper/input/2.txt
-rw-r--r--   2 hadooper supergroup    1573048 2009-04-02 11:51 /user/hadooper/input/3.txt
-rw-r--r--   2 hadooper supergroup   25844527 2009-04-02 11:51 /user/hadooper/input/4.txt

The full path has the form hdfs://node:port/path, for example:

$ bin/hadoop fs -ls hdfs://gm1.nchc.org.tw:9000/user/hadooper/input
Found 4 items
-rw-r--r--   2 hadooper supergroup  115045564 2009-04-02 11:51 /user/hadooper/input/1.txt
-rw-r--r--   2 hadooper supergroup     987864 2009-04-02 11:51 /user/hadooper/input/2.txt
-rw-r--r--   2 hadooper supergroup    1573048 2009-04-02 11:51 /user/hadooper/input/3.txt
-rw-r--r--   2 hadooper supergroup   25844527 2009-04-02 11:51 /user/hadooper/input/4.txt
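The relationship between a relative path and its full URI can be sketched as a small shell function. The function name `hdfs_full_uri` is hypothetical and not part of Hadoop; the host, port, and user are simply the ones that appear in the listings above, and the sketch assumes relative paths resolve under /user/<username>/.

```shell
# Hypothetical helper (not part of Hadoop): expand an HDFS path to its full URI,
# assuming relative paths resolve under /user/<username>/.
hdfs_full_uri() {
  local namenode="$1" user="$2" path="$3"
  case "$path" in
    /*) printf 'hdfs://%s%s\n' "$namenode" "$path" ;;               # already absolute
    *)  printf 'hdfs://%s/user/%s/%s\n' "$namenode" "$user" "$path" ;;
  esac
}

# Namenode, user, and path taken from the listing above.
uri=$(hdfs_full_uri gm1.nchc.org.tw:9000 hadooper input)
echo "$uri"
```

The resulting string is the same URI that is passed to -ls in the second listing.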

-cat

Prints the contents of the file at the given path to stdout.
```
$ bin/hadoop fs -cat quota/hadoop-env.sh
```

-chgrp

Changes the group that a file belongs to.
```
$ bin/hadoop fs -chgrp -R hadooper own
```

-chmod

Changes the permissions of a file.
```
$ bin/hadoop fs -chmod -R 755 own
```

-chown

Changes the owner of a file.
```
$ bin/hadoop fs -chown -R hadooper own
```

-copyFromLocal, -put

Copies files from the local file system to HDFS.
```
$ bin/hadoop fs -put input dfs_input
```

-copyToLocal, -get

Downloads files from HDFS to the local file system.
```
$ bin/hadoop fs -get dfs_input input1
```

-cp

Copies a file from a source path in HDFS to a destination path in HDFS.
```
$ bin/hadoop fs -cp own hadooper
```

-du

Shows the size of every file in the directory.

$ bin/hadoop fs -du input

Found 4 items
115045564   hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/1.txt
987864      hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/2.txt
1573048     hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/3.txt
25844527    hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/4.txt

-dus

Shows the total size of the directory or file.

$ bin/hadoop fs -dus input

hdfs://gm1.nchc.org.tw:9000/user/hadooper/input	143451003
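The total reported by -dus is the sum of the per-file sizes reported by -du. A minimal sketch of that arithmetic, run on a copy of the -du output above rather than on a live cluster (the variable names are illustrative only):

```shell
# Sample -du output copied from the listing above (size, then path).
du_output='Found 4 items
115045564   hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/1.txt
987864      hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/2.txt
1573048     hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/3.txt
25844527    hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/4.txt'

# Sum the first column, skipping lines such as "Found 4 items" whose first
# field is not a number. %d avoids scientific notation for large sums.
total=$(printf '%s\n' "$du_output" |
  awk '$1 ~ /^[0-9]+$/ { sum += $1 } END { printf "%d\n", sum }')
echo "$total"   # 143451003, the value printed by -dus
```

Against a real cluster, the same awk filter could be fed from `bin/hadoop fs -du input` instead of the copied text.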

-expunge

Empties the trash.
```
$ bin/hadoop fs -expunge
```

-getmerge

Merges all files under the source directory <src> into a single local file <localdst>.

bin/hadoop fs -getmerge <src> <localdst>

$ mkdir in1
$ echo "this is one; " >> in1/input
$ echo "this is two; " >> in1/input2
$ bin/hadoop fs -put in1 in1
$ bin/hadoop fs -getmerge in1 merge.txt
$ cat ./merge.txt
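What -getmerge produces can be pictured with a purely local analogue. Nothing in this sketch touches HDFS: it only concatenates two local files in the same way, and it assumes the files are merged in name order. The temporary directory is created just for the illustration.

```shell
# Local analogue only: no HDFS involved.
workdir=$(mktemp -d)
mkdir "$workdir/in1"
echo "this is one; " >> "$workdir/in1/input"
echo "this is two; " >> "$workdir/in1/input2"

# Concatenate every file in the directory into one file, in name order
# (assumed here to match the order -getmerge uses).
cat "$workdir"/in1/* > "$workdir/merge.txt"

merged=$(cat "$workdir/merge.txt")
echo "$merged"

# Clean up the temporary directory.
rm -r "$workdir"
```

The merged file holds the two lines one after the other, which is what `cat ./merge.txt` should show after the real -getmerge run.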

-ls

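Lists information about a file or directory.
For a file, the columns are: permissions, replication factor, owner, group, file size, modification date, modification time, path (see the sample listing earlier in this section).
For a directory, the columns are the same, except that no replication factor applies.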
```
$ bin/hadoop fs -ls
```
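The columns of a listing line, as they appear in the sample output earlier in this section, can be pulled apart with awk. This is a minimal sketch that works on a copied line; the field labels are chosen for illustration and are not names used by Hadoop.

```shell
# One line of -ls output, copied from the listing earlier in this section.
ls_line='-rw-r--r--   2 hadooper supergroup  115045564 2009-04-02 11:51 /user/hadooper/input/1.txt'

# Label the whitespace-separated columns.
fields=$(printf '%s\n' "$ls_line" | awk '{
  printf "perm=%s rep=%s owner=%s group=%s size=%s date=%s time=%s path=%s\n",
         $1, $2, $3, $4, $5, $6, $7, $8
}')
echo "$fields"
```

Splitting on whitespace is enough here because none of the sample paths contain spaces; a path with spaces would need different handling.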

-lsr

Recursive version of the ls command.
```
$ bin/hadoop fs -lsr /
```

-mkdir

Creates directories.
```
$ bin/hadoop fs -mkdir a b c
```

-moveFromLocal

Moves a local directory to HDFS; the local copy is removed afterwards.
```
$ bin/hadoop fs -moveFromLocal in1 in2
```

-mv

Renames (moves) a file or directory within HDFS.
```
$ bin/hadoop fs -mv in2 in3
```

-rm

Deletes the specified file (directories are not accepted).
```
$ bin/hadoop fs -rm in1/input
```

-rmr

Recursively deletes a directory, including all files in it.
```
$ bin/hadoop fs -rmr in1
```

-setrep

Sets the replication factor.

bin/hadoop fs -setrep [-R] [-w] <rep> <path/file>

$ bin/hadoop fs -setrep -w 2 -R input 
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/1.txt
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/2.txt
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/3.txt
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/4.txt
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/1.txt ... done
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/2.txt ... done
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/3.txt ... done
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/4.txt ... done

-stat

Prints the modification time of the path.

$ bin/hadoop fs -stat input
2009-04-02 03:51:29
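The time printed by -stat (03:51) differs from the 11:51 shown by -ls for the same directory contents. A likely explanation, assumed in this sketch, is that -stat prints UTC while -ls prints the local time of a machine in the UTC+8 zone. The conversion can be checked with GNU date; the `-d` option and the POSIX TZ string `CST-8` (a zone 8 hours ahead of UTC) are requirements of the sketch, not of Hadoop.

```shell
# Timestamp printed by -stat above, interpreted as UTC (an assumption of this sketch).
stat_utc='2009-04-02 03:51:29'

# Convert to a zone 8 hours ahead of UTC. Requires GNU date for the -d option.
local_time=$(TZ='CST-8' date -d "$stat_utc UTC" '+%Y-%m-%d %H:%M')
echo "$local_time"   # 2009-04-02 11:51, matching the -ls listings
```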

-tail

Prints the last 1 KB of the file.
Usage: bin/hadoop fs -tail [-f] <file> (with -f, content appended as the file grows is printed as well)
```
$ bin/hadoop fs -tail input/1.txt
```
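For comparison, the local `tail` command can produce the same kind of output on a local file. This sketch assumes that "1 KB" means 1024 bytes and uses a throwaway file of filler bytes; it does not involve HDFS.

```shell
# Local analogue only: build a 3000-byte file of filler characters.
tmpfile=$(mktemp)
head -c 3000 /dev/zero | tr '\0' 'x' > "$tmpfile"

# Keep only the last 1024 bytes and count them.
tail_bytes=$(tail -c 1024 "$tmpfile" | wc -c | tr -d ' ')
echo "$tail_bytes"   # 1024

# Remove the throwaway file.
rm "$tmpfile"
```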

-test

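Tests a path. -e checks whether the path exists (1 = exists, 0 = does not exist); -z checks whether the file is empty (1 = empty, 0 = not empty); -d checks whether the path is a directory (1 = directory, 0 = not a directory).
- Use echo $? to see whether the return value is 0 or 1. These are the values returned by the Hadoop version used in this lab; newer releases follow the Unix convention and return 0 when the test is true.
Usage: bin/hadoop fs -test -[ezd] URI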

$ bin/hadoop fs -test -e /user/hadooper/input/5.txt
$ bin/hadoop fs -test -z /user/hadooper/input/5.txt
test: File does not exist: /user/hadooper/input/5.txt
$ bin/hadoop fs -test -d /user/hadooper/input/5.txt

test: File does not exist: /user/hadooper/input/5.txt
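Because the result of -test is only visible through the exit status, it can help to wrap the call so the status is printed right away. The sketch below is one way to do that. The function name `hdfs_test_status` and the variable `HADOOP_CMD` are hypothetical names introduced for illustration; the function prints the raw status and leaves its interpretation (which value means "true") to the Hadoop version in use.

```shell
# HADOOP_CMD is an assumption of this sketch: the command used to start Hadoop.
HADOOP_CMD="${HADOOP_CMD:-bin/hadoop}"

# Hypothetical helper: run `fs -test <flag> <path>` and print the raw exit status.
# Whether 0 or 1 means "true" depends on the Hadoop version, so no
# interpretation is attempted here.
hdfs_test_status() {
  local flag="$1" path="$2" status
  "$HADOOP_CMD" fs -test "$flag" "$path"
  status=$?
  echo "$status"
}
```

For example, `hdfs_test_status -e /user/hadooper/input/5.txt` runs the first command shown above and prints its exit status on the next line.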

-text

Outputs a file (for example a compressed file or a TextRecordInputStream) as plain text.

hadoop fs -text <src>

$ hadoop fs -text macadr-eth1.txt.gz
00:1b:fc:61:75:b1
00:1b:fc:58:9c:23

Note: zip archives are not currently supported by the library.

$ bin/hadoop fs -text b/a.txt.zip
PK... (the raw bytes of the zip archive are printed as unreadable binary; only the entry name a.txt and the content "test" are recognizable)

-touchz

Creates an empty file.

$ bin/hadoop fs -touchz b/kk
$ bin/hadoop fs -test -z b/kk
$ echo $?
1
$ bin/hadoop fs -test -z b/a.txt.zip
$ echo $?
0
