Changes between Version 3 and Version 4 of NCTU110329/Lab4


Timestamp:
Apr 19, 2011, 9:57:33 AM
Author:
jazz

  • NCTU110329/Lab4

}}}

- * 以下練習,請連線至 hadoop.nchc.org.tw 操作。
+{{{
+#!text
+以下練習,請連線至 hadoop.nchc.org.tw 操作。底下的 hXXXX 等於您的用戶名稱。
+(For the exercises below, please connect to hadoop.nchc.org.tw. The hXXXX below stands for your username.)
+}}}

== Content 1: HDFS Shell 基本操作 (Basic HDFS Shell Operations) ==
     
{{{
~$ hadoop fs -ls
+Found 1 items
+drwxr-xr-x   - hXXXX supergroup          0 2010-01-24 17:23 /user/hXXXX/tmp
~$ hadoop fs -lsr
+drwxr-xr-x   - hXXXX supergroup          0 2010-01-24 17:23 /user/hXXXX/tmp
}}}

     

{{{
-~$ hadoop fs -put conf input
+~$ hadoop fs -put /etc/hadoop/conf input
}}}

     
{{{
~$ hadoop fs -ls
+Found 2 items
+drwxr-xr-x   - hXXXX supergroup          0 2011-04-19 09:16 /user/hXXXX/input
+drwxr-xr-x   - hXXXX supergroup          0 2010-01-24 17:23 /user/hXXXX/tmp
~$ hadoop fs -ls input
+Found 25 items
+-rw-r--r--   2 hXXXX supergroup        321 2011-04-19 09:16 /user/hXXXX/input/README
+-rw-r--r--   2 hXXXX supergroup       3936 2011-04-19 09:16 /user/hXXXX/input/capacity-scheduler.xml
+-rw-r--r--   2 hXXXX supergroup        196 2011-04-19 09:16 /user/hXXXX/input/commons-logging.properties
+(.... skip ....)
}}}

     

 * 檢查 Check
{{{
~$ ls -al | grep fromHDFS
+drwxr-xr-x    2 hXXXX hXXXX  4096 2011-04-19 09:18 fromHDFS
~$ ls -al fromHDFS
+總計 160
+drwxr-xr-x 2 hXXXX hXXXX  4096 2011-04-19 09:18 .
+drwx--x--x 3 hXXXX hXXXX  4096 2011-04-19 09:18 ..
+-rw-r--r-- 1 hXXXX hXXXX  3936 2011-04-19 09:18 capacity-scheduler.xml
+-rw-r--r-- 1 hXXXX hXXXX   196 2011-04-19 09:18 commons-logging.properties
+-rw-r--r-- 1 hXXXX hXXXX   535 2011-04-19 09:18 configuration.xsl
+(.... skip ....)
+~$ diff /etc/hadoop/conf fromHDFS/
}}}
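The Check step works because `diff -r` prints nothing and exits 0 when two directory trees match. A minimal local sketch of the same verification idea (no HDFS needed; the file name is illustrative):

```shell
#!/bin/sh
# Sketch: verify a copy the same way the Check step uses
# `diff /etc/hadoop/conf fromHDFS/` -- diff exits 0 and prints
# nothing when both directories have identical contents.
src=$(mktemp -d); dst=$(mktemp -d)
echo "export JAVA_HOME=/usr" > "$src/hadoop-env.sh"   # sample file
cp "$src/hadoop-env.sh" "$dst/"
if diff -r "$src" "$dst" > /dev/null; then
    echo "copy verified"
else
    echo "copy differs"
fi
rm -rf "$src" "$dst"
```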
     

{{{
-~$ hadoop fs -ls input
+~$ hadoop fs -ls input/masters
+Found 1 items
+-rw-r--r--   2 hXXXX supergroup         10 2011-04-19 09:16 /user/hXXXX/input/masters
~$ hadoop fs -rm input/masters
+Deleted hdfs://hadoop.nchc.org.tw/user/hXXXX/input/masters
}}}

     

{{{
-~$ hadoop fs -ls input
+~$ hadoop fs -ls input/slaves
+Found 1 items
+-rw-r--r--   2 hXXXX supergroup         10 2011-04-19 09:16 /user/hXXXX/input/slaves
~$ hadoop fs -cat input/slaves
+localhost
}}}

     

{{{
-hXXXX@vPro:~$ hadoop fs
+hXXXX@hadoop:~$ hadoop fs

Usage: java FsShell
     
{{{
$ hadoop fs -ls input
-Found 4 items
--rw-r--r--   2 hXXXX supergroup  115045564 2009-04-02 11:51 /user/hXXXX/input/1.txt
--rw-r--r--   2 hXXXX supergroup     987864 2009-04-02 11:51 /user/hXXXX/input/2.txt
--rw-r--r--   2 hXXXX supergroup    1573048 2009-04-02 11:51 /user/hXXXX/input/3.txt
--rw-r--r--   2 hXXXX supergroup   25844527 2009-04-02 11:51 /user/hXXXX/input/4.txt
+Found 25 items
+-rw-r--r--   2 hXXXX supergroup        321 2011-04-19 09:16 /user/hXXXX/input/README
+-rw-r--r--   2 hXXXX supergroup       3936 2011-04-19 09:16 /user/hXXXX/input/capacity-scheduler.xml
+-rw-r--r--   2 hXXXX supergroup        196 2011-04-19 09:16 /user/hXXXX/input/commons-logging.properties
+(.... skip ....)
}}}
 * 完整的路徑則是 '''hdfs://node:port/path''' 如:[[BR]]Or you have to give an __''absolute path''__, such as '''hdfs://node:port/path'''
{{{
-$ hadoop fs -ls hdfs://hadoop.nchc.org.tw:9000/user/hXXXX/input
-Found 4 items
--rw-r--r--   2 hXXXX supergroup  115045564 2009-04-02 11:51 /user/hXXXX/input/1.txt
--rw-r--r--   2 hXXXX supergroup     987864 2009-04-02 11:51 /user/hXXXX/input/2.txt
--rw-r--r--   2 hXXXX supergroup    1573048 2009-04-02 11:51 /user/hXXXX/input/3.txt
--rw-r--r--   2 hXXXX supergroup   25844527 2009-04-02 11:51 /user/hXXXX/input/4.txt
+$ hadoop fs -ls hdfs://hadoop.nchc.org.tw/user/hXXXX/input
+Found 25 items
+-rw-r--r--   2 hXXXX supergroup        321 2011-04-19 09:16 /user/hXXXX/input/README
+-rw-r--r--   2 hXXXX supergroup       3936 2011-04-19 09:16 /user/hXXXX/input/capacity-scheduler.xml
+-rw-r--r--   2 hXXXX supergroup        196 2011-04-19 09:16 /user/hXXXX/input/commons-logging.properties
+(.... skip ....)
}}}

 * 將路徑指定文件的內容輸出到 STDOUT [[BR]] Print given file content to STDOUT
{{{
-$ hadoop fs -cat quota/hadoop-env.sh
+$ hadoop fs -cat input/hadoop-env.sh
}}}

=== -chgrp ===

 * 改變文件所屬的組 [[BR]] Change '''owner group''' of given file or folder
{{{
-$ hadoop fs -chgrp -R hXXXX own
+$ hadoop fs -ls
+Found 2 items
+drwxr-xr-x   - hXXXX supergroup          0 2011-04-19 09:16 /user/hXXXX/input
+drwxr-xr-x   - hXXXX supergroup          0 2010-01-24 17:23 /user/hXXXX/tmp
+$ hadoop fs -chgrp -R ${USER} input
+$ hadoop fs -ls
+Found 2 items
+drwxr-xr-x   - hXXXX hXXXX               0 2011-04-19 09:21 /user/hXXXX/input
+drwxr-xr-x   - hXXXX supergroup          0 2010-01-24 17:23 /user/hXXXX/tmp
}}}

 * 改變文件的權限 [[BR]] Change '''read and write permission''' of given file or folder
{{{
-$ hadoop fs -chmod -R 755 own
+$ hadoop fs -ls
+Found 2 items
+drwxr-xr-x   - hXXXX hXXXX               0 2011-04-19 09:21 /user/hXXXX/input
+drwxr-xr-x   - hXXXX supergroup          0 2010-01-24 17:23 /user/hXXXX/tmp
+$ hadoop fs -chmod -R 755 input
+$ hadoop fs -ls
+Found 2 items
+drwxrwxrwx   - hXXXX hXXXX               0 2011-04-19 09:21 /user/hXXXX/input
+drwxr-xr-x   - hXXXX supergroup          0 2010-01-24 17:23 /user/hXXXX/tmp
}}}
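HDFS permission bits use the same octal scheme as POSIX file modes. A local refresher sketch of what an octal mode means (plain local files, no HDFS; the mode values are just examples):

```shell
#!/bin/sh
# Octal permission refresher: each digit is owner/group/other,
# 7 = rwx, 5 = r-x, 4 = r--. So 755 gives the owner full access
# and everyone else read+execute, exactly as in -chmod above.
f=$(mktemp)
chmod 755 "$f"
ls -l "$f" | cut -c1-10    # -rwxr-xr-x
chmod 644 "$f"
ls -l "$f" | cut -c1-10    # -rw-r--r--
rm -f "$f"
```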

 * 改變文件的擁有者 [[BR]] Change '''owner''' of given file or folder
{{{
-$ hadoop fs -chown -R hXXXX own
+$ hadoop fs -chown -R ${USER} input
+}}}
+ * 注意:因為在 hadoop.nchc.org.tw 上您沒有管理者權限,因此若要改成其他使用者時,會看到類似以下的錯誤訊息:[[BR]] Note: since you do not have super-user permission on hadoop.nchc.org.tw, changing the owner to another user produces an error message like the following:
+{{{
+$ hadoop fs -chown -R h1000 input
+chown: changing ownership of 'hdfs://hadoop.nchc.org.tw/user/hXXXX/input':org.apache.hadoop.security.AccessControlException: Non-super user cannot change owner.
}}}

 * 從 local 放檔案到 hdfs [[BR]] Both commands will copy given file or folder from local to HDFS
{{{
-$ hadoop fs -put input dfs_input
+$ hadoop fs -copyFromLocal /etc/hadoop/conf dfs_input
}}}

 * 把 hdfs 上的檔案下載到 local [[BR]] Both commands will copy given file or folder from HDFS to local
{{{
-$ hadoop fs -get dfs_input input1
+$ hadoop fs -copyToLocal dfs_input input1
}}}

 * 將文件從 hdfs 原本路徑複製到 hdfs 目標路徑 [[BR]] Copy given file or folder from HDFS source path to HDFS target path
{{{
-$ hadoop fs -cp own hXXXX
+$ hadoop fs -cp input input1
}}}

{{{
$ hadoop fs -du input
-
-Found 4 items
-115045564   hdfs://hadoop.nchc.org.tw:9000/user/hXXXX/input/1.txt
-987864      hdfs://hadoop.nchc.org.tw:9000/user/hXXXX/input/2.txt
-1573048     hdfs://hadoop.nchc.org.tw:9000/user/hXXXX/input/3.txt
-25844527    hdfs://hadoop.nchc.org.tw:9000/user/hXXXX/input/4.txt
-}}}
+Found 24 items
+321         hdfs://hadoop.nchc.org.tw/user/hXXXX/input/README
+3936        hdfs://hadoop.nchc.org.tw/user/hXXXX/input/capacity-scheduler.xml
+196         hdfs://hadoop.nchc.org.tw/user/hXXXX/input/commons-logging.properties
+( .... skip .... )
+}}}

=== -dus ===
     
{{{
$ hadoop fs -dus input
-
-hdfs://hadoop.nchc.org.tw:9000/user/hXXXX/input 143451003
+hdfs://hadoop.nchc.org.tw/user/hXXXX/input      84218
}}}

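`-du` reports the size of each file in bytes and `-dus` reports one total for the whole path. The same numbers can be sanity-checked locally by summing file sizes (a local sketch; the two 14-byte files mirror the `in1` files created in the -getmerge example later):

```shell
#!/bin/sh
# Sketch: what -du (per file) and -dus (one total) count -- raw
# file sizes in bytes, computed here with plain local tools.
d=$(mktemp -d)
printf 'this is one; \n' > "$d/input"     # 14 bytes
printf 'this is two; \n' > "$d/input2"    # 14 bytes
wc -c < "$d/input"                        # per-file size, like -du
find "$d" -type f -exec cat {} + | wc -c  # total size, like -dus
rm -rf "$d"
```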
     
}}}
{{{
+$ mkdir -p in1
$ echo "this is one; " >> in1/input
$ echo "this is two; " >> in1/input2

$ cat ./merge.txt
}}}
- == -ls ===
+ * 您應該會看到類似底下的結果:[[BR]]You should see results like this:
+{{{
+this is one;
+this is two;
+}}}
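`-getmerge` concatenates every file under an HDFS directory into a single local file. A local sketch of the same effect using plain `cat` (no HDFS; note that plain shell globbing concatenates in sorted name order, which only approximates what getmerge does):

```shell
#!/bin/sh
# Sketch: the effect of hadoop fs -getmerge, approximated locally --
# concatenate all files in a directory into one merged file.
d=$(mktemp -d)
mkdir -p "$d/in1"
echo "this is one; " >> "$d/in1/input"
echo "this is two; " >> "$d/in1/input2"
cat "$d"/in1/* > "$d/merge.txt"   # merge all files into one
cat "$d/merge.txt"                # both lines, in name order
rm -rf "$d"
```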

=== -ls ===

 * 列出文件或目錄的資訊 [[BR]] List files and folders
     
{{{
$ hadoop fs -ls
+Found 5 items
+drwxr-xr-x   - hXXXX supergroup          0 2011-04-19 09:32 /user/hXXXX/dfs_input
+drwxr-xr-x   - hXXXX supergroup          0 2011-04-19 09:34 /user/hXXXX/in1
+drwxrwxrwx   - hXXXX hXXXX               0 2011-04-19 09:21 /user/hXXXX/input
+drwxr-xr-x   - hXXXX supergroup          0 2011-04-19 09:33 /user/hXXXX/input1
+drwxr-xr-x   - hXXXX supergroup          0 2010-01-24 17:23 /user/hXXXX/tmp
}}}

     
 * ls 命令的遞迴版本 [[BR]] Recursive version of the ls command
{{{
-$ hadoop fs -lsr /
+$ hadoop fs -lsr in1
+-rw-r--r--   2 hXXXX supergroup         14 2011-04-19 09:34 /user/hXXXX/in1/input
+-rw-r--r--   2 hXXXX supergroup         14 2011-04-19 09:34 /user/hXXXX/in1/input2
}}}

     
{{{
$ hadoop fs -rm in1/input
+Deleted hdfs://hadoop.nchc.org.tw/user/hXXXX/in1/input
}}}
=== -rmr ===

 * 遞迴刪除資料夾(包含在內的所有檔案) [[BR]] Remove given files and folders recursively
{{{
-$ hadoop fs -rmr in1
+$ hadoop fs -rmr a b c dfs_input in3 input input1
+Deleted hdfs://hadoop.nchc.org.tw/user/hXXXX/a
+Deleted hdfs://hadoop.nchc.org.tw/user/hXXXX/b
+Deleted hdfs://hadoop.nchc.org.tw/user/hXXXX/c
+Deleted hdfs://hadoop.nchc.org.tw/user/hXXXX/dfs_input
+Deleted hdfs://hadoop.nchc.org.tw/user/hXXXX/in3
+Deleted hdfs://hadoop.nchc.org.tw/user/hXXXX/input
+Deleted hdfs://hadoop.nchc.org.tw/user/hXXXX/input1
}}}

     
}}}
{{{
-$ hadoop fs -setrep -w 2 -R input
-Replication 2 set: hdfs://hadoop.nchc.org.tw:9000/user/hXXXX/input/1.txt
-Replication 2 set: hdfs://hadoop.nchc.org.tw:9000/user/hXXXX/input/2.txt
-Replication 2 set: hdfs://hadoop.nchc.org.tw:9000/user/hXXXX/input/3.txt
-Replication 2 set: hdfs://hadoop.nchc.org.tw:9000/user/hXXXX/input/4.txt
-Waiting for hdfs://hadoop.nchc.org.tw:9000/user/hXXXX/input/1.txt ... done
-Waiting for hdfs://hadoop.nchc.org.tw:9000/user/hXXXX/input/2.txt ... done
-Waiting for hdfs://hadoop.nchc.org.tw:9000/user/hXXXX/input/3.txt ... done
-Waiting for hdfs://hadoop.nchc.org.tw:9000/user/hXXXX/input/4.txt ... done
+$ hadoop fs -setrep -w 2 -R in1
+Replication 2 set: hdfs://hadoop.nchc.org.tw/user/hXXXX/in1/input2
+Waiting for hdfs://hadoop.nchc.org.tw/user/hXXXX/in1/input2 ... done
}}}

 * 印出時間資訊 [[BR]] Print the timestamp of the given file or folder
{{{
-$ hadoop fs -stat input
-2009-04-02 03:51:29
+$ hadoop fs -stat in1
+2011-04-19 09:34:49
}}}
=== -tail ===

}}}
{{{
-$ hadoop fs -tail input/1.txt
+$ hadoop fs -tail in1/input2
+this is two;
}}}
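`hadoop fs -tail` displays the last kilobyte of a file; the local equivalent is `tail -c 1024`. A minimal sketch (local file, no HDFS):

```shell
#!/bin/sh
# Sketch: hadoop fs -tail shows the last 1 KB of a file; locally
# the same view is tail -c 1024. For a file smaller than 1 KB,
# the whole content is printed.
f=$(mktemp)
echo "this is two; " > "$f"
tail -c 1024 "$f"
rm -f "$f"
```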

{{{
-$ hadoop fs -test -e /user/hXXXX/input/5.txt
-$ hadoop fs -test -z /user/hXXXX/input/5.txt
-test: File does not exist: /user/hXXXX/input/5.txt
-$ hadoop fs -test -d /user/hXXXX/input/5.txt
-
-test: File does not exist: /user/hXXXX/input/5.txt
-}}}
-
-=== -text ===
-
- * 將檔案(如壓縮檔, textrecordinputstream)輸出為純文字格式 [[BR]] Display archive file contents into STDOUT
-{{{
-$ hadoop fs -text <src>
-}}}
-{{{
-$ hadoop fs -text macadr-eth1.txt.gz
-00:1b:fc:61:75:b1
-00:1b:fc:58:9c:23
-}}}
- * ps : 目前沒支援zip的函式庫 [[BR]] PS. It does not support zip files yet.
-{{{
-$ hadoop fs -text b/a.txt.zip
-PK
-���:��H{
-        a.txtUT b��Ib��IUx��sssss
-test
-PK
-���:��H{
-��a.txtUTb��IUxPK@C
-}}}
-
-=== -touchz ===
-
- * 建立一個空文件 [[BR]] creat an empty file
-{{{
-$ hadoop fs -touchz b/kk
-$ hadoop fs -test -z b/kk
+$ hadoop fs -test -e in1/input2
+$ echo $?
+0
+$ hadoop fs -test -z in1/input3
$ echo $?
1
-$ hadoop fs -test -z b/a.txt.zip
+$ hadoop fs -test -d in1/input2
+$ echo $?
+1
+}}}
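`hadoop fs -test` mirrors the shell's `test(1)`: `-e` checks existence, `-z` checks for zero length, `-d` checks for a directory, and the answer comes back in the exit status, which is why the lab reads it with `echo $?`. A local sketch of the same exit-code pattern (note one difference: the local shell's `-z` tests strings, so the zero-length file check uses negated `-s` instead):

```shell
#!/bin/sh
# Sketch: the -e / -d / zero-length checks done with the local
# test(1) command; hadoop fs -test reports results the same way,
# via the exit status.
d=$(mktemp -d)
touch "$d/input2"           # an existing, empty file
test -e "$d/input2"; echo $?   # 0: exists
test -e "$d/input3"; echo $?   # 1: does not exist
test -d "$d/input2"; echo $?   # 1: not a directory
test -s "$d/input2"; echo $?   # 1: -s means non-empty, so empty file fails
rm -rf "$d"
```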
+
+=== -text ===
+
+ * 將檔案(如壓縮檔, textrecordinputstream)輸出為純文字格式 [[BR]] Display archive file contents into STDOUT
+{{{
+$ hadoop fs -text <src>
+}}}
+{{{
+$ gzip merge.txt
+$ hadoop fs -put merge.txt.gz .
+$ hadoop fs -text merge.txt.gz
+11/04/19 09:54:16 INFO util.NativeCodeLoader: Loaded the native-hadoop library
+11/04/19 09:54:16 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
+this is one;
+this is two;
+}}}
+ * ps : 目前沒支援zip的函式庫 [[BR]] PS. It does not support zip files yet.
+{{{
+$ gunzip merge.txt.gz
+$ zip merge.zip merge.txt
+$ hadoop fs -put merge.zip .
+$ hadoop fs -text merge.zip
+PK�N�>E73
+
+merge.txtUT     ���Mq��Mux
+��+��,V���Tk�(��<�PK
+�N�>E73
+��merge.txtUT���Mux
+��PKOY
+}}}
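The gzip round-trip above can be reproduced entirely locally: `hadoop fs -text` applies zlib decompression to `.gz` files, which is the same transformation `gzip -dc` performs. A local sketch (file names are illustrative):

```shell
#!/bin/sh
# Sketch: the decompression hadoop fs -text applies to .gz files,
# reproduced locally with gzip.
d=$(mktemp -d)
printf 'this is one; \nthis is two; \n' > "$d/merge.txt"
gzip "$d/merge.txt"          # produces merge.txt.gz
gzip -dc "$d/merge.txt.gz"   # prints the original two lines
rm -rf "$d"
```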
+
+=== -touchz ===
+
+ * 建立一個空文件 [[BR]] Create an empty file
+{{{
+$ hadoop fs -touchz in1/kk
+$ hadoop fs -test -z in1/kk
$ echo $?
0
}}}
+
+----
+
+ * 您可以用以下指令把以上練習產生的暫存目錄與檔案清除:[[BR]]You can clean up the temporary folders and files using the following commands:
+{{{
+~$ hadoop fs -rmr in1 merge.txt.gz merge.zip
+~$ rm -rf input1/ fromHDFS/ merge.txt
+}}}