[[PageOutline]]

= Hadoop Distributed File System User Guide, Part 2 =

== Upgrading ==

 * Switching versions would normally overwrite the configuration files under conf, so the current practice is: move conf to /opt/conf and point both hadoop 0.16 and hadoop 0.18 at it with an ln symlink. Since conf is no longer under hadoop_home, remember to source conf/hadoop-env.sh:
{{{
$ source /opt/conf/hadoop-env.sh
}}}
 * Check the upgrade status first:
{{{
$ bin/hadoop dfsadmin -upgradeProgress status
There are no upgrades in progress.
}}}
 * Stop HDFS
   * Note: do not use bin/stop-all.sh for this
{{{
$ bin/stop-dfs.sh
}}}
 * Deploy the new version of Hadoop
   * Note: every node must run the same version, or problems will occur
 * Start HDFS in upgrade mode:
{{{
$ bin/start-dfs.sh -upgrade
}}}
 * ps: bin/hadoop namenode -upgrade is covered later; it remains to be checked how it differs from bin/start-dfs.sh -upgrade
 * The namenode web UI shows the upgrade status

== Rollback ==

 * Stop the cluster:
{{{
$ bin/stop-dfs.sh
}}}
 * Deploy the old version of Hadoop
 * Roll back to the previous version:
{{{
$ bin/start-dfs.sh -rollback
}}}
 * ps: bin/hadoop namenode -rollback is covered later; it remains to be checked how it differs from bin/start-dfs.sh -rollback

= bin/hadoop: user commands =

{{{
$ hadoop [--config confdir] [COMMAND] [GENERIC_OPTIONS] [COMMAND_OPTIONS]
}}}

[GENERIC_OPTIONS]:
|| -conf || Specify an application configuration file. ||
|| -D || Set a value for the given property. ||
|| -fs || Specify the namenode. ||
|| -jt || Specify the job tracker. Applies to jobs only. ||

== archive ==

 * archive packs data into a single file; while packing, it records the directory structure of the archived data in the index and masterindex files.
 * Every uploaded file occupies at least one block, so the four files in my input directory take up four blocks; once packed, blocks are allocated according to the total archive size instead.
 * Usage: hadoop archive -archiveName <name> <src>* <dest>
{{{
$ bin/hadoop archive -archiveName foo.har input/* output
09/04/02 14:02:30 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
09/04/02 14:02:30 INFO mapred.JobClient: Running job: job_200904021140_0001
09/04/02 14:02:31 INFO mapred.JobClient:  map 0% reduce 0%
09/04/02 14:02:44 INFO mapred.JobClient:  map 20% reduce 0%
09/04/02 14:02:49 INFO mapred.JobClient:  map 100% reduce 0%
09/04/02 14:02:56 INFO mapred.JobClient: Job complete: job_200904021140_0001
... (truncated)
}}}
 * List the file structure inside the har:
{{{
$ bin/hadoop dfs -lsr /user/waue/output/foo.har
}}}
 * View the contents of a file inside the har:
{{{
$ bin/hadoop dfs -cat /user/waue/output/foo.har/part-0
}}}
 * ps: the form given in the official documentation, hadoop dfs -lsr har:///user/hadoop/output/foo.har, produces an error here!
{{{
#!sh
lsr: could not get get listing for 'har:/user/waue/output/foo.har/user/waue' : File: har://hdfs-gm1.nchc.org.tw:9000/user/waue/output/foo.har/user/waue/input does not exist in har:///user/waue/output/foo.har
}}}

== distCp ==

 * A tool for large-scale copying within one cluster and between clusters
 * Implemented with Map/Reduce, which handles file distribution, error handling and recovery, and report generation
 * Example:
{{{
hadoop distcp hdfs://nn1:8020/foo/bar hdfs://nn2:8020/bar/foo
}}}
 * ?? Port 8020 is not open on our machines; also, since files are spread evenly across the nodes, how does it know that the file on node nn1 should be copied to nn2?
 * Reference: http://cn.hadoop.org/doc/distcp.html

== fsck ==

 * The HDFS filesystem checking tool
{{{
$ bin/hadoop fsck /
.
/user/waue/input/1.txt:  Under replicated blk_-90085106852013388_1001. Target Replicas is 3 but found 2 replica(s).
/user/waue/input/1.txt:  Under replicated blk_-4027196261436469955_1001. Target Replicas is 3 but found 2 replica(s).
.
/user/waue/input/2.txt:  Under replicated blk_-2300843106107816641_1002. Target Replicas is 3 but found 2 replica(s).
.
/user/waue/input/3.txt:  Under replicated blk_-1561577350198661966_1003. Target Replicas is 3 but found 2 replica(s).
.
/user/waue/input/4.txt:  Under replicated blk_1316726598778579026_1004. Target Replicas is 3 but found 2 replica(s).
Status: HEALTHY
 Total size:    143451003 B
 Total dirs:    8
 Total files:   4
 Total blocks (validated):      5 (avg. block size 28690200 B)
 Minimally replicated blocks:   5 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       5 (100.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     2.0
 Corrupt blocks:                0
 Missing replicas:              5 (50.0 %)
 Number of data-nodes:          2
 Number of racks:               1
The filesystem under path '/' is HEALTHY
}}}
 * Different options serve different purposes, for example:
{{{
$ bin/hadoop fsck / -files
/tmp
/tmp/hadoop
/tmp/hadoop/hadoop-waue
/tmp/hadoop/hadoop-waue/mapred
/tmp/hadoop/hadoop-waue/mapred/system
/user
/user/waue
/user/waue/input
/user/waue/input/1.txt 115045564 bytes, 2 block(s):  Under replicated blk_-90085106852013388_1001. Target Replicas is 3 but found 2 replica(s).  Under replicated blk_-4027196261436469955_1001. Target Replicas is 3 but found 2 replica(s).
/user/waue/input/2.txt 987864 bytes, 1 block(s):  Under replicated blk_-2300843106107816641_1002. Target Replicas is 3 but found 2 replica(s).
/user/waue/input/3.txt 1573048 bytes, 1 block(s):  Under replicated blk_-1561577350198661966_1003. Target Replicas is 3 but found 2 replica(s).
/user/waue/input/4.txt 25844527 bytes, 1 block(s):  Under replicated blk_1316726598778579026_1004. Target Replicas is 3 but found 2 replica(s).
Status: HEALTHY
... (same as above)
}}}

|| -move || -delete || -openforwrite ||
|| Move corrupted files to /lost+found || Delete corrupted files || Print files that are open for writing ||

|| -files || -blocks || -locations || -racks ||
|| Print the files being checked || Print block information || Print the location of every block || Print the network topology of the datanodes ||

For example:
{{{
$ bin/hadoop fsck /user/waue/input/1.txt -files -blocks -locations
/user/waue/input/1.txt 115045564 bytes, 2 block(s):  Under replicated blk_-90085106852013388_1001. Target Replicas is 3 but found 2 replica(s).  Under replicated blk_-4027196261436469955_1001. Target Replicas is 3 but found 2 replica(s).
0. blk_-90085106852013388_1001 len=67108864 repl=2 [140.110.138.191:50010, 140.110.141.129:50010]
1. blk_-4027196261436469955_1001 len=47936700 repl=2 [140.110.138.191:50010, 140.110.141.129:50010]
Status: HEALTHY
 Total size:    115045564 B
 Total dirs:    0
 Total files:   1
... (truncated)
}}}

== job ==

 * Communicates with Map/Reduce jobs
 * Before trying these commands, make sure a MapReduce job has already been run
 * Job ids can be found on the JobTracker:50030 web page

=== -status ===

 * Show the status of a job:
{{{
$ bin/hadoop job -status job_200904021140_0001
}}}

=== -kill ===

 * Kill the running job whose id is job_200904021140_0001:
{{{
$ bin/hadoop job -kill job_200904021140_0001
}}}

=== -list ===

 * Print the status of all jobs:
{{{
$ bin/hadoop job -list all
5 jobs submitted
States are:
	Running : 1	Succeded : 2	Failed : 3	Prep : 4
JobId	State	StartTime	UserName
job_200904021140_0001	2	1238652150499	waue
job_200904021140_0002	3	1238657754096	waue
job_200904021140_0004	3	1238657989495	waue
job_200904021140_0005	2	1238658076347	waue
job_200904021140_0006	2	1238658644666	waue
}}}

=== -history ===

 * Print the history of a job:
{{{
$ bin/hadoop job -history /user/waue/stream-output1
Hadoop job: job_200904021140_0005
=====================================
Job tracker host name: gm1.nchc.org.tw
job tracker start time: Thu Apr 02 11:40:06 CST 2009
User: waue
JobName: streamjob9019.jar
JobConf: hdfs://gm1.nchc.org.tw:9000/tmp/hadoop/hadoop-waue/mapred/system/job_200904021140_0005/job.xml
Submitted At: 2-Apr-2009 15:41:16
Launched At: 2-Apr-2009 15:41:16 (0sec)
Finished At: 2-Apr-2009 15:42:04 (48sec)
Status: SUCCESS
=====================================
... (truncated)
}}}

== version ==

 * Print the current Hadoop version:
{{{
$ bin/hadoop version
Hadoop 0.18.3
Subversion https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.18 -r 736250
Compiled by ndaley on Thu Jan 22 23:12:08 UTC 2009
}}}

= bin/hadoop: administrator commands =

== balancer ==

 * Rebalances HDFS so that data is spread evenly across the datanodes:
{{{
$ bin/hadoop balancer
}}}

== daemonlog ==

 * Not quite sure how this is used??
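Judging from the "Connecting to http://…/logLevel?…" line in the example below, daemonlog seems to be nothing more than a thin HTTP client for the /logLevel page that each daemon's embedded web server exposes. A minimal sketch of the URLs involved (the host, port, and logger name are placeholders from this cluster; the script only prints the URLs the command would request and contacts nothing):

```shell
# -getlevel host:port name       ->  GET http://host:port/logLevel?log=<name>
# -setlevel host:port name level ->  same URL plus &level=<level>
# Placeholder host and logger name; nothing is actually fetched here.
host="gm1.nchc.org.tw:50070"
log="org.apache.hadoop.dfs.NameNode"
echo "http://${host}/logLevel?log=${log}"
echo "http://${host}/logLevel?log=${log}&level=DEBUG"
```

If this reading is right, it would also explain why the example below can query dfshealth.jsp: any name known to the logging framework can be passed, not just daemon class names.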
{{{
$ bin/hadoop daemonlog -getlevel gm1.nchc.org.tw:50070 dfshealth.jsp
Connecting to http://gm1.nchc.org.tw:50070/logLevel?log=dfshealth.jsp
Submitted Log Name: dfshealth.jsp
Log Class: org.apache.commons.logging.impl.Log4JLogger
Effective level: INFO
}}}

== dfsadmin ==

=== -report ===

 * Print a status report:
{{{
$ bin/hadoop dfsadmin -report
}}}

=== -safemode ===

|| enter || leave || get || wait ||
{{{
$ bin/hadoop dfsadmin -safemode enter
$ bin/hadoop dfsadmin -safemode leave
}}}

=== -refreshNodes ===

 * Re-read the namenode's list of datanodes that may join or must leave the cluster:
{{{
$ bin/hadoop dfsadmin -refreshNodes
}}}

=== -finalizeUpgrade ===

 * Finalize a previous upgrade

=== -upgradeProgress ===

|| status || details || force ||
|| Current upgrade status || Details of the status || Force the upgrade to proceed ||

=== -setQuota ===

 * A directory quota is a hard limit on the number of names in the tree rooted at that directory
 * The number given when setting a quota is a name count, not a size (e.g. uploading one 2-block file succeeds, but uploading two very small files fails)
 * A quota of 1 forces a directory to stay empty
 * Renaming a directory does not change its quota
{{{
$ bin/hadoop dfs -mkdir quota
$ bin/hadoop dfsadmin -setQuota 2 quota
$ bin/hadoop dfs -put ../conf/hadoop-env.sh quota/
$ bin/hadoop dfs -put ../conf/hadoop-site.xml quota/
put: org.apache.hadoop.dfs.QuotaExceededException: The quota of /user/waue/quota is exceeded: quota=2 count=3
}}}
 * Check a directory's quota with "bin/hadoop fs -count -q <dir>":
{{{
$ bin/hadoop fs -count -q own
        none             inf            1            0                  0 hdfs://gm1.nchc.org.tw:9000/user/waue/own
$ bin/hadoop dfsadmin -setQuota 4 own
$ bin/hadoop fs -count -q own
           4               3            1            0                  0 hdfs://gm1.nchc.org.tw:9000/user/waue/own
}}}

=== -clrQuota ===

 * Clear a previously set quota:
{{{
$ bin/hadoop dfsadmin -clrQuota quota/
}}}

== namenode ==

 * Usage: hadoop namenode [-format] | [-upgrade] | [-rollback] | [-finalize] | [-importCheckpoint]

== secondarynamenode ==

 * Usage: hadoop secondarynamenode [-checkpoint [force]] | [-geteditsize]

== tasktracker, datanode ==

 * Do not run bin/hadoop tasktracker directly; on a node started this way the tasktracker:50060 page shows an error message
 * Use bin/hadoop-daemon.sh [--config confdir] "start|stop" "datanode|tasktracker" instead to add or stop a node

= Using the Hadoop Streaming library =

 * Hadoop Streaming is a Hadoop utility that helps users create and run a special class of map/reduce jobs, in which executables or scripts act as the mapper or the reducer
 * The simplest streaming map/reduce job run from the shell:
{{{
$ bin/hadoop jar hadoop-0.18.3-streaming.jar -input input -output stream-output1 -mapper /bin/cat -reducer /usr/bin/wc
}}}
 * The output (lines, words, characters):
{{{
#!sh
2910628 24507806 143451003
}}}

= HDFS permissions and users =

 * HDFS permissions distinguish owner, group, and other
 * A user's identity is taken from the user on the client (as reported by whoami), and the group list as by "bash -c groups"
 * Related operations:
{{{
$ bin/hadoop dfs -mkdir own
$ bin/hadoop dfs -chmod -R 755 own
$ bin/hadoop dfs -chgrp -R waue own
$ bin/hadoop dfs -chown -R waue own
$ bin/hadoop dfs -lsr own
}}}
 * Parameters available in conf/hadoop-site.xml:
{{{
#!php
dfs.permissions = true
dfs.web.ugi = webuser,webgroup
dfs.permissions.supergroup = supergroup
dfs.upgrade.permission = 777
dfs.umask = 022
}}}
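The dfs.umask entry above behaves like a POSIX umask: a newly created object's mode is the default mode with the umask bits cleared. A quick octal sanity check in plain shell arithmetic (nothing Hadoop-specific is assumed) of what 022 yields:

```shell
# mode = default & ~umask; with umask 022:
#   directories: 777 & ~022 -> 755
#   files:       666 & ~022 -> 644
dir_mode=$(printf '%o' $(( 0777 & ~0022 )))
file_mode=$(printf '%o' $(( 0666 & ~0022 )))
echo "dirs=$dir_mode files=$file_mode"   # dirs=755 files=644
```

This is why 755 directories and 644 files are the modes one would typically expect in a dfs -lsr listing under these settings.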
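Looping back to the Hadoop Streaming example earlier: since streaming simply feeds input records to the mapper on stdin, sorts the mapper's output (the shuffle), and pipes the result into the reducer, the cat/wc job can be imitated locally without a cluster. A tiny dry run with made-up input (the /tmp path and sample lines are arbitrary):

```shell
# Local imitation of: -mapper /bin/cat -reducer /usr/bin/wc
# streaming roughly does: input -> mapper -> sort (the shuffle) -> reducer
printf 'hello world\nfoo bar baz\n' > /tmp/stream-demo.txt
/bin/cat /tmp/stream-demo.txt | sort | /usr/bin/wc   # 2 lines, 5 words, 24 characters
```

The three numbers correspond to the lines/words/characters triple shown in the real job's output above.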