Version 25 (modified by waue, 16 years ago)
Running Hadoop on a DRBL Cluster
Hadoop Cluster Based on DRBL
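- The goal of this article is to use DRBL to build a cluster and run Hadoop on it.
- Because DRBL is a diskless system rather than an ordinary cluster, a few points need special attention.
0. Environment
The environment has seven machines. One is the drbl server, which is also the hadoop namenode; the other nodes are drbl clients and hadoop datanodes, as follows: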
Name | IP | DRBL role | Hadoop role |
hadoop | 192.168.1.254 | drbl server | namenode |
hadoop102 | 192.168.1.2 | drbl client | datanode |
hadoop103 | 192.168.1.3 | drbl client | datanode |
hadoop104 | 192.168.1.4 | drbl client | datanode |
hadoop105 | 192.168.1.5 | drbl client | datanode |
hadoop106 | 192.168.1.6 | drbl client | datanode |
hadoop107 | 192.168.1.7 | drbl client | datanode |
The drbl server environment is as follows:
debian | etch (4.0) | server - 64 bit |
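DRBL is a diskless system, so once the operating system and required services are installed on the drbl server, every client that boots over the network loads a file system derived from the server's. In other words, only the contents of certain directories (such as /etc, /root, /home, /tmp, /var ...) differ from client to client; everything else is identical. For example, if you change /etc/hosts on the server, the other clients pick up the change automatically and immediately (because it is mounted over NFS).
Therefore, first complete 1. Installation and 2. Configuration on the drbl server, then boot the clients and follow 3. Operation.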
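To see which directories a client actually shares with the server, one can print the file system type behind each path. This is a minimal sketch that assumes GNU `stat` is available; the function name `report_fs` is made up for illustration, and the exact types reported will depend on how DRBL was set up.

```shell
# Print "<path> <filesystem type>" for each path given.
# Relies on GNU stat: -f queries the file system, %T is its type name.
report_fs() {
  local path
  for path in "$@"; do
    echo "$path $(stat -f -c %T "$path")"
  done
}
```

Running `report_fs /opt /usr /etc /root /tmp` on a client should show `nfs` for paths that are served by the drbl server.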
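1. Installation
1.1 Install drbl
- See the DRBL installation guide
1.2 Install Java 6
- Add the non-free component and the backports URL to the package sources in /etc/apt/sources.list; sun-java6 cannot be installed without them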
deb http://free.nchc.org.tw/debian/ etch main contrib non-free
deb-src http://free.nchc.org.tw/debian/ etch main contrib non-free
deb http://security.debian.org/ etch/updates main contrib non-free
deb-src http://security.debian.org/ etch/updates main contrib non-free
deb http://www.backports.org/debian etch-backports main non-free
deb http://free.nchc.org.tw/drbl-core drbl stable
- Install the archive key and Java 6
$ wget http://www.backports.org/debian/archive.key
$ sudo apt-key add archive.key
$ apt-get update
$ apt-get install sun-java6-bin sun-java6-jdk sun-java6-jre
1.3 Install Hadoop 0.18.3
$ cd /opt
$ wget http://ftp.twaren.net/Unix/Web/apache/hadoop/core/hadoop-0.18.3/hadoop-0.18.3.tar.gz
$ tar zxvf hadoop-0.18.3.tar.gz
hadoop:/opt# ln -sf hadoop-0.18.3 hadoop
2. Configure Hadoop
- Append the following to the end of /etc/bash.bashrc
PATH=$PATH:/opt/drbl/bin:/opt/drbl/sbin
export JAVA_HOME=/usr/lib/jvm/java-6-sun
export HADOOP_HOME=/opt/hadoop/
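After opening a new shell, a quick check like the following can confirm that both variables point at something usable. It is only a sketch; `check_hadoop_env` is a hypothetical helper name, not part of Hadoop or DRBL.

```shell
# Report whether JAVA_HOME and HADOOP_HOME look usable.
# Returns 0 when both checks pass, 1 otherwise.
check_hadoop_env() {
  local rc=0
  if [ ! -x "${JAVA_HOME:-}/bin/java" ]; then
    echo "JAVA_HOME has no bin/java: ${JAVA_HOME:-unset}"
    rc=1
  fi
  if [ ! -d "${HADOOP_HOME:-}" ]; then
    echo "HADOOP_HOME is not a directory: ${HADOOP_HOME:-unset}"
    rc=1
  fi
  return "$rc"
}
```

Calling `check_hadoop_env && echo ok` prints `ok` only when `$JAVA_HOME/bin/java` is executable and `$HADOOP_HOME` exists.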
- Edit /etc/hosts and paste the following at the end
192.168.1.254 gm2.nchc.org.tw
192.168.1.1 hadoop101
192.168.1.10 hadoop110
192.168.1.11 hadoop111
192.168.1.2 hadoop102
192.168.1.3 hadoop103
192.168.1.4 hadoop104
192.168.1.5 hadoop105
192.168.1.6 hadoop106
192.168.1.7 hadoop107
192.168.1.8 hadoop108
192.168.1.9 hadoop109
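The client entries follow a regular pattern (192.168.1.N maps to hadoop followed by 100+N), so they can be generated rather than typed. The sketch below assumes that pattern holds for every client; `gen_hosts` is a hypothetical helper name.

```shell
# Print /etc/hosts lines for clients 192.168.1.<first> .. 192.168.1.<last>,
# naming each one hadoop<100+N> as in the list above.
gen_hosts() {
  local i="$1"
  local last="$2"
  while [ "$i" -le "$last" ]; do
    printf '192.168.1.%d hadoop%d\n' "$i" "$((100 + i))"
    i=$((i + 1))
  done
}
```

For example, `gen_hosts 1 11 >> /etc/hosts` would append all eleven client entries; the gm2.nchc.org.tw line for the server still has to be added by hand.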
- Edit /opt/hadoop-0.18.3/conf/hadoop-env.sh
hadoop-0.18.3/conf/hadoop-env.sh (lines marked - are removed, lines marked + are added)
 # remote nodes.
 # The java implementation to use. Required.
-# export JAVA_HOME=/usr/lib/j2sdk1.5-sun
+export JAVA_HOME=/usr/lib/jvm/java-6-sun
+export HADOOP_HOME=/opt/hadoop-0.18.3
+export HADOOP_CONF_DIR=$HADOOP_HOME/conf
+export HADOOP_LOG_DIR=/root/hadoop/logs
 # Extra Java CLASSPATH elements. Optional.
 # export HADOOP_CLASSPATH=
- Edit /opt/hadoop-0.18.3/conf/hadoop-site.xml
hadoop-0.18.3/conf/hadoop-site.xml (lines marked + are added)
 <!-- Put site-specific property overrides in this file. -->
 <configuration>
+<property>
+  <name>fs.default.name</name>
+  <value>hdfs://gm2.nchc.org.tw:9000/</value>
+  <description>
+    The name of the default file system. Either the literal string
+    "local" or a host:port for NDFS.
+  </description>
+</property>
+<property>
+  <name>mapred.job.tracker</name>
+  <value>hdfs://gm2.nchc.org.tw:9001</value>
+  <description>
+    The host and port that the MapReduce job tracker runs at. If
+    "local", then jobs are run in-process as a single map and
+    reduce task.
+  </description>
+</property>
 </configuration>
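Before starting the daemons it can be useful to read a property back out of hadoop-site.xml and confirm it holds the intended value. The following is a rough sketch using awk, not a real XML parser: it assumes each `<name>` and `<value>` element sits on its own line, as in the file above. `get_prop` is a hypothetical helper name.

```shell
# Print the <value> of the property whose <name> equals $2 in file $1.
# Assumes <name>...</name> and <value>...</value> each fit on one line
# and that the name precedes its value.
get_prop() {
  awk -v want="$2" '
    /<name>/  { n = $0; sub(/.*<name>/, "", n);  sub(/<\/name>.*/, "", n) }
    /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                if (n == want) print v }
  ' "$1"
}
```

With the configuration shown above, `get_prop /opt/hadoop/conf/hadoop-site.xml fs.default.name` should print `hdfs://gm2.nchc.org.tw:9000/`.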
- Edit /opt/hadoop/conf/slaves
hadoop102
hadoop103
hadoop104
hadoop105
hadoop106
hadoop107
hadoop
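Every name in the slaves file has to resolve on every node, so it may be worth cross-checking it against /etc/hosts. The sketch below only compares the two text files and does not query DNS; `missing_hosts` is a hypothetical helper name.

```shell
# Print each name from the slaves file ($1) that does not appear as a
# host name (any field after the IP) on a non-comment line of the hosts file ($2).
missing_hosts() {
  local name
  while IFS= read -r name || [ -n "$name" ]; do
    if [ -z "$name" ]; then
      continue
    fi
    awk -v h="$name" '
      !/^[ \t]*#/ { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
      END { exit !found }
    ' "$2" || echo "$name"
  done < "$1"
}
```

`missing_hosts /opt/hadoop/conf/slaves /etc/hosts` prints nothing when every slave is listed. Note that the last entry, `hadoop`, is not among the lines pasted into /etc/hosts in this section, so it needs to be defined elsewhere in that file.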
3. Operation
3.1 Boot the DRBL clients
- Power on all the clients; the network layout is as follows
******************************************************
          NIC    NIC IP                    Clients
+------------------------------+
|         DRBL SERVER          |
|                              |
|    +-- [eth2] 140.110.X.X    +- to WAN
|                              |
|    +-- [eth1] 192.168.1.254  +- to clients group 1 [ 6 clients, their IP
|                              |            from 192.168.1.2 - 192.168.1.7]
+------------------------------+
******************************************************
Total clients: 6
******************************************************
3.2 Configure ssh
- Edit /etc/ssh/ssh_config and add
StrictHostKeyChecking no
- Run
$ ssh-keygen -t rsa -b 1024 -N "" -f ~/.ssh/id_rsa
$ cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
$ /etc/init.d/ssh restart
- Write an automation script, auto.shell, and run it
#!/bin/bash
for ((i=2;i<=7;i++)); do
  scp -r ~/.ssh/ "192.168.1.$i":~/
  scp /etc/ssh/ssh_config "192.168.1.$i":/etc/ssh/ssh_config
  ssh "192.168.1.$i" /etc/init.d/ssh restart
done
- If everything went well, you can now log in to every client without a password
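One way to confirm passwordless login on all clients at once is to run a trivial command over ssh with `BatchMode=yes`, which makes ssh fail instead of prompting for a password. This is a sketch only: `check_ssh` is a hypothetical helper name, and the `SSH_CMD` variable exists purely so the ssh binary can be swapped out for testing.

```shell
# Try a non-interactive ssh login to each host and report the result.
# Returns 1 if any host could not be reached without a password.
check_ssh() {
  local ssh_cmd="${SSH_CMD:-ssh}"
  local host
  local failed=0
  for host in "$@"; do
    # BatchMode=yes disables password prompts; ConnectTimeout bounds the wait.
    if $ssh_cmd -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
      echo "$host ok"
    else
      echo "$host FAILED"
      failed=1
    fi
  done
  return "$failed"
}
```

For the six clients here, the call would be `check_ssh 192.168.1.2 192.168.1.3 192.168.1.4 192.168.1.5 192.168.1.6 192.168.1.7`.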
3.2.1 dsh
$ sudo apt-get install dsh
$ mkdir -p .dsh
$ for ((i=2;i<=7;i++)); do echo "192.168.1.$i" >> .dsh/machines.list; done
Then run
$ dsh -a scp hadoop:/etc/hosts /etc/
$ dsh -a source /etc/bash.bashrc
3.3 Start Hadoop
- Format the namenode and start all daemons
$ cd /opt/hadoop
$ bin/hadoop namenode -format
$ bin/start-all.sh
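After start-all.sh returns, the daemons may still need a moment to come up. A simple probe of the ports used in this article (9000 and 9001 from hadoop-site.xml, 50030 and 50070 for the web interfaces) can show whether they are listening. The sketch assumes `bash` and the coreutils `timeout` command are installed; `port_open` and `report_ports` are hypothetical helper names.

```shell
# Succeed if a TCP connection to host $1, port $2 can be opened within 3 seconds.
# Uses bash's /dev/tcp pseudo-device, so bash is invoked explicitly.
port_open() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Print "<host>:<port> open" or "<host>:<port> closed" for each port given.
report_ports() {
  local host="$1"
  local port
  shift
  for port in "$@"; do
    if port_open "$host" "$port"; then
      echo "$host:$port open"
    else
      echo "$host:$port closed"
    fi
  done
}
```

On the server, `report_ports gm2.nchc.org.tw 9000 9001 50030 50070` should list all four ports as open once the namenode and job tracker are running.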
3.4 Hadoop test example
- Run WordCount as a test
$ mkdir input
$ cp *.txt input/
$ bin/hadoop dfs -put input input
$ bin/hadoop jar hadoop-*-examples.jar wordcount input ouput
- Output of the run:
hadoop:/opt/hadoop# bin/hadoop jar hadoop-*-examples.jar wordcount input ouput
09/02/26 06:16:34 INFO mapred.FileInputFormat: Total input paths to process : 4
09/02/26 06:16:34 INFO mapred.FileInputFormat: Total input paths to process : 4
09/02/26 06:16:35 INFO mapred.JobClient: Running job: job_200902260615_0001
09/02/26 06:16:36 INFO mapred.JobClient: map 0% reduce 0%
09/02/26 06:16:39 INFO mapred.JobClient: map 80% reduce 0%
09/02/26 06:16:40 INFO mapred.JobClient: map 100% reduce 0%
09/02/26 06:16:50 INFO mapred.JobClient: Job complete: job_200902260615_0001
09/02/26 06:16:50 INFO mapred.JobClient: Counters: 16
09/02/26 06:16:50 INFO mapred.JobClient: File Systems
09/02/26 06:16:50 INFO mapred.JobClient: HDFS bytes read=267854
09/02/26 06:16:50 INFO mapred.JobClient: HDFS bytes written=100895
09/02/26 06:16:50 INFO mapred.JobClient: Local bytes read=133897
09/02/26 06:16:50 INFO mapred.JobClient: Local bytes written=292260
09/02/26 06:16:50 INFO mapred.JobClient: Job Counters
09/02/26 06:16:50 INFO mapred.JobClient: Launched reduce tasks=1
09/02/26 06:16:50 INFO mapred.JobClient: Rack-local map tasks=5
09/02/26 06:16:50 INFO mapred.JobClient: Launched map tasks=5
09/02/26 06:16:50 INFO mapred.JobClient: Map-Reduce Framework
09/02/26 06:16:50 INFO mapred.JobClient: Reduce input groups=8123
09/02/26 06:16:50 INFO mapred.JobClient: Combine output records=17996
09/02/26 06:16:50 INFO mapred.JobClient: Map input records=6515
09/02/26 06:16:50 INFO mapred.JobClient: Reduce output records=8123
09/02/26 06:16:50 INFO mapred.JobClient: Map output bytes=385233
09/02/26 06:16:50 INFO mapred.JobClient: Map input bytes=265370
09/02/26 06:16:50 INFO mapred.JobClient: Combine input records=44786
09/02/26 06:16:50 INFO mapred.JobClient: Map output records=34913
09/02/26 06:16:50 INFO mapred.JobClient: Reduce input records=8123
hadoop:/opt/hadoop#
- http://gm2.nchc.org.tw:50030/
- If the page reports 7 nodes, every node has joined the cluster
- http://gm2.nchc.org.tw:50075/browseDirectory.jsp?dir=%2Fuser%2Froot&namenodeInfoPort=50070
- The job output can be browsed here
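The counters printed at the end of the WordCount run can be pulled out of a saved log for a quick sanity check. The sketch below assumes the log lines look like the JobClient output shown in this section (one `name=value` counter per line); `job_counters` and `counter_value` are hypothetical helper names.

```shell
# Read a job log on stdin and print every "counter name=value" pair.
job_counters() {
  sed -n 's/.*mapred\.JobClient: *\(.*=[0-9][0-9]*\)$/\1/p'
}

# Read a job log on stdin and print the value of the counter named $1.
counter_value() {
  job_counters | awk -F= -v n="$1" '$1 == n { print $2 }'
}
```

For instance, after saving the run with `bin/hadoop jar hadoop-*-examples.jar wordcount input ouput 2>&1 | tee job.log`, the command `counter_value "Reduce output records" < job.log` prints the number of distinct words. In the run above that counter and "Reduce input groups" are both 8123, which is what WordCount should produce since it emits one record per distinct word.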
3.5 Stop hadoop
$ bin/stop-all.sh
3.6 Rebuild hadoop from scratch
$ bin/stop-all.sh
$ dsh -a rm -rf /root/hadoop/* /tmp/hadoop-root*
$ bin/hadoop namenode -format
$ bin/start-all.sh
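4. User Accounts
4.1 Single account
- Add a hadoop account, huser, that can read, write, and browse files inside its own directory on hdfs
- Create the account huser on the drbl system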
<root>$ /opt/drbl/sbin/drbl-useradd -s huser huser
- As the hdfs superuser (root in this article), create the directory on hdfs
<root>$ /opt/hadoop/bin/hadoop dfs -mkdir /user/huser
- As the superuser, set the owner and permissions of that directory on hdfs
<root>$ /opt/hadoop/bin/hadoop dfs -chown -R huser /user/huser
<root>$ /opt/hadoop/bin/hadoop dfs -chmod -R 775 /user/huser
- Test: browse and write files as huser
<root>$ su - huser
<huser>$ cd /opt/hadoop/
<huser>$ /opt/hadoop/bin/hadoop dfs -put input /user/huser/input
<huser>$ /opt/hadoop/bin/hadoop dfs -ls /user/huser/input
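When several accounts need the same treatment, the three hdfs commands above can be generated per user. The sketch below only prints the commands so they can be reviewed first; `hdfs_user_cmds` is a hypothetical helper name and `HADOOP_BIN` is an illustrative override for the path to the hadoop launcher.

```shell
# Print the hdfs commands that create /user/<name> and hand it to that user.
# Nothing is executed here; pipe the output to sh (as the hdfs superuser) to run it.
hdfs_user_cmds() {
  local user="$1"
  local hadoop="${HADOOP_BIN:-/opt/hadoop/bin/hadoop}"
  echo "$hadoop dfs -mkdir /user/$user"
  echo "$hadoop dfs -chown -R $user /user/$user"
  echo "$hadoop dfs -chmod -R 775 /user/$user"
}
```

Running `hdfs_user_cmds rock` shows what would be done for the user rock, and `hdfs_user_cmds rock | sh` applies it. The drbl account itself still has to be created separately with drbl-useradd.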
4.2 Multiple accounts
- Two users, rock and waue, ran the following job at the same time without any problems
bin/hadoop jar hadoop-*-examples.jar wordcount input/ ouput/
Result shown on the web page:
Completed Jobs
Jobid | User | Name | Map % Complete | Map Total | Maps Completed | Reduce % Complete | Reduce Total | Reduces Completed |
job_200903061742_0001 | waue | wordcount | 100.00% | 1 | 1 | 100.00% | 1 | 1 |
job_200903061742_0002 | rock | wordcount | 100.00% | 1 | 1 | 100.00% | 1 | 1 |
job_200903061742_0003 | waue | wordcount | 100.00% | 1 | 1 | 100.00% | 1 | 1 |
Failed Jobs
5. References
6. Troubleshooting
- The drbl installation does not go smoothly
Running drblsrv -i produces the following error message
Kernel 2.6 was found, so default to use initramfs.
The requested kernel "" 2.6.18-6-amd64 kernel files are NOT found in /tftpboot/node_root/lib/modules/s and /tftpboot/node_root/boot in the server!
The necessary modules in the network initrd can NOT be created!
Client will NOT remote boot correctly!
Program terminated!
Done!
Cause: the apt mirror had not finished syncing its data, so the new kernel could not be installed, which led to this error.
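To check whether the kernel files the error complains about are actually present, one can look for them directly. The sketch below assumes the usual Debian layout (a `lib/modules/<version>` directory and a `boot/vmlinuz-<version>` image) under the client root; that layout is an assumption here and is not spelled out in the error message. `check_kernel_files` is a hypothetical helper name.

```shell
# Check that the module directory and kernel image for version $2
# exist under the root directory $1. Returns 1 if either is missing.
check_kernel_files() {
  local root="$1"
  local kver="$2"
  local path
  local rc=0
  for path in "$root/lib/modules/$kver" "$root/boot/vmlinuz-$kver"; do
    if [ -e "$path" ]; then
      echo "found   $path"
    else
      echo "MISSING $path"
      rc=1
    fi
  done
  return "$rc"
}
```

For the case above the call would be `check_kernel_files /tftpboot/node_root 2.6.18-6-amd64`. If either path is reported missing, switching to a mirror that is fully synced and rerunning drblsrv -i is the likely fix.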