[[PageOutline]]

= Hadoop Cluster Based on DRBL =

== DRBL server environment ==

|| Debian || etch (4.0) || server, 64-bit ||

 * Install DRBL.
 * Install Java 6. Add the non-free component and the backports repository to /etc/apt/sources.list, otherwise sun-java6 cannot be installed:
{{{
deb http://opensource.nchc.org.tw/debian/ etch main contrib non-free
deb-src http://opensource.nchc.org.tw/debian/ etch main contrib non-free
deb http://security.debian.org/ etch/updates main contrib non-free
deb-src http://security.debian.org/ etch/updates main contrib non-free
deb http://www.backports.org/debian etch-backports main non-free
deb http://free.nchc.org.tw/drbl-core drbl stable
}}}
 Install the backports archive key, then Java 6:
{{{
$ wget http://www.backports.org/debian/archive.key
$ sudo apt-key add archive.key
$ sudo apt-get update
$ sudo apt-get install sun-java6-bin sun-java6-jdk sun-java6-jre
}}}

= Hadoop Install =

 * Download Hadoop 0.18.3:
{{{
$ cd /opt
$ wget http://ftp.twaren.net/Unix/Web/apache/hadoop/core/hadoop-0.18.3/hadoop-0.18.3.tar.gz
$ tar zxvf hadoop-0.18.3.tar.gz
$ ln -sf hadoop-0.18.3 hadoop
}}}
 * Append the following to the end of ~/.bashrc:
{{{
PATH=$PATH:/opt/drbl/bin:/opt/drbl/sbin
export JAVA_HOME=/usr/lib/jvm/java-6-sun
export HADOOP_HOME=/opt/hadoop/
}}}
 then reload it:
{{{
$ source ~/.bashrc
}}}
 * Edit hadoop-0.18.3/conf/hadoop-env.sh:
{{{
#!diff
--- hadoop-0.18.3/conf/hadoop-env.sh.org
+++ hadoop-0.18.3/conf/hadoop-env.sh
@@ -6,7 +6,9 @@
 # remote nodes.
 
 # The java implementation to use.  Required.
-# export JAVA_HOME=/usr/lib/j2sdk1.5-sun
+export JAVA_HOME=/usr/lib/jvm/java-6-sun
+export HADOOP_HOME=/opt/hadoop-0.18.3
+export HADOOP_CONF_DIR=$HADOOP_HOME/conf
+export HADOOP_LOG_DIR=/root/hadoop/logs
 
 # Extra Java CLASSPATH elements.  Optional.
 # export HADOOP_CLASSPATH=
}}}
 * Edit hadoop-0.18.3/conf/hadoop-site.xml:
{{{
#!diff
--- hadoop-0.18.3/conf/hadoop-site.xml.org
+++ hadoop-0.18.3/conf/hadoop-site.xml
@@ -4,5 +4,31 @@
 <configuration>
-
+  <property>
+    <name>fs.default.name</name>
+    <value>hdfs://192.168.1.254:9000/</value>
+    <description>
+      The name of the default file system.  Either the literal string
+      "local" or a host:port for NDFS.
+    </description>
+  </property>
+  <property>
+    <name>mapred.job.tracker</name>
+    <value>hdfs://192.168.1.254:9001</value>
+    <description>
+      The host and port that the MapReduce job tracker runs at.  If
+      "local", then jobs are run in-process as a single map and
+      reduce task.
+    </description>
+  </property>
 </configuration>
}}}

= DRBL setup =

== Environment ==

{{{
******************************************************
         NIC    NIC IP                    Clients
+------------------------------+
|         DRBL SERVER          |
|                              |
|    +-- [eth2] 140.110.xxx.130|   +- to WAN
|                              |
|    +-- [eth1] 192.168.1.254  +-- to clients group 1 [16 clients, their IP
|                              |      from 192.168.1.1 - 192.168.1.16]
+------------------------------+
******************************************************
Total clients: 16
******************************************************
}}}

== ssh ==

 * Edit /etc/ssh/ssh_config:
{{{
StrictHostKeyChecking no
}}}
 * Run:
{{{
$ ssh-keygen -t rsa -b 1024 -N "" -f ~/.ssh/id_rsa
$ cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
}}}
 * Write an automation script, auto.shell, and run it:
{{{
#!sh
#!/bin/bash
for ((i=1; i<=16; i++)); do
  # push the key pair and ssh client config to every client,
  # then restart sshd so the new config takes effect
  scp -r ~/.ssh/ "192.168.1.$i":~/
  scp /etc/ssh/ssh_config "192.168.1.$i":/etc/ssh/ssh_config
  ssh "192.168.1.$i" /etc/init.d/ssh restart
done
}}}
 * If everything went well, you can now log in to every client without a password.

=== dsh ===

{{{
$ sudo apt-get install dsh
$ mkdir -p ~/.dsh
$ for ((i=1; i<=16; i++)); do echo "192.168.1.$i" >> ~/.dsh/machines.list; done
}}}
 With that in place, `dsh -a <command>` runs `<command>` on every client in machines.list (e.g. `dsh -a uptime`).

== DRBL Server as Hadoop namenode ==

 * Edit /etc/rc.local so the DRBL server starts as the Hadoop namenode at boot:
{{{
#!diff
--- /etc/rc.local.org	2008-11-07 18:09:10.000000000 +0800
+++ /etc/rc.local	2008-11-07 17:58:14.000000000 +0800
@@ -11,4 +11,7 @@
 #
 # By default this script does nothing.
+echo 3 > /proc/sys/vm/drop_caches
+/opt/hadoop-0.18.3/bin/hadoop namenode -format
+/opt/hadoop-0.18.3/bin/hadoop-daemon.sh start namenode
+/opt/hadoop-0.18.3/bin/hadoop-daemon.sh start jobtracker
+/opt/hadoop-0.18.3/bin/hadoop-daemon.sh start tasktracker
 exit 0
}}}
 * Create an init script, hadoop_datanode, so each DRBL client runs as a datanode:
{{{
$ cat > hadoop_datanode << EOF
}}}
{{{
#! /bin/sh
set -e
# /etc/init.d/hadoop_datanode: start and stop the Hadoop DFS datanode on a DRBL client

export PATH="\${PATH:+\$PATH:}/usr/sbin:/sbin"

case "\$1" in
  start)
	echo -n "starting datanode:"
	/opt/hadoop-0.18.3/bin/hadoop-daemon.sh start datanode
	echo "[OK]"
	;;
  stop)
	echo -n "stopping datanode:"
	/opt/hadoop-0.18.3/bin/hadoop-daemon.sh stop datanode
	echo "[OK]"
	;;
  *)
	echo "Usage: /etc/init.d/hadoop_datanode {start|stop}"
	exit 1
esac
exit 0
EOF
}}}
{{{
$ chmod a+x hadoop_datanode
$ sudo /opt/drbl/sbin/drbl-cp-host hadoop_datanode /etc/init.d/
$ sudo /opt/drbl/bin/drbl-doit update-rc.d hadoop_datanode defaults 99
}}}
 * Shut down the DRBL clients.
 * Reboot the DRBL server.
 * Use "Wake on LAN" to boot the DRBL clients.
 * Browse http://192.168.1.254:50070 for DFS status.
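Once the datanodes have registered with the namenode, a quick end-to-end check is to run the wordcount example that ships with Hadoop 0.18.3. This is only a minimal smoke test run on the DRBL server; the HDFS directory names `input` and `output` are arbitrary choices, not something configured above:
{{{
$ cd /opt/hadoop
# put a small text file into HDFS as job input
$ bin/hadoop dfs -mkdir input
$ bin/hadoop dfs -put conf/hadoop-env.sh input/
# run the bundled wordcount example
$ bin/hadoop jar hadoop-0.18.3-examples.jar wordcount input output
# read back the result
$ bin/hadoop dfs -cat output/part-00000
}}}
Job progress can also be watched in the JobTracker web UI at http://192.168.1.254:50030.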
== References ==

 * [http://trac.nchc.org.tw/grid/wiki/jazz/DRBL_Hadoop Jazz: DRBL_Hadoop]
 * [http://trac.nchc.org.tw/cloud/wiki/MR_manual Hadoop manual]

== Troubleshooting ==

 * The DRBL installation did not go smoothly; `drblsrv -i` failed with the following message:
{{{
Kernel 2.6 was found, so default to use initramfs.
The requested kernel "" 2.6.18-6-amd64 kernel files are NOT found in
/tftpboot/node_root/lib/modules/s and /tftpboot/node_root/boot in the server!
The necessary modules in the network initrd can NOT be created!
Client will NOT remote boot correctly!
Program terminated!
Done!
}}}
> I installed Debian 4.0r6 amd64 under VMware here and did not hit this problem.
> [[Image(debian_4.0r6_drbl.jpg)]]

 PS: The cause turned out to be that the apt mirror had not synced its data, so the new kernel could not be installed, which led to this error.
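If the same error appears, one way to confirm the mirror problem before re-running `drblsrv -i` is to check whether apt can actually see and install the client kernel. This is just a sketch, assuming the kernel package name matches the version in the error above (linux-image-2.6.18-6-amd64 on etch):
{{{
# does the configured mirror offer the kernel image?
$ apt-cache policy linux-image-2.6.18-6-amd64
# if a candidate version is shown, install it so drblsrv can find the
# kernel files under /boot and /lib/modules, then re-run drblsrv -i
$ sudo apt-get update
$ sudo apt-get install linux-image-2.6.18-6-amd64
}}}
If apt-cache reports no candidate, point /etc/apt/sources.list at a fully synced mirror first.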