= Deploy Hadoop to PC Classroom using DRBL =

 * Java is required for Hadoop, so you need to install a Java runtime or JDK first.
{{{
~$ echo "deb http://free.nchc.org.tw/debian/ etch non-free" > /tmp/etch-non-free.list
~$ sudo mv /tmp/etch-non-free.list /etc/apt/sources.list.d/.
~$ sudo apt-get update
~$ sudo apt-get install sun-java5-jdk
}}}
 * Download Hadoop 0.18.2.
{{{
~$ wget http://ftp.twaren.net/Unix/Web/apache/hadoop/core/hadoop-0.18.2/hadoop-0.18.2.tar.gz
~$ tar zxvf hadoop-0.18.2.tar.gz
}}}
 * Set up the JAVA_HOME environment variable.
{{{
~$ echo "export JAVA_HOME=/usr/lib/jvm/java-1.5.0-sun" >> ~/.bash_profile
~$ source ~/.bash_profile
}}}
 * Edit hadoop-0.18.2/conf/hadoop-env.sh.
{{{
#!diff
--- hadoop-0.18.2/conf/hadoop-env.sh.org 2008-11-06 22:57:40.000000000 +0800
+++ hadoop-0.18.2/conf/hadoop-env.sh 2008-11-06 22:58:42.000000000 +0800
@@ -6,7 +6,9 @@
 # remote nodes.
 
 # The java implementation to use. Required.
-# export JAVA_HOME=/usr/lib/j2sdk1.5-sun
+export JAVA_HOME=/usr/lib/jvm/java-1.5.0-sun
+export HADOOP_HOME=/home/jazz/hadoop-0.18.2
+export HADOOP_CONF_DIR=$HADOOP_HOME/conf
 
 # Extra Java CLASSPATH elements. Optional.
 # export HADOOP_CLASSPATH=
}}}
 * Here is the current DRBL setup.
{{{
          NIC    NIC IP                    Clients
+------------------------------+
|         DRBL SERVER          |
|                              |
|    +-- [eth0] X.X.X.X        +- to WAN
|                              |
|    +-- [eth1] 192.168.61.254 +- to clients group 1 [ 10 clients, their IP
|                              |            from 192.168.61.1 - 192.168.61.10]
|    +-- [eth2] 192.168.62.254 +- to clients group 2 [ 11 clients, their IP
|                              |            from 192.168.62.1 - 192.168.62.11]
|    +-- [eth3] 192.168.63.254 +- to clients group 3 [ 10 clients, their IP
|                              |            from 192.168.63.1 - 192.168.63.10]
|    +-- [eth4] 192.168.64.254 +- to clients group 4 [ 10 clients, their IP
|                              |            from 192.168.64.1 - 192.168.64.10]
+------------------------------+
}}}
 * Hadoop uses SSH connections between its nodes, so we have to exchange SSH keys first.
{{{
~$ ssh-keygen
~$ cp .ssh/id_rsa.pub .ssh/authorized_keys
~$ sudo apt-get install dsh
~$ mkdir -p .dsh
~$ nmap -v -sP 192.168.61-63.1-11 | grep '(.*) .* up' | awk '{ print $3 }' | sort -n | sed 's#(##' | sed 's#)##' > .dsh/machines.list
"192.168.63.$i" >> .dsh/machines.list; echo "192.168.64.$i" >> .dsh/machines.list; done
}}}
 * edit