Version 16 (modified by waue, 15 years ago)

Running Hadoop on a DRBL Cluster

Hadoop Cluster Based on DRBL

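  • The goal of this page is to use DRBL to assemble a cluster and run Hadoop on it.
  • DRBL is a diskless system rather than an ordinary cluster, so a few points need special attention.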

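0. Environment

The environment has seven machines in total. One is the DRBL server, which also acts as the Hadoop namenode; the other nodes are DRBL clients and Hadoop datanodes, as follows: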

Name    IP             DRBL role    Hadoop role
hadoop  192.168.1.254  drbl server  namenode
hadoop  192.168.1.2    drbl client  datanode
hadoop  192.168.1.3    drbl client  datanode
hadoop  192.168.1.4    drbl client  datanode
hadoop  192.168.1.5    drbl client  datanode
hadoop  192.168.1.6    drbl client  datanode
hadoop  192.168.1.7    drbl client  datanode

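The DRBL server environment is as follows:

Debian etch (4.0) server, 64-bit

DRBL is a diskless system, so once the operating system and the required services are installed on the DRBL server, every client that boots over the network loads a file system based on the server's. In other words, only the contents of certain directories (such as /etc /root /home /tmp /var ...) differ from client to client; everything else is identical. For example, if /etc/hosts is changed on the server, the clients pick up the change automatically and immediately (because it is mounted over NFS).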
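
One way to see which directories a client actually receives from the server is to list its NFS mounts. The sketch below is only an illustration, assuming a Linux client whose mount table is readable at /proc/mounts; the function name nfs_mount_points is made up for this example.

```shell
#!/bin/sh
# Print the mount points whose file system type starts with "nfs".
# Each mount table line reads: device mountpoint fstype options dump pass.
# $1 is the mount table to read; it defaults to /proc/mounts (Linux).
nfs_mount_points() {
    awk '$3 ~ /^nfs/ { print $2 }' "${1:-/proc/mounts}"
}

# Usage on a running client: nfs_mount_points
```

Directories that do not show up in the output are local to that client (for example a tmpfs) rather than served by the DRBL server.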

Therefore, first complete "1. Installation" and "2. Configuring Hadoop" on the DRBL server, then boot the clients and follow "3. DRBL Operations".

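1. Installation

Install DRBL

Install Java 6

  • Add the non-free component and the backports repository to the package sources in /etc/apt/sources.list; sun-java6 cannot be installed without them.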
    deb http://opensource.nchc.org.tw/debian/ etch main contrib non-free
    deb-src http://opensource.nchc.org.tw/debian/ etch main contrib non-free
    deb http://security.debian.org/ etch/updates main contrib non-free
    deb-src http://security.debian.org/ etch/updates main contrib non-free
    deb http://www.backports.org/debian etch-backports main non-free
    deb http://free.nchc.org.tw/drbl-core drbl stable
    
  • Install the key and Java 6
    $ wget http://www.backports.org/debian/archive.key
    $ sudo apt-key add archive.key
    $ sudo apt-get update
    $ sudo apt-get install sun-java6-bin sun-java6-jdk sun-java6-jre
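
After the packages are installed, it can be worth confirming that the JDK landed where the Hadoop configuration on this page expects it. A minimal sketch, assuming the /usr/lib/jvm/java-6-sun path that this page uses for JAVA_HOME; check_java_home is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed if the given directory looks like a usable Java home,
# i.e. it contains an executable bin/java.
check_java_home() {
    if [ -x "$1/bin/java" ]; then
        echo "java found in $1"
    else
        echo "no executable bin/java under $1" >&2
        return 1
    fi
}

# Usage: check_java_home /usr/lib/jvm/java-6-sun
```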
    

Install Hadoop 0.18.3

$ cd /opt
$ wget http://ftp.twaren.net/Unix/Web/apache/hadoop/core/hadoop-0.18.3/hadoop-0.18.3.tar.gz
$ tar zxvf hadoop-0.18.3.tar.gz
$ ln -sf hadoop-0.18.3 hadoop
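
The /opt/hadoop symlink gives later steps a version-independent path. If the link step is ever rerun, plain ln -sf follows the existing hadoop link and creates a new link inside the release directory instead of replacing it; adding -n avoids that. The sketch below assumes GNU or BSD ln (both accept -n); link_release is an illustrative name.

```shell
#!/bin/sh
# Point <base>/hadoop at the unpacked release directory <base>/<release>.
# -n keeps ln from descending into an existing "hadoop" link, so the
# function can be rerun safely.
link_release() {
    base=$1
    release=$2
    [ -d "$base/$release" ] || { echo "missing $base/$release" >&2; return 1; }
    (cd "$base" && ln -sfn "$release" hadoop)
}

# Usage: link_release /opt hadoop-0.18.3 && readlink /opt/hadoop
```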

2. Configuring Hadoop

  • Append the following to the end of /etc/bash.bashrc
    PATH=$PATH:/opt/drbl/bin:/opt/drbl/sbin
    export JAVA_HOME=/usr/lib/jvm/java-6-sun
    export HADOOP_HOME=/opt/hadoop/
    
  • Edit /etc/hosts and paste the following at the end
    192.168.1.254 gm2.nchc.org.tw
    192.168.1.1 hadoop101
    192.168.1.10 hadoop110
    192.168.1.11 hadoop111
    192.168.1.2 hadoop102
    192.168.1.3 hadoop103
    192.168.1.4 hadoop104
    192.168.1.5 hadoop105
    192.168.1.6 hadoop106
    192.168.1.7 hadoop107
    192.168.1.8 hadoop108
    192.168.1.9 hadoop109
    
  • Edit /opt/hadoop-0.18.3/conf/hadoop-env.sh (lines starting with - are removed, lines starting with + are added)
      # remote nodes.
      # The java implementation to use.  Required.
    - # export JAVA_HOME=/usr/lib/j2sdk1.5-sun
    + export JAVA_HOME=/usr/lib/jvm/java-6-sun
    + export HADOOP_HOME=/opt/hadoop-0.18.3
    + export HADOOP_CONF_DIR=$HADOOP_HOME/conf
    + export HADOOP_LOG_DIR=/root/hadoop/logs
      # Extra Java CLASSPATH elements.  Optional.
      # export HADOOP_CLASSPATH=
  • Edit /opt/hadoop-0.18.3/conf/hadoop-site.xml (lines starting with + are added)
      <!-- Put site-specific property overrides in this file. -->

      <configuration>
    +   <property>
    +     <name>fs.default.name</name>
    +     <value>hdfs://gm2.nchc.org.tw:9000/</value>
    +     <description>
    +       The name of the default file system. Either the literal string
    +       "local" or a host:port for NDFS.
    +     </description>
    +   </property>
    +   <property>
    +     <name>mapred.job.tracker</name>
    +     <value>gm2.nchc.org.tw:9001</value>
    +     <description>
    +       The host and port that the MapReduce job tracker runs at. If
    +       "local", then jobs are run in-process as a single map and
    +       reduce task.
    +     </description>
    +   </property>
      </configuration>
  • Edit /opt/hadoop/conf/slaves
    hadoop102
    hadoop103
    hadoop104
    hadoop105
    hadoop106
    hadoop107
    hadoop
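
Every name listed in conf/slaves has to resolve on the namenode, so it can help to cross-check the slaves file against /etc/hosts before starting Hadoop. The following is a minimal sketch and not part of Hadoop; unresolved_slaves is a hypothetical helper, and it only consults the hosts file it is given, not DNS.

```shell
#!/bin/sh
# Print every non-empty line of the slaves file ($2) that does not appear
# as a host name or alias in the hosts file ($1).
unresolved_slaves() {
    awk '
        FILENAME == ARGV[1] {            # first file: hosts
            sub(/#.*/, "")               # drop comments
            for (i = 2; i <= NF; i++) known[$i] = 1
            next
        }
        NF && !($1 in known) { print $1 }   # second file: slaves
    ' "$1" "$2"
}

# Usage: unresolved_slaves /etc/hosts /opt/hadoop/conf/slaves
```

With the entries shown on this page, the last slaves entry, hadoop, would be reported unless /etc/hosts already maps that name somewhere above the pasted lines.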
    
    

3. DRBL Operations

Boot the clients

  • Power on all the clients; the DRBL layout then looks like this:
    ******************************************************
              NIC    NIC IP                    Clients
    +------------------------------+
    |         DRBL SERVER          |
    |                              |
    |    +-- [eth2] 140.110.xxx.130|   +- to WAN
    |                              |
    |    +-- [eth1] 192.168.1.254 +- to clients group 1 [ 6 clients, their IP
    |                              |             from 192.168.1.2 - 192.168.1.7]
    +------------------------------+
    ******************************************************
    Total clients: 6
    ******************************************************
    

ssh

  • Edit /etc/ssh/ssh_config and add
    StrictHostKeyChecking no
    
  • Run
    $ ssh-keygen -t rsa -b 1024 -N "" -f ~/.ssh/id_rsa
    $ cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
    
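  • Write an automation script, auto.shell, and run it
    #!/bin/bash
    
    # Copy the key pair and the ssh client config to every client
    # (192.168.1.2 - 192.168.1.7), then restart sshd on each one.
    for ((i=2;i<=7;i++));
    do
     scp -r ~/.ssh/ "192.168.1.$i":~/
     scp /etc/ssh/ssh_config "192.168.1.$i":/etc/ssh/ssh_config
     ssh "192.168.1.$i" /etc/init.d/ssh restart
    done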
    
  • If everything went well, you can now log in to the clients without a password
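
Passwordless login can be verified for all clients in one pass. The sketch below is an assumption-laden illustration rather than part of the original procedure: check_logins and the SSH_CMD override are hypothetical names, while BatchMode and ConnectTimeout are standard OpenSSH client options.

```shell
#!/bin/sh
# Try a non-interactive login to 192.168.1.<first> .. 192.168.1.<last>.
# BatchMode=yes makes ssh fail instead of prompting for a password.
# SSH_CMD can be overridden for dry runs; it defaults to ssh.
check_logins() {
    i=$1
    while [ "$i" -le "$2" ]; do
        host="192.168.1.$i"
        if "${SSH_CMD:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$host" true \
            </dev/null >/dev/null 2>&1; then
            echo "ok $host"
        else
            echo "FAIL $host"
        fi
        i=$((i + 1))
    done
}

# Usage: check_logins 2 7
```

Any host reported as FAIL still asks for a password or is unreachable, and Hadoop's start scripts would stall on it.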

dsh

  • This section is optional and can be skipped
$ sudo apt-get install dsh
$ mkdir -p ~/.dsh
$ for ((i=2;i<=7;i++)); do echo "192.168.1.$i" >> ~/.dsh/machines.list; done

Then run

$ dsh -a source /etc/bash.bashrc
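
Because the loop that builds machines.list appends with >>, running it a second time duplicates every entry. A sketch of a rerunnable variant follows; write_machines_list is an illustrative name, and the path in the usage line assumes the per-user ~/.dsh/machines.list file that dsh -a reads.

```shell
#!/bin/sh
# Rewrite the dsh machine list ($1) with 192.168.1.<first> .. 192.168.1.<last>.
# The file is replaced, not appended to, so reruns do not create duplicates.
write_machines_list() {
    list=$1
    i=$2
    mkdir -p "$(dirname "$list")"
    : > "$list"
    while [ "$i" -le "$3" ]; do
        echo "192.168.1.$i" >> "$list"
        i=$((i + 1))
    done
}

# Usage: write_machines_list "$HOME/.dsh/machines.list" 2 7
```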

DRBL Server as Hadoop namenode

  • Start (run from /opt/hadoop)
    bin/hadoop namenode -format
    bin/start-all.sh
    
  • Test
    mkdir input
    cp *.txt input/
    bin/hadoop dfs -put input input
    bin/hadoop jar hadoop-*-examples.jar wordcount input output
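
To sanity-check the job, its result can be compared with a word count computed locally on the same input. The sketch below assumes the example job splits on whitespace and writes one "word<TAB>count" line per word; local_wordcount is a hypothetical helper, and the part-00000 file name in the usage comment is the usual name of a single reducer's output, assumed here rather than taken from this page.

```shell
#!/bin/sh
# Count whitespace-separated words in the given files and print
# "word<TAB>count" lines sorted in byte order.
local_wordcount() {
    cat "$@" |
        tr -s '[:space:]' '\n' |
        grep -v '^$' |
        LC_ALL=C sort |
        uniq -c |
        awk '{ print $2 "\t" $1 }'
}

# Usage: local_wordcount input/*.txt > local.txt
#        bin/hadoop dfs -cat output/part-00000 | diff - local.txt
```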
    

References

Jazz: DRBL_Hadoop

Hadoop manual

Troubleshooting

  • DRBL installation does not seem to go smoothly

Running drblsrv -i produces the following error message:

Kernel 2.6 was found, so default to use initramfs.
The requested kernel "" 2.6.18-6-amd64 kernel files are NOT found in  /tftpboot/node_root/lib/modules/s and /tftpboot/node_root/boot in the server! The necessary modules in the network initrd can NOT be created!
Client will NOT remote boot correctly!
Program terminated!
Done!

On my side, installing Debian 4.0r6 amd64 in VMware does not reproduce this problem.

P.S. The cause was that the apt mirror had not replicated the package data, so the new kernel could not be installed, which led to this error.
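
Since the error comes down to kernel files that were never installed, a quick check before rerunning drblsrv -i is to confirm that the kernel image and module directory exist for the requested version. This is a rough sketch, assuming Debian's usual layout (boot/vmlinuz-<version> and lib/modules/<version>); kernel_files_present is an illustrative name, not a DRBL tool.

```shell
#!/bin/sh
# Succeed if both the kernel image and the module directory for version $2
# exist under the root directory $1 (use / for the running system).
kernel_files_present() {
    root=${1%/}
    kver=$2
    missing=0
    for path in "$root/boot/vmlinuz-$kver" "$root/lib/modules/$kver"; do
        if [ ! -e "$path" ]; then
            echo "missing: $path" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage: kernel_files_present / 2.6.18-6-amd64
```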