= Simplified Hadoop Cluster Installation and Configuration =

== Original Installation and Configuration Steps ==

=== Runtime Environment ===

All hosts must have the following packages installed:

 * openssh-server
 * sun-java6-bin
 * sun-java6-jdk
 * sun-java6-jre

{{{
~$ sudo apt-get install openssh-server
~$ sudo apt-get purge java-gcj-compat
~$ sudo apt-get install sun-java6-bin sun-java6-jdk sun-java6-jre
}}}

== Configuration Table (Two-Host Example) ==

||Steps||Host 1||Host 2||
||Role||Data: namenode + datanode[[BR]] Job: jobtracker + tasktracker||Data: datanode[[BR]] Job: tasktracker||
||Step1||Set up passwordless (key-based) SSH login||Set up passwordless (key-based) SSH login||
||Step2||Install Java||Install Java||
||Step3||Download and install Hadoop|| ||
||Step4||Configure hadoop-env.sh|| ||
||Step5||Configure hadoop-site.xml|| ||
||Step6||Configure masters and slaves|| ||
||Step7|| ||Copy the contents of Hadoop_Home to Host 2 (or the other slaves)||
||Step8||Format HDFS|| ||
||Step9||Start Hadoop|| ||

== When the Datanode Cannot Be Found ==

=== Error Message ===

 * /tmp/hadoop/logs/hadoop-shunfa-datanode-shunfa-VBox1.log

{{{
#!text
2010-05-03 15:27:26,322 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = shunfa-VBox1/127.0.1.1
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
2010-05-03 15:27:30,640 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /var/hadoop/hadoop-shunfa/dfs/data: namenode namespaceID = 812261000; datanode namespaceID = 2021031637
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:233)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:148)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:298)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:216)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1283)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1238)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1246)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1368)

2010-05-03 15:27:30,648 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at shunfa-VBox1/127.0.1.1
************************************************************/
}}}

=== Solution ===

 * The datanode's namespaceID does not match the namenode's. Edit the datanode's namespaceID so that it matches the value reported for the namenode in the log (file: /var/hadoop/hadoop-shunfa/dfs/data/current/VERSION):

{{{
#!text
namespaceID=2021031637
change to
namespaceID=812261000
}}}

 * Restart the datanode:

{{{
hadooper-1:~$ /opt/hadoop/bin/hadoop-daemon.sh start datanode
hadooper-2:~$ /opt/hadoop/bin/hadoop-daemon.sh start datanode
}}}

== Simplification ==

 * Use a shell script with dialog to simplify the installation process.

=== Single-Node Installation Flow (done) ===

 * Step1: Ask for the host IP address
 * Step2: Ask for the user name (the owner of Hadoop)
 * Step3: Confirm the information (user and IP address)
 * Step4: Start the installation

=== Cluster Installation Flow ===

 * Step1: Ask for the user name (the owner of Hadoop)
 * Step2: Set the master IP address
 * Step3: Set the number of slaves
 * Step4: Set the slaves' IP addresses (written to the file hadoop/conf/slaves)
 * Step5: Start the installation

== References ==

 * [http://archive.cloudera.com/docs/cdh.html Cloudera’s Distribution for Hadoop (CDH)] (provided by Jazz)
 * [http://forum.hadoop.tw/viewtopic.php?f=4&t=43 Hadoop Taiwan: Jazz's reply]
 * [http://trac.nchc.org.tw/cloud/wiki/Hadoop_Lab7 Waue's course material]
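
== Sketch: Scripting the namespaceID Fix ==

The manual edit of the datanode's VERSION file described in the datanode troubleshooting section could be scripted roughly as follows. This is only a minimal sketch: the function name `set_namespace_id` and the `.bak` backup file are assumptions made for illustration and are not part of Hadoop or of the original procedure. It rewrites only the `namespaceID=` line and leaves every other line of the file untouched.

{{{
#!sh
# Hypothetical helper (not part of Hadoop):
#   set_namespace_id <path to VERSION file> <namespaceID of the namenode>
set_namespace_id() {
    version_file=$1
    new_id=$2

    # Keep a copy of the original file so the change can be undone.
    cp "$version_file" "$version_file.bak" || return 1

    # Rewrite only the namespaceID line; all other lines are copied as-is.
    sed "s/^namespaceID=.*/namespaceID=$new_id/" "$version_file.bak" > "$version_file"
}

# Example call, using the path and the namenode ID from the log in this page:
#   set_namespace_id /var/hadoop/hadoop-shunfa/dfs/data/current/VERSION 812261000
}}}

The helper would be run on the affected datanode host while the datanode is stopped, followed by the `hadoop-daemon.sh start datanode` command shown in the solution.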
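
== Sketch: Writing conf/slaves from the Collected IP Addresses ==

Steps 3 and 4 of the cluster installation flow collect the slave IP addresses and store them in hadoop/conf/slaves, one host per line. A minimal sketch of that last part might look like the following. The function name `write_slaves`, the /opt/hadoop/conf location, and the example addresses are assumptions for illustration; the dialog prompts that gather the addresses are left out.

{{{
#!sh
# Hypothetical helper (not part of Hadoop):
#   write_slaves <path to slaves file> <ip> [<ip> ...]
write_slaves() {
    slaves_file=$1
    shift

    # Start from an empty file so that stale entries do not survive.
    : > "$slaves_file" || return 1

    # Hadoop reads the slaves file as one host per line.
    for ip in "$@"; do
        printf '%s\n' "$ip" >> "$slaves_file"
    done
}

# Example call with made-up addresses:
#   write_slaves /opt/hadoop/conf/slaves 192.168.1.2 192.168.1.3
}}}

Truncating the file first keeps the result equal to exactly the list entered in the installer, even when the script is run more than once.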