Changes between Version 27 and Version 28 of MR_manual
Timestamp: Sep 3, 2008, 1:17:30 PM
MR_manual
v27 v28 40 40 [http://tech.ccidnet.com/art/5833/20080318/1393525_1.html copied from 詳細講解HBase] 41 41 = 二、環境設定 = 42 43 == 2.1 Prepare == 44 System : 45 * Ubuntu 7.10 46 * Hadoop 0.16 47 * Hbase 0.1.3 48 ps : hbase 0.1.4 <--> hadoop 0.2.0 49 Requirement : 50 * Eclipse (3.2.2) 51 {{{ 52 $ apt-get install eclipse 53 }}} 54 java 6 55 {{{ 56 $ apt-get install sun-java6-bin sun-java6-jdk sun-java6-jre sun-java6-plugin 57 }}} 58 suggest to remove the default java compiler 「 gcj 」 59 {{{ 60 $ apt-get purge java-gcj-compat 61 }}} 62 Append two codes to /etc/bash.bashrc to setup Java Class path 63 {{{ 64 export JAVA_HOME=/usr/lib/jvm/java-6-sun 65 export HADOOP_HOME=/home/waue/workspace/hadoop/ 66 export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar 67 }}} 68 Building UP Path 42 所對應到的路徑為 69 43 || Name || Path || 70 44 || Java Home || /usr/lib/jvm/java-6-sun || … … 72 46 || Hbase Home || /home/waue/workspace/hbase/ || 73 47 74 Nodes set 48 節點 75 49 || node name || server || 76 50 || cloud1 || v || 77 51 || cloud2 || || 78 52 || cloudn || || 79 53 == 2.1 準備 == 54 系統 : 55 * Ubuntu 7.10 56 * Hadoop 0.16 57 * Hbase 0.1.3 58 ps : 若要升級則需要兩者都升級 hbase 0.1.4 <--> hadoop 0.2.0 59 * Eclipse (3.2.2) 60 {{{ 61 $ apt-get install eclipse 62 }}} 63 java 6 64 {{{ 65 $ apt-get install sun-java6-bin sun-java6-jdk sun-java6-jre sun-java6-plugin 66 }}} 67 建議刪除原本的 「 gcj 」 68 {{{ 69 $ apt-get purge java-gcj-compat 70 }}} 71 加入以下內容到 /etc/bash.bashrc 72 {{{ 73 export JAVA_HOME=/usr/lib/jvm/java-6-sun 74 export HADOOP_HOME=/home/waue/workspace/hadoop/ 75 export HBASE_HOME=/home/waue/workspace/hbase/ 76 export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar 77 }}} 80 78 == 2.2 Hadoop Setup == 81 79 === 2.2.1. Generate an SSH key for the user === … … 83 81 $ ssh-keygen -t rsa -P "" 84 82 $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys 85 $ ssh localhost83 $ ssh cloud1 86 84 $ exit 87 85 }}} … … 89 87 {{{ 90 88 $ cd /home/waue/workspace 91 $ sudo tar xzf hadoop-0.16. 0.tar.gz92 $ sudo mv hadoop-0.16.0hadoop89 $ sudo tar xzf hadoop-0.16.3.tar.gz 90 $ sudo ln -sf hadoop-0.16.3 hadoop 93 91 $ sudo chown -R waue:waue hadoop 94 92 $ cd hadoop … … 106 104 export JAVA_HOME=/usr/lib/jvm/java-6-sun 107 105 export HADOOP_HOME=/home/waue/workspace/hadoop 106 export HBASE_HOME=/home/waue/workspace/hbase 108 107 export HADOOP_LOG_DIR=$HADOOP_HOME/logs 109 108 export HADOOP_SLAVES=$HADOOP_HOME/conf/slaves 110 }}} 109 export HADOOP_CLASSPATH= $HBASE_HOME/hbase-0.1.3.jar:$HBASE_HOME/conf 110 }}} 111 ps. HADOOP_CLASSPATH 要設hbase 的環境,而HBASE_CLASSPATH要設hadoop的環境, 112 有了這行可以解決編譯hbase 程式時出現run time error 113 111 114 2. 
2. hadoop-site.xml ($HADOOP_HOME/conf/)[[BR]]
Modify the contents of conf/hadoop-site.xml as below (v28 raises mapred.map.tasks and mapred.reduce.tasks from 1 to 9):
{{{
...
<property>
  <name>mapred.map.tasks</name>
  <value>9</value>
  <description>
  define mapred.map tasks to be number of slave hosts
  </description>
</property>
<property>
  <name>mapred.reduce.tasks</name>
  <value>9</value>
  <description>
  define mapred.reduce tasks to be number of slave hosts
  </description>
</property>
...
}}}
 * Connect hbase to the hadoop DFS.
 * Edit the conf/hbase-site.xml file as below, and also copy it to $HADOOP_HOME/conf (the copy step is new in v28):
{{{
<configuration>
  <property>
    <name>hbase.master</name>
    <value>cloud1:60000</value>
  </property>
  <property>
    <name>hbase.master.info.bindAddress</name>
    <value>cloud1</value>
    <description>The address for the hbase master web UI</description>
  </property>
  <property>
    <name>hbase.regionserver.info.bindAddress</name>
    <value>cloud1</value>
    <description>The address for the hbase regionserver web UI
    </description>
  </property>
  ...
    <name>hbase.rootdir</name>
    <value>file:///tmp/hbase-${user.home}/hbase</value>
    <value>hdfs://cloud1:9000/hbase</value>
    <description>
    The directory shared by region servers.
  ...
}}}
 * For multi-node mode, edit the conf/regionservers file as below (v27 labelled this step hbase-site.xml):
{{{
cloud1
...
}}}
...
{{{
starting namenode, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-namenode-Dx7200.out
cloud1: starting datanode, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-datanode-Dx7200.out
cloud1: starting secondarynamenode, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-secondarynamenode-Dx7200.out
starting jobtracker, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-jobtracker-Dx7200.out
cloud1: starting tasktracker, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-tasktracker-Dx7200.out
}}}
 * Then make sure http://cloud1:50030/ is reachable from your browser. [[br]]

 * Ps: if your system shows errors after a restart, you can do the following to clean things up and start anew, then repeat from 「4. start up Hadoop」.
...
Click the blue elephant icon to add a new MapReduce server location.
 Server name : any_you_want
 Hostname : cloud1
 Installation directory: /home/waue/workspace/nutch/
 Username : waue
...
 * A 「console」 tab will appear beside the 「!MapReduce Server」 tab.

 * While the Map Reduce job is running, you can visit http://cloud1:50030/ to watch Hadoop dispatch the Map Reduce tasks.
 * After it finishes, you can go to http://cloud1:50060/ to see the result.
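With the cluster running and jobs launched from Eclipse, it can help to confirm which configuration a client program actually picks up (for example, the mapred.map.tasks and mapred.reduce.tasks values raised to 9 above). The following is only a small illustrative sketch, not part of the manual: the class name ConfCheck is made up, and it assumes the old org.apache.hadoop.mapred.JobConf API from Hadoop 0.16, which loads hadoop-site.xml from the classpath.
{{{
// ConfCheck.java -- hypothetical helper, not part of Hadoop.
// Prints a few values to verify that conf/hadoop-site.xml is on the classpath
// and is the configuration a job would actually see.
import org.apache.hadoop.mapred.JobConf;

public class ConfCheck {
    public static void main(String[] args) {
        // JobConf loads hadoop-default.xml and hadoop-site.xml as default resources.
        JobConf conf = new JobConf(ConfCheck.class);
        System.out.println("fs.default.name     = " + conf.get("fs.default.name"));
        System.out.println("mapred.job.tracker  = " + conf.get("mapred.job.tracker"));
        System.out.println("mapred.map.tasks    = " + conf.getNumMapTasks());
        System.out.println("mapred.reduce.tasks = " + conf.getNumReduceTasks());
    }
}
}}}
If the printed values are still the shipped defaults rather than the cluster settings, $HADOOP_HOME/conf is probably not on the classpath of the Eclipse project.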
...

= 7. Reference =