[[PageOutline]]
= Hadoop Hands-on Labs (1) =
== Basic DFS commands / Setting up a basic Hadoop DFS test environment ==
1. Download hadoop-0.18.2
{{{
$ cd ~
$ wget http://ftp.twaren.net/Unix/Web/apache/hadoop/core/hadoop-0.18.2/hadoop-0.18.2.tar.gz
$ tar zxvf hadoop-0.18.2.tar.gz
}}}
2. Hadoop uses SSH for its internal connections, so an SSH key exchange is required
{{{
~$ ssh-keygen
~$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
}}}
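On a machine where `~/.ssh/authorized_keys` may already hold other keys, the public key should be appended rather than copied over the file, and appended only once. The following is a minimal sketch of that idea and is not part of the original lab; the function name `authorize_key` and its two arguments are hypothetical.

```shell
# Append a public key to an authorized_keys file only if it is not already there.
# authorize_key is a hypothetical helper name, not an OpenSSH or Hadoop command.
authorize_key() {
    pub_file=$1
    auth_file=$2
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    # -F: fixed string, -x: whole-line match, -q: quiet
    if ! grep -qxF -e "$(cat "$pub_file")" "$auth_file"; then
        cat "$pub_file" >> "$auth_file"
    fi
    # Keep the file private to its owner
    chmod 600 "$auth_file"
}

# Usage (paths as in the step above):
#   authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
```

Afterwards, `ssh localhost` should log in without asking for a password, provided the key was generated with an empty passphrase.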
3. The JAVA_HOME environment variable must be set before the hadoop namenode can run
{{{
$ echo "export JAVA_HOME=/usr/lib/jvm/java-6-sun" >> ~/.bash_profile
$ cd ~/hadoop-0.18.2
}}}
4. Edit conf/hadoop-env.sh (set HADOOP_HOME to your own Hadoop installation directory)
{{{
export JAVA_HOME=/usr/lib/jvm/java-6-sun
export HADOOP_HOME=/home/jazz/hadoop-0.18.2/
export HADOOP_CONF_DIR=$HADOOP_HOME/conf
}}}
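Since JAVA_HOME is set in two places (the shell profile and conf/hadoop-env.sh), it can help to confirm that the path really contains a Java launcher before starting any daemon. A minimal sketch follows; `check_java_home` is a hypothetical helper name, and the check only assumes the usual `bin/java` layout of a JDK or JRE.

```shell
# Return 0 if the given directory looks like a Java installation, 1 otherwise.
# check_java_home is a hypothetical helper, not part of Hadoop.
check_java_home() {
    java_home=$1
    if [ -z "$java_home" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$java_home/bin/java" ]; then
        echo "no executable found at $java_home/bin/java" >&2
        return 1
    fi
    return 0
}

# Usage:
#   check_java_home "$JAVA_HOME" && echo "JAVA_HOME looks usable"
```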
5. Edit conf/hadoop-site.xml and add the following settings inside the configuration section
{{{
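<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000/</value>
  <description>
    The name of the default file system. Either the literal string
    "local" or a host:port for NDFS.
  </description>
</property>
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:9001</value>
  <description>
    The host and port that the MapReduce job tracker runs at. If
    "local", then jobs are run in-process as a single map and
    reduce task.
  </description>
</property>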
}}}
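Each setting in hadoop-site.xml is a `property` element with a `name` and a `value`, nested inside `configuration`. As a rough sketch, the two settings of this step could also be generated from the shell; `site_property` and `write_site_xml` are hypothetical helper names, and the optional description text is left out.

```shell
# Print one <property> element for hadoop-site.xml.
# site_property is a hypothetical helper name.
site_property() {
    printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

# Print a complete single-node configuration to stdout.
write_site_xml() {
    echo '<?xml version="1.0"?>'
    echo '<configuration>'
    site_property fs.default.name "hdfs://localhost:9000/"
    site_property mapred.job.tracker "localhost:9001"
    echo '</configuration>'
}

# Usage (this overwrites the existing file, so back it up first):
#   write_site_xml > conf/hadoop-site.xml
```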
6. Start Hadoop with these two commands
{{{
~/hadoop-0.18.2$ bin/hadoop namenode -format
~/hadoop-0.18.2$ bin/start-all.sh
}}}
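On a single node, `bin/start-all.sh` is expected to leave five Java processes running: NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker. One way to check this is to filter the output of the JDK's `jps` tool. The sketch below assumes the usual `<pid> <class name>` output format of `jps`; `missing_daemons` is a hypothetical helper name.

```shell
# Read `jps` output on stdin and print the name of each expected Hadoop daemon
# that is not listed. missing_daemons is a hypothetical helper name.
missing_daemons() {
    # Second column of jps output is the main class name
    running=$(awk '{print $2}')
    for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # -x matches whole lines, so NameNode does not match SecondaryNameNode
        if ! printf '%s\n' "$running" | grep -qx "$daemon"; then
            echo "$daemon"
        fi
    done
}

# Usage (jps ships with the JDK):
#   "$JAVA_HOME/bin/jps" | missing_daemons
```

No output means all five daemons were found; any name printed points to a log file under `logs/` worth reading.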
7. Once the daemons are up, the following three web pages are available
* http://localhost:50030/ (JobTracker)
* http://localhost:50060/ (TaskTracker)
* http://localhost:50070/ (NameNode)
8. You can also put something onto HDFS and take a look
{{{
~/hadoop-0.18.2$ bin/hadoop dfs -put conf conf
~/hadoop-0.18.2$ bin/hadoop dfs -ls
Found 1 items
drwxr-xr-x - jazz supergroup 0 2008-11-04 15:56 /user/jazz/conf
}}}
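Beyond `-put` and `-ls`, the same `dfs` shell offers `-cat` and `-rm`. The sketch below chains the three into a small round trip: upload a file, print it back, delete it. `dfs_roundtrip` is a hypothetical helper name, and the `HADOOP` variable is an assumption introduced here so that the launcher path can be overridden.

```shell
# Copy a local file to HDFS, read it back, and remove it again.
# dfs_roundtrip is a hypothetical helper; HADOOP may be set to the path of the
# hadoop launcher script (default: bin/hadoop, relative to ~/hadoop-0.18.2).
dfs_roundtrip() {
    local_file=$1
    remote_name=$2
    hadoop=${HADOOP:-bin/hadoop}
    # Stop at the first step that fails
    "$hadoop" dfs -put "$local_file" "$remote_name" &&
    "$hadoop" dfs -cat "$remote_name" &&
    "$hadoop" dfs -rm "$remote_name"
}

# Usage, from ~/hadoop-0.18.2 (env-copy.sh is an arbitrary remote name):
#   dfs_roundtrip conf/hadoop-env.sh env-copy.sh
```

Relative remote names such as `env-copy.sh` resolve under the user's HDFS home directory, which is `/user/jazz` in the listing above.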
= Hadoop Hands-on Labs (2) =
== MapReduce programming exercise ==
1. Run the WordCount example
{{{
~/hadoop-0.18.2$ bin/hadoop fs -put conf conf
~/hadoop-0.18.2$ bin/hadoop fs -ls
Found 1 items
drwxr-xr-x - jazz supergroup 0 2008-11-05 19:34 /user/jazz/conf
~/hadoop-0.18.2$ bin/hadoop jar /home/jazz/hadoop-0.18.2/hadoop-0.18.2-examples.jar wordcount
ERROR: Wrong number of parameters: 0 instead of 2.
wordcount [-m <maps>] [-r <reduces>] <input> <output>