Lab 6: Compiling Hadoop Programs
Preface: Starting the Hadoop Environment
- Restart yesterday's environment
- Run the following on node1
$ cd ~
$ wget http://hadoop.nchc.org.tw/~waue/clean.sh
$ chmod 755 clean.sh
$ ./clean.sh
- Check that Hadoop is running correctly.
- Run the following on node1
Exercise 1: Word Count, Basic Version
- Upload the input files to HDFS
$ cd /opt/hadoop
$ bin/hadoop dfs -mkdir input
$ echo "I like NCHC Cloud Course." > input1
$ echo "I like nchc Cloud Course, and we enjoy this course." > input2
$ bin/hadoop dfs -put input1 input
$ bin/hadoop dfs -put input2 input
$ bin/hadoop dfs -ls input
- Follow the WordCount.java link and save the file to /opt/hadoop.
- Compile and run the program
$ mkdir MyJava
$ javac -classpath hadoop-*-core.jar -d MyJava WordCount.java
$ jar -cvf wordcount.jar -C MyJava .
$ bin/hadoop jar wordcount.jar WordCount input/ output/
$ bin/hadoop dfs -cat output/part-00000
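To get a rough idea of what output/part-00000 should contain, the counting can be reproduced without a cluster. The following is a minimal plain-Java sketch, assuming the linked WordCount.java splits each line on whitespace as the standard Hadoop WordCount example does. The class name LocalWordCount and its count method are hypothetical and are not part of the lab files.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical stand-alone sketch (Java 8 or later); it uses no Hadoop API.
public class LocalWordCount {

    // Counts whitespace-separated tokens across all lines.
    // A TreeMap keeps the words sorted by key.
    public static Map<String, Integer> count(String... lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            StringTokenizer tokens = new StringTokenizer(line);
            while (tokens.hasMoreTokens()) {
                // Add 1 to the running total for this token.
                counts.merge(tokens.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // The same two lines that were written to input1 and input2.
        Map<String, Integer> counts = count(
            "I like NCHC Cloud Course.",
            "I like nchc Cloud Course, and we enjoy this course.");
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            System.out.println(entry.getKey() + "\t" + entry.getValue());
        }
    }
}
```

Under this assumption punctuation stays attached to the tokens, so "Course." and "Course," count as different words, and "NCHC" and "nchc" are counted separately. Exercise 2 addresses both.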
Exercise 2: Word Count, Advanced Version
$ echo "\." >pattern.txt && echo "\," >>pattern.txt
$ bin/hadoop dfs -put pattern.txt ./
$ mkdir MyJava2
- Follow the WordCount2.java link and save the file to /opt/hadoop.
$ javac -classpath hadoop-*-core.jar -d MyJava2 WordCount2.java
$ jar -cvf wordcount2.jar -C MyJava2 .
$ bin/hadoop jar wordcount2.jar WordCount2 input output2 -skip pattern.txt
$ bin/hadoop dfs -cat output2/part-00000
$ bin/hadoop jar wordcount2.jar WordCount2 -Dwordcount.case.sensitive=false input output3 -skip pattern.txt
$ bin/hadoop dfs -cat output3/part-00000
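The effect of -skip pattern.txt and -Dwordcount.case.sensitive=false can also be sketched outside Hadoop. This assumes WordCount2.java behaves like the standard Hadoop WordCount v2 example: optionally lower-case the line, delete every match of each skip pattern (treated as a regular expression), then split on whitespace. The class LocalWordCount2 and its count method are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical stand-alone sketch (Java 8 or later); it uses no Hadoop API.
public class LocalWordCount2 {

    // Counts tokens after optional lower-casing and removal of skip patterns.
    public static Map<String, Integer> count(List<String> lines,
                                             List<String> skipPatterns,
                                             boolean caseSensitive) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String raw : lines) {
            String line = caseSensitive ? raw : raw.toLowerCase(Locale.ROOT);
            // Each skip pattern is a regular expression; its matches are deleted.
            for (String pattern : skipPatterns) {
                line = line.replaceAll(pattern, "");
            }
            StringTokenizer tokens = new StringTokenizer(line);
            while (tokens.hasMoreTokens()) {
                counts.merge(tokens.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "I like NCHC Cloud Course.",
            "I like nchc Cloud Course, and we enjoy this course.");
        // The two expressions written to pattern.txt: \. and \,
        List<String> skip = Arrays.asList("\\.", "\\,");
        System.out.println(count(lines, skip, true));   // like the output2 run
        System.out.println(count(lines, skip, false));  // like the output3 run
    }
}
```

With the skip patterns applied, "Course." and "Course," both become "Course". Turning case sensitivity off additionally merges "Course" with "course" and "NCHC" with "nchc", which is the difference to look for between output2 and output3.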
Last modified on Jan 18, 2010, 8:35:02 PM