Lab 8
Compiling MapReduce Programs in Fully Distributed Mode
Compiling a Hadoop MapReduce Java Program on a Hadoop Cluster
For the following exercises, connect to hadoop.nchc.org.tw. In the examples below, hXXXX stands for your user name.
Practice 1 : Word Count (Basic)
- Upload the input data to HDFS
$ mkdir lab8_input
$ echo "I like HINET Cloud Course." > lab8_input/input1
$ echo "I like hinet Cloud Course, and we enjoy this course." > lab8_input/input2
$ hadoop fs -put lab8_input lab8_input
$ hadoop fs -ls lab8_input
Found 2 items
-rw-r--r-- 2 hXXXX supergroup 26 2011-04-19 10:07 /user/hXXXX/lab8_input/input1
-rw-r--r-- 2 hXXXX supergroup 52 2011-04-19 10:07 /user/hXXXX/lab8_input/input2
- Download WordCount.java and save it to your home directory
~$ wget http://hadoop.nchc.org.tw/WordCount.java
- Compile WordCount.java and run it with the hadoop jar command
$ mkdir MyJava
$ ln -s /usr/lib/hadoop/hadoop-*-core.jar hadoop-core.jar
$ javac -classpath hadoop-core.jar -d MyJava WordCount.java
$ jar -cvf wordcount.jar -C MyJava .
$ hadoop jar wordcount.jar WordCount lab8_input/ lab8_out1/
$ hadoop fs -cat lab8_out1/part-00000
- Results in lab8_out1
You should see results like this:
Cloud 2
Course, 1
Course. 1
I 2
HINET 1
and 1
course. 1
enjoy 1
like 2
hinet 1
this 1
we 1
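The counts above follow from how the job splits its input into words. WordCount.java is not reproduced on this page, so the sketch below only assumes that it splits each line on whitespace, as the standard Hadoop tutorial WordCount does. The class name WordCountSketch and its count method are hypothetical, and the code uses no Hadoop classes, so it runs locally with a plain JDK. It illustrates why "Course," and "Course." are counted as two different keys.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical local stand-in for the job; assumes whitespace tokenization.
final class WordCountSketch {

    // "Map" step: emit every whitespace-separated token.
    // "Reduce" step: sum the occurrences of each token.
    static Map<String, Integer> count(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            StringTokenizer tokens = new StringTokenizer(line);
            while (tokens.hasMoreTokens()) {
                counts.merge(tokens.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Same two lines that were written to lab8_input/input1 and input2.
        List<String> lines = Arrays.asList(
                "I like HINET Cloud Course.",
                "I like hinet Cloud Course, and we enjoy this course.");
        count(lines).forEach((word, n) -> System.out.println(word + " " + n));
    }
}
```

Because punctuation stays attached to the token, three spellings of "course" appear in the output. Practice 2 addresses this.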
Practice 2 : Word Count (Advanced)
$ echo "\." >pattern.txt && echo "\," >>pattern.txt
$ hadoop fs -put pattern.txt .
$ mkdir -p MyJava2
- Download WordCount2.java and save it to your home directory
~$ wget http://hadoop.nchc.org.tw/WordCount2.java
$ javac -classpath hadoop-core.jar -d MyJava2 WordCount2.java
$ jar -cvf wordcount2.jar -C MyJava2 .
$ hadoop jar wordcount2.jar WordCount2 lab8_input lab8_out2 -skip pattern.txt
$ hadoop fs -cat lab8_out2/part-00000
- Results in lab8_out2
You should see results like this:
Cloud 2
Course 2
I 2
HINET 1
and 1
course 1
enjoy 1
like 2
hinet 1
this 1
we 1
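The effect of -skip pattern.txt can be reproduced locally. WordCount2.java is not shown on this page. The sketch below assumes it behaves like WordCount v2.0 from the Hadoop MapReduce tutorial, where every line of the skip file is treated as a regular expression and all of its matches are deleted from the input line before tokenizing. SkipPatternSketch and its count method are hypothetical names, and no Hadoop classes are involved.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical local sketch of skip-pattern handling; assumes each
// pattern is a regex whose matches are removed before tokenizing.
final class SkipPatternSketch {

    static Map<String, Integer> count(List<String> lines, List<String> skipPatterns) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            String cleaned = line;
            for (String pattern : skipPatterns) {
                cleaned = cleaned.replaceAll(pattern, "");
            }
            StringTokenizer tokens = new StringTokenizer(cleaned);
            while (tokens.hasMoreTokens()) {
                counts.merge(tokens.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "I like HINET Cloud Course.",
                "I like hinet Cloud Course, and we enjoy this course.");
        // Same two regexes that the echo commands wrote to pattern.txt: \. and \,
        List<String> patterns = Arrays.asList("\\.", "\\,");
        count(lines, patterns).forEach((word, n) -> System.out.println(word + " " + n));
    }
}
```

With the period and comma removed, "Course," and "Course." collapse into a single key "Course" with count 2, while lowercase "course" remains a separate key.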
- Now run the same example with case-insensitive counting in addition to the skip patterns
$ hadoop jar wordcount2.jar WordCount2 -Dwordcount.case.sensitive=false lab8_input lab8_out3 -skip pattern.txt
$ hadoop fs -cat lab8_out3/part-00000
- Results in lab8_out3
You should see results like this:
and 1
cloud 2
course 3
enjoy 1
i 2
like 2
hinet 2
this 1
we 1
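The combined effect of -Dwordcount.case.sensitive=false and the skip patterns can also be checked locally. The sketch below assumes, following WordCount v2.0 in the Hadoop MapReduce tutorial, that the job lowercases each line when the property is false and then removes the skip patterns before tokenizing. CaseFoldSketch, its count method and the caseSensitive parameter are hypothetical stand-ins for the real job and its configuration lookup.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical local sketch; caseSensitive plays the role of the
// wordcount.case.sensitive property passed with -D on the command line.
final class CaseFoldSketch {

    static Map<String, Integer> count(List<String> lines,
                                      List<String> skipPatterns,
                                      boolean caseSensitive) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            // Assumed order: fold case first, then strip the skip patterns.
            String cleaned = caseSensitive ? line : line.toLowerCase(Locale.ROOT);
            for (String pattern : skipPatterns) {
                cleaned = cleaned.replaceAll(pattern, "");
            }
            StringTokenizer tokens = new StringTokenizer(cleaned);
            while (tokens.hasMoreTokens()) {
                counts.merge(tokens.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "I like HINET Cloud Course.",
                "I like hinet Cloud Course, and we enjoy this course.");
        List<String> patterns = Arrays.asList("\\.", "\\,");
        count(lines, patterns, false)
                .forEach((word, n) -> System.out.println(word + " " + n));
    }
}
```

Folding case merges "Course" and "course" into one key with count 3, and "HINET" and "hinet" into one key with count 2, which matches the nine keys listed for lab8_out3.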
Last modified on Jul 3, 2012, 12:11:55 PM