[[PageOutline]]
◢ <[wiki:NCTU110329/Lab5 Lab 5]> | <[wiki:NCTU110329 Back to course outline]> ▲ | <[wiki:NCTU110329/Lab7 Lab 7]> ◣
= Lab 6 =
{{{
#!html
Compiling Hadoop MapReduce Java Programs
}}}
= Practice 1 : Word Count (Basic) =
 * Create two small input files and upload them to HDFS
{{{
$ cd /opt/hadoop
$ mkdir lab4_input
$ echo "I like NCHC Cloud Course." > lab4_input/input1
$ echo "I like nchc Cloud Course, and we enjoy this course." > lab4_input/input2
$ bin/hadoop fs -put lab4_input lab4_input
$ bin/hadoop fs -ls lab4_input
}}}
 * Download [http://trac.nchc.org.tw/cloud/raw-attachment/wiki/WordCount/WordCount.java WordCount.java] (alternate link: [http://secuse.nchc.org.tw/class/WordCount.java WordCount.java]) and save it to /opt/hadoop
{{{
~$ cd /opt/hadoop
/opt/hadoop$ wget http://trac.nchc.org.tw/cloud/raw-attachment/wiki/WordCount/WordCount.java
}}}
 * Compile WordCount.java, package it as a jar, and run it with the '''hadoop jar''' command
{{{
$ mkdir MyJava
$ javac -classpath hadoop-*-core.jar -d MyJava WordCount.java
$ jar -cvf wordcount.jar -C MyJava .
$ bin/hadoop jar wordcount.jar WordCount lab4_input/ lab4_out1/
$ bin/hadoop fs -cat lab4_out1/part-00000
}}}
 * Output in lab4_out1. You should see results like this:
{{{
#!text
Cloud 2
Course, 1
Course. 1
I 2
NCHC 1
and 1
course. 1
enjoy 1
like 2
nchc 1
this 1
we 1
}}}
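The listing above counts "Course," and "Course." as different words from "course." because tokens are kept exactly as written. WordCount.java is not reproduced on this page, so the following is only a plain-Java sketch of that behaviour, assuming the program splits each line on whitespace and counts tokens case-sensitively. The class name `WordCountSketch` and its `count` method are hypothetical and do not use the Hadoop API.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Plain-Java sketch of the counting logic (assumption: whitespace tokenizing,
// case-sensitive). This is not the Hadoop job itself.
class WordCountSketch {

    // Count every whitespace-separated token exactly as it appears.
    static Map<String, Integer> count(String... lines) {
        // TreeMap keeps keys in String order; for ASCII text, uppercase
        // letters sort before lowercase ones.
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            StringTokenizer tokens = new StringTokenizer(line);
            while (tokens.hasMoreTokens()) {
                // Add 1 to the running total for this token.
                counts.merge(tokens.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // The same two lines that were written to lab4_input.
        Map<String, Integer> counts = count(
            "I like NCHC Cloud Course.",
            "I like nchc Cloud Course, and we enjoy this course.");
        counts.forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

Run locally with `javac WordCountSketch.java && java WordCountSketch`; it prints the same twelve words and counts as the lab4_out1 listing, which makes a quick cross-check without a cluster.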
-----
= Practice 2 : Word Count (Advanced) =
 * Create a pattern file listing the strings to skip (a literal period and a literal comma), upload it to HDFS, and create a build directory
{{{
$ echo "\." >pattern.txt && echo "\," >>pattern.txt
$ bin/hadoop fs -put pattern.txt ./
$ mkdir MyJava2
}}}
 * Download [raw-attachment:wiki:Hadoop_Lab4:WordCount2.java WordCount2.java] and save it to /opt/hadoop
{{{
~$ cd /opt/hadoop
/opt/hadoop$ wget http://trac.nchc.org.tw/cloud/raw-attachment/wiki/Hadoop_Lab4/WordCount2.java
}}}
 * Compile WordCount2.java, package it as a jar, and run it with the '''-skip''' option pointing at pattern.txt
{{{
$ javac -classpath hadoop-*-core.jar -d MyJava2 WordCount2.java
$ jar -cvf wordcount2.jar -C MyJava2 .
$ bin/hadoop jar wordcount2.jar WordCount2 lab4_input lab4_out2 -skip pattern.txt
$ bin/hadoop fs -cat lab4_out2/part-00000
}}}
 * Output in lab4_out2. You should see results like this:
{{{
#!text
Cloud 2
Course 2
I 2
NCHC 1
and 1
course 1
enjoy 1
like 2
nchc 1
this 1
we 1
}}}
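With the skip patterns applied, "Course," and "Course." merge into "Course". The sketch below illustrates one way this can work, assuming each line of pattern.txt is treated as a regular expression and every match is deleted from the input line before tokenizing. WordCount2.java is not shown on this page, so the class `SkipPatternSketch` and its `count` method are hypothetical, plain-Java stand-ins.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Plain-Java sketch of pattern skipping (assumption: each pattern is a
// regular expression whose matches are removed before counting).
class SkipPatternSketch {

    static Map<String, Integer> count(List<String> skipPatterns, String... lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            String cleaned = line;
            for (String pattern : skipPatterns) {
                // Delete every match of the pattern from the line.
                cleaned = cleaned.replaceAll(pattern, "");
            }
            StringTokenizer tokens = new StringTokenizer(cleaned);
            while (tokens.hasMoreTokens()) {
                counts.merge(tokens.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Same content as pattern.txt: \. and \, (backslashes doubled for Java).
        List<String> patterns = Arrays.asList("\\.", "\\,");
        Map<String, Integer> counts = count(patterns,
            "I like NCHC Cloud Course.",
            "I like nchc Cloud Course, and we enjoy this course.");
        counts.forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

The backslash in `\.` matters: an unescaped `.` is a regular expression that matches any character, so it would delete the whole line rather than just the periods. Running the sketch prints the same eleven entries as the lab4_out2 listing.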
 * Now run the same example with case-insensitive counting ('''-Dwordcount.case.sensitive=false''') in addition to the skip patterns
{{{
/opt/hadoop$ bin/hadoop jar wordcount2.jar WordCount2 -Dwordcount.case.sensitive=false lab4_input lab4_out3 -skip pattern.txt
/opt/hadoop$ bin/hadoop fs -cat lab4_out3/part-00000
}}}
 * Output in lab4_out3. You should see results like this:
{{{
#!text
and 1
cloud 2
course 3
enjoy 1
i 2
like 2
nchc 2
this 1
we 1
}}}
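Setting `wordcount.case.sensitive=false` folds "NCHC" and "nchc" together and raises "course" to 3. A minimal sketch of that combination follows, assuming the line is lowercased first and the skip patterns are removed afterwards. The class `CaseInsensitiveSketch` and the `caseSensitive` parameter are hypothetical illustrations in plain Java, not the code of WordCount2.java.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Plain-Java sketch combining case folding with pattern skipping
// (assumption: lowercase first, then delete pattern matches).
class CaseInsensitiveSketch {

    static Map<String, Integer> count(boolean caseSensitive,
                                      List<String> skipPatterns,
                                      String... lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            // Fold to lowercase only when case-insensitive counting is requested.
            String cleaned = caseSensitive ? line : line.toLowerCase(Locale.ROOT);
            for (String pattern : skipPatterns) {
                cleaned = cleaned.replaceAll(pattern, "");
            }
            StringTokenizer tokens = new StringTokenizer(cleaned);
            while (tokens.hasMoreTokens()) {
                counts.merge(tokens.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> patterns = Arrays.asList("\\.", "\\,");
        Map<String, Integer> counts = count(false, patterns,
            "I like NCHC Cloud Course.",
            "I like nchc Cloud Course, and we enjoy this course.");
        counts.forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

Running the sketch prints the same nine entries as the lab4_out3 listing. Passing `true` as the first argument reproduces the case-sensitive counts from lab4_out2.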