Changes between Initial Version and Version 1 of NCTU110329/Lab6


Timestamp: Apr 12, 2011, 11:47:31 AM
Author: jazz
[[PageOutline]]

◢ <[wiki:NCTU110329/Lab5 Lab 5]> | <[wiki:NCTU110329 Back to Course Outline]> ▲ | <[wiki:NCTU110329/Lab7 Lab 7]> ◣

= Lab 6 =

{{{
#!html
<div style="text-align: center;"><big style="font-weight: bold;"><big>Compiling a Hadoop MapReduce Java Program</big></big></div>
}}}

= Practice 1 : Word Count (Basic) =

 * Upload the input data to HDFS

{{{
$ cd /opt/hadoop
$ mkdir lab4_input
$ echo "I like NCHC Cloud Course." > lab4_input/input1
$ echo "I like nchc Cloud Course, and we enjoy this course." > lab4_input/input2
$ bin/hadoop fs -put lab4_input lab4_input
$ bin/hadoop fs -ls lab4_input
}}}

 * Download [http://trac.nchc.org.tw/cloud/raw-attachment/wiki/WordCount/WordCount.java WordCount.java] and save it to /opt/hadoop
{{{
$ cd /opt/hadoop
$ wget http://trac.nchc.org.tw/cloud/raw-attachment/wiki/WordCount/WordCount.java
}}}

 * Compile WordCount.java and run it with the '''hadoop jar''' command

{{{
# Compile against the Hadoop core jar; class files go to MyJava
$ mkdir MyJava
$ javac -classpath hadoop-*-core.jar -d MyJava WordCount.java
# Package the compiled classes into a jar
$ jar -cvf wordcount.jar -C MyJava .
# Run the job, then print its output
$ bin/hadoop jar wordcount.jar WordCount lab4_input/ lab4_out1/
$ bin/hadoop fs -cat lab4_out1/part-00000
}}}

 * Contents of lab4_out1. You should see results like this:
{{{
#!text
Cloud   2
Course, 1
Course. 1
I       2
NCHC    1
and     1
course. 1
enjoy   1
like    2
nchc    1
this    1
we      1
}}}
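
The output above treats "Course," and "Course." as different words, which is what plain whitespace tokenization would produce. The following is a minimal local sketch of that counting logic in plain Java (Java 8 or later, no Hadoop dependency). The class name `WordCountSketch` is hypothetical, and the sketch only assumes that the job splits lines on whitespace; it is not the contents of WordCount.java.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical stand-in for the map and reduce steps, run in one process.
class WordCountSketch {
    // Counts whitespace-separated tokens. Punctuation stays attached to the
    // token, and the TreeMap keeps keys sorted with uppercase before lowercase.
    static Map<String, Integer> count(String... lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            StringTokenizer tokens = new StringTokenizer(line);
            while (tokens.hasMoreTokens()) {
                counts.merge(tokens.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // The same two lines that were written to lab4_input.
        Map<String, Integer> counts = count(
            "I like NCHC Cloud Course.",
            "I like nchc Cloud Course, and we enjoy this course.");
        counts.forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

Running `main` on the two input lines gives the same twelve keys and counts as the listing above, which is a quick way to check the expected result without a cluster.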
-----

= Practice 2 : Word Count (Advanced) =

 * Create a pattern file listing the characters to skip, upload it to HDFS, and create a build directory
{{{
$ echo "\." > pattern.txt && echo "\," >> pattern.txt
$ bin/hadoop fs -put pattern.txt ./
$ mkdir MyJava2
}}}

 * Download [raw-attachment:wiki:Hadoop_Lab4:WordCount2.java WordCount2.java] and save it to /opt/hadoop
{{{
$ cd /opt/hadoop
$ wget http://trac.nchc.org.tw/cloud/raw-attachment/wiki/Hadoop_Lab4/WordCount2.java
}}}

 * Compile WordCount2.java, package it, and run it with the '''-skip''' option
{{{
$ javac -classpath hadoop-*-core.jar -d MyJava2 WordCount2.java
$ jar -cvf wordcount2.jar -C MyJava2 .
$ bin/hadoop jar wordcount2.jar WordCount2 lab4_input lab4_out2 -skip pattern.txt
$ bin/hadoop fs -cat lab4_out2/part-00000
}}}

 * Contents of lab4_out2. You should see results like this:
{{{
#!text
Cloud   2
Course  2
I       2
NCHC    1
and     1
course  1
enjoy   1
like    2
nchc    1
this    1
we      1
}}}
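
With the skip patterns applied, "Course," and "Course." collapse into a single "Course" entry. One plausible way to get this effect is to delete every match of each pattern from a line before tokenizing it. The sketch below illustrates that idea in plain Java (Java 8 or later); the class name `SkipPatternSketch` is hypothetical, and treating each line of pattern.txt as a regular expression is an assumption, not a statement about how WordCount2.java is written.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical local sketch of skip-pattern handling.
class SkipPatternSketch {
    // Deletes every match of each skip pattern (a regular expression) from
    // the line, then counts the remaining whitespace-separated tokens.
    static Map<String, Integer> count(List<String> skipPatterns, String... lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String pattern : skipPatterns) {
                line = line.replaceAll(pattern, "");
            }
            StringTokenizer tokens = new StringTokenizer(line);
            while (tokens.hasMoreTokens()) {
                counts.merge(tokens.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // The same two expressions that pattern.txt holds: \. and \,
        List<String> skip = Arrays.asList("\\.", "\\,");
        count(skip,
              "I like NCHC Cloud Course.",
              "I like nchc Cloud Course, and we enjoy this course.")
            .forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

The backslashes matter: an unescaped `.` is a regular expression that matches any character, so a pattern file containing a bare dot would delete the whole line.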
 * Now run the same job case-insensitively, still skipping the patterns in pattern.txt
{{{
$ bin/hadoop jar wordcount2.jar WordCount2 -Dwordcount.case.sensitive=false lab4_input lab4_out3 -skip pattern.txt
$ bin/hadoop fs -cat lab4_out3/part-00000
}}}

 * Contents of lab4_out3. You should see results like this:
{{{
#!text
and     1
cloud   2
course  3
enjoy   1
i       2
like    2
nchc    2
this    1
we      1
}}}
     115}}}