Changes between Initial Version and Version 1 of FDC110829/Lab4


Timestamp: Aug 28, 2011, 11:02:43 PM
Author: jazz
[[PageOutline]]

◢ <[wiki:FDC110829/Lab3 Lab 3]> | <[wiki:FDC110829 Back to Course Outline]> ▲ | <[wiki:FDC110829/Lab5 Lab 5]> ◣

= Lab 4 =

{{{
#!html
<div style="text-align: center;"><big style="font-weight: bold;"><big>Compiling a Hadoop MapReduce Java Program</big></big></div>
}}}

= Practice 1 : Word Count (Basic) =

 * Upload the input data to HDFS

{{{
$ mkdir lab6_input
$ echo "I like NCTU Cloud Course." > lab6_input/input1
$ echo "I like nctu Cloud Course, and we enjoy this course." > lab6_input/input2
$ hadoop fs -put lab6_input lab6_input
$ hadoop fs -ls lab6_input
Found 2 items
-rw-r--r--   2 hXXXX supergroup         26 2011-04-19 10:07 /user/hXXXX/lab6_input/input1
-rw-r--r--   2 hXXXX supergroup         52 2011-04-19 10:07 /user/hXXXX/lab6_input/input2
}}}
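
The sizes in the listing (26 and 52 bytes) are the lengths of the two sentences plus the newline that `echo` appends. A minimal Java sketch to check this, assuming the text is plain ASCII; the class and method names are hypothetical and not part of the lab files:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: computes the size of a file written by `echo "<text>" > file`.
class InputSizeCheck {

    // echo appends one newline, so the file holds the text plus "\n".
    static int echoSize(String text) {
        return (text + "\n").getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(echoSize("I like NCTU Cloud Course."));
        System.out.println(echoSize("I like nctu Cloud Course, and we enjoy this course."));
    }
}
```

If the sizes reported by `hadoop fs -ls` differ, the file content probably differs from the sentences above (for example, extra spaces or a different line ending).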

 * Download [http://hadoop.nchc.org.tw/WordCount.java WordCount.java] and save it to your home directory
{{{
~$ wget http://hadoop.nchc.org.tw/WordCount.java
}}}

 * Compile WordCount.java and run it with the '''hadoop jar''' command

{{{
$ mkdir MyJava
$ ln -s /usr/lib/hadoop/hadoop-*-core.jar hadoop-core.jar
$ javac -classpath hadoop-core.jar -d MyJava WordCount.java
$ jar -cvf wordcount.jar -C MyJava .
$ hadoop jar wordcount.jar WordCount lab6_input/ lab6_out1/
$ hadoop fs -cat lab6_out1/part-00000
}}}

 * You should see results like this in lab6_out1:
{{{
#!text
Cloud   2
Course, 1
Course. 1
I       2
NCTU    1
and     1
course. 1
enjoy   1
like    2
nctu    1
this    1
we      1
}}}
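
The output keeps `Course,`, `Course.` and `course.` as separate keys because punctuation and letter case are part of each token. The sketch below reproduces that counting in plain Java, without Hadoop. It assumes that WordCount.java, like the stock Hadoop example, splits each line on whitespace with `StringTokenizer`; the class name `BasicWordCountSketch` is hypothetical.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Plain-Java sketch (no Hadoop dependency) of the counting done in Practice 1.
class BasicWordCountSketch {

    // "Map" step: split each line on whitespace.
    // "Reduce" step: add 1 to the running total of each token.
    static Map<String, Integer> count(String... lines) {
        // TreeMap keeps the keys sorted, like the sorted keys in part-00000.
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            StringTokenizer tokens = new StringTokenizer(line);
            while (tokens.hasMoreTokens()) {
                counts.merge(tokens.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count(
                "I like NCTU Cloud Course.",
                "I like nctu Cloud Course, and we enjoy this course.");
        counts.forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

For ASCII input, the `TreeMap` ordering puts uppercase letters before lowercase ones, which is why `Cloud` and `NCTU` come before `and` in the listing.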
-----

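= Practice 2 : Word Count (Advanced) =

 * Create a pattern file listing the punctuation to skip (period and comma), upload it to HDFS, and create a directory for the compiled classes

{{{
$ echo "\." >pattern.txt && echo "\," >>pattern.txt
$ hadoop fs -put pattern.txt .
$ mkdir -p MyJava2
}}}
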
 * Download [http://hadoop.nchc.org.tw/WordCount2.java WordCount2.java] and save it to your home directory
{{{
~$ wget http://hadoop.nchc.org.tw/WordCount2.java
}}}

 * Compile WordCount2.java and run it with the '''-skip''' option, passing pattern.txt
{{{
$ javac -classpath hadoop-core.jar -d MyJava2 WordCount2.java
$ jar -cvf wordcount2.jar -C MyJava2 .
$ hadoop jar wordcount2.jar WordCount2 lab6_input lab6_out2 -skip pattern.txt
$ hadoop fs -cat lab6_out2/part-00000
}}}

 * You should see results like this in lab6_out2:
{{{
#!text
Cloud   2
Course  2
I       2
NCTU    1
and     1
course  1
enjoy   1
like    2
nctu    1
this    1
we      1
}}}
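
With the skip patterns applied, `Course,` and `Course.` collapse into a single `Course` key. A minimal plain-Java sketch of that step follows. It assumes that WordCount2.java treats each line of pattern.txt as a regular expression and deletes every match before tokenizing, as the WordCount v2.0 example in the Hadoop tutorial does; the class name `SkipPatternSketch` is hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Plain-Java sketch (no Hadoop dependency) of counting with skip patterns.
class SkipPatternSketch {

    static Map<String, Integer> count(List<String> skipPatterns, String... lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            // Assumption: every regex match is removed before the line is split.
            for (String pattern : skipPatterns) {
                line = line.replaceAll(pattern, "");
            }
            StringTokenizer tokens = new StringTokenizer(line);
            while (tokens.hasMoreTokens()) {
                counts.merge(tokens.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Same two regexes as pattern.txt: an escaped period and an escaped comma.
        List<String> patterns = Arrays.asList("\\.", "\\,");
        Map<String, Integer> counts = count(patterns,
                "I like NCTU Cloud Course.",
                "I like nctu Cloud Course, and we enjoy this course.");
        counts.forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

The period has to be escaped in pattern.txt because an unescaped `.` is a regex wildcard and would match every character.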

 * Run the same job again, this time case-insensitively and still skipping the patterns in pattern.txt
{{{
$ hadoop jar wordcount2.jar WordCount2 -Dwordcount.case.sensitive=false lab6_input lab6_out3 -skip pattern.txt
$ hadoop fs -cat lab6_out3/part-00000
}}}

 * You should see results like this in lab6_out3:
{{{
#!text
and     1
cloud   2
course  3
enjoy   1
i       2
like    2
nctu    2
this    1
we      1
}}}
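
Setting `wordcount.case.sensitive=false` merges `NCTU` with `nctu` and `Course` with `course`, which is why the listing shrinks to nine keys. The sketch below illustrates one way this can work in plain Java. It assumes that WordCount2.java lowercases each line before removing the skip patterns, as the WordCount v2.0 example in the Hadoop tutorial does; the class name `CaseInsensitiveCountSketch` and its parameters are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Plain-Java sketch (no Hadoop dependency) of the case-insensitive run.
class CaseInsensitiveCountSketch {

    static Map<String, Integer> count(boolean caseSensitive,
                                      List<String> skipPatterns,
                                      String... lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            // Assumption: lowercasing happens first, then pattern removal.
            String text = caseSensitive ? line : line.toLowerCase(Locale.ROOT);
            for (String pattern : skipPatterns) {
                text = text.replaceAll(pattern, "");
            }
            StringTokenizer tokens = new StringTokenizer(text);
            while (tokens.hasMoreTokens()) {
                counts.merge(tokens.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> patterns = Arrays.asList("\\.", "\\,");
        Map<String, Integer> counts = count(false, patterns,
                "I like NCTU Cloud Course.",
                "I like nctu Cloud Course, and we enjoy this course.");
        counts.forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

Passing `true` for the first argument gives the case-sensitive counts of the previous run, so the two modes can be compared side by side.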