Changes between Initial Version and Version 1 of III110813/Lab8


Timestamp:
Oct 21, 2011, 2:50:49 PM (13 years ago)
Author:
jazz

◢ <[wiki:III110813/Lab7 Lab 7]> | <[wiki:III110813 Back to course outline]> ▲ | <[wiki:III110813/Lab9 Lab 9]> ◣

= Lab 8 =
[[PageOutline]]
{{{
#!html
<div style="text-align: center;"><big style="font-weight: bold;"><big>Compiling Hadoop MapReduce Java Programs in Fully Distributed Mode on a Hadoop Cluster</big></big></div>
}}}

= Practice 1 : Word Count (Basic) =

 * Upload the input data to HDFS
{{{
$ mkdir lab6_input
$ echo "I like NCTU Cloud Course." > lab6_input/input1
$ echo "I like nctu Cloud Course, and we enjoy this course." > lab6_input/input2
$ hadoop fs -put lab6_input lab6_input
$ hadoop fs -ls lab6_input
Found 2 items
-rw-r--r--   2 hXXXX supergroup         26 2011-04-19 10:07 /user/hXXXX/lab6_input/input1
-rw-r--r--   2 hXXXX supergroup         52 2011-04-19 10:07 /user/hXXXX/lab6_input/input2
}}}
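
The sizes in the listing (26 and 52) are byte counts that include the newline appended by `echo`. A quick local check, assuming a POSIX shell with `wc` available:

```shell
# echo appends a trailing newline, so each file is one byte longer than its visible text.
echo "I like NCTU Cloud Course." | wc -c
echo "I like nctu Cloud Course, and we enjoy this course." | wc -c
```

The two commands should report 26 and 52, matching the size column printed by `hadoop fs -ls`.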

 * Download [http://hadoop.nchc.org.tw/WordCount.java WordCount.java] and save it to your home directory
{{{
~$ wget http://hadoop.nchc.org.tw/WordCount.java
}}}

 * Compile WordCount.java and run it with the '''hadoop jar''' command

{{{
$ mkdir MyJava
$ ln -s /usr/lib/hadoop/hadoop-*-core.jar hadoop-core.jar
$ javac -classpath hadoop-core.jar -d MyJava WordCount.java
$ jar -cvf wordcount.jar -C MyJava .
$ hadoop jar wordcount.jar WordCount lab6_input/ lab6_out1/
$ hadoop fs -cat lab6_out1/part-00000
}}}

 * Results in lab6_out1. You should see output like this:
{{{
#!text
Cloud   2
Course, 1
Course. 1
I       2
NCTU    1
and     1
course. 1
enjoy   1
like    2
nctu    1
this    1
we      1
}}}
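
To see why `Course,`, `Course.` and `course.` are counted as three different words, the job can be approximated locally. This is only a sketch: it assumes WordCount.java splits each line on whitespace and counts exact tokens, which is what the output above suggests. `wordcount_local` is a hypothetical helper name, not part of Hadoop.

```shell
# Hypothetical local stand-in for the WordCount job
# (assumption: tokens are split on whitespace and compared byte for byte).
wordcount_local() {
  # Put one token per line, drop empty lines, sort by byte value,
  # count duplicates, then print "word<TAB>count".
  cat "$@" \
    | tr -s '[:space:]' '\n' \
    | grep -v '^$' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{ printf "%s\t%s\n", $2, $1 }'
}

# Usage, from the directory where lab6_input was created:
# wordcount_local lab6_input/input1 lab6_input/input2
```

`LC_ALL=C` forces a byte-order sort, so uppercase words come before lowercase ones, the same order as in part-00000.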
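-----

= Practice 2 : Word Count (Advanced) =

 * Create a pattern file containing the patterns to skip (an escaped period and an escaped comma), upload it to HDFS, and create a directory for the compiled classes
{{{
$ echo "\." > pattern.txt && echo "\," >> pattern.txt
$ hadoop fs -put pattern.txt .
$ mkdir -p MyJava2
}}}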

 * Download [http://hadoop.nchc.org.tw/WordCount2.java WordCount2.java] and save it to your home directory
{{{
~$ wget http://hadoop.nchc.org.tw/WordCount2.java
}}}

 * Compile WordCount2.java and run it with '''-skip pattern.txt''' so that the listed patterns are stripped from the input
{{{
$ javac -classpath hadoop-core.jar -d MyJava2 WordCount2.java
$ jar -cvf wordcount2.jar -C MyJava2 .
$ hadoop jar wordcount2.jar WordCount2 lab6_input lab6_out2 -skip pattern.txt
$ hadoop fs -cat lab6_out2/part-00000
}}}

 * Results in lab6_out2. You should see output like this:
{{{
#!text
Cloud   2
Course  2
I       2
NCTU    1
and     1
course  1
enjoy   1
like    2
nctu    1
this    1
we      1
}}}
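
The effect of `-skip pattern.txt` can be approximated locally. The sketch assumes WordCount2 deletes every match of each pattern and then splits on whitespace, as the output above implies. For simplicity it hard-codes the two characters listed in pattern.txt instead of reading the file. `wordcount_skip` is a hypothetical name.

```shell
# Hypothetical local stand-in for "WordCount2 ... -skip pattern.txt"
# (assumption: matches of the skip patterns are deleted before tokenizing).
wordcount_skip() {
  # Delete periods and commas, put one token per line, drop empty lines,
  # sort by byte value, count duplicates, then print "word<TAB>count".
  cat "$@" \
    | tr -d '.,' \
    | tr -s '[:space:]' '\n' \
    | grep -v '^$' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{ printf "%s\t%s\n", $2, $1 }'
}

# Usage, from the directory where lab6_input was created:
# wordcount_skip lab6_input/input1 lab6_input/input2
```

With the punctuation gone, `Course,` and `Course.` collapse into a single `Course` entry with a count of 2, while `course` stays separate because matching is still case sensitive.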

 * Now run the same example with case-insensitive counting in addition to the skip patterns
{{{
$ hadoop jar wordcount2.jar WordCount2 -Dwordcount.case.sensitive=false lab6_input lab6_out3 -skip pattern.txt
$ hadoop fs -cat lab6_out3/part-00000
}}}

 * Results in lab6_out3. You should see output like this:
{{{
#!text
and     1
cloud   2
course  3
enjoy   1
i       2
like    2
nctu    2
this    1
we      1
}}}
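
With `-Dwordcount.case.sensitive=false`, `NCTU` merges with `nctu` and `Course` with `course`. A local sketch of that behaviour, assuming WordCount2 lowercases each line and removes the skip patterns before splitting on whitespace (for these two patterns the order of the two steps does not change the result). `wordcount_nocase` is a hypothetical name, and the skip characters are hard-coded rather than read from pattern.txt.

```shell
# Hypothetical local stand-in for
# "WordCount2 -Dwordcount.case.sensitive=false ... -skip pattern.txt"
# (assumption: input is lowercased and skip patterns are deleted before tokenizing).
wordcount_nocase() {
  # Lowercase everything, delete periods and commas, put one token per line,
  # drop empty lines, sort, count duplicates, then print "word<TAB>count".
  cat "$@" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -d '.,' \
    | tr -s '[:space:]' '\n' \
    | grep -v '^$' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{ printf "%s\t%s\n", $2, $1 }'
}

# Usage, from the directory where lab6_input was created:
# wordcount_nocase lab6_input/input1 lab6_input/input2
```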