Changes between Version 18 and Version 19 of YMU110509

Timestamp:: May 30, 2011, 3:39:47 PM (14 years ago)
Author:: jazz
Comment:: --

YMU110509

-                      v18
+                      v19
  * http://hadoop.nchc.org.tw/hadoop-doc/api/index.html - Hadoop 0.20.2 javadoc documentation
  * http://forum.hadoop.tw - Taiwan Hadoop user forum
+= Homework 1 =
+ * Task: Modify the WordCount2.java from [wiki:NCTU110329/Lab6 Lab 6] into a reverse index program and rename it !ReverseIndex.java. !ReverseIndex should output each record as "keyword <TAB> file names (separated by commas)". Run it the same way as the final step of Lab 6: skip the period (\.) and comma (\,) patterns and ignore case (case.sensitive=false).
+ * Reference steps:
+{{{
+#!sh
+$ wget http://hadoop.nchc.org.tw/WordCount2.java -O ReverseIndex.java
+$ vi ReverseIndex.java   # make your modifications to the code here
+$ mkdir -p MyJava3
+$ javac -classpath hadoop-core.jar -d MyJava3 ReverseIndex.java
+$ jar -cvf reverseindex.jar -C MyJava3 .
+$ hadoop jar reverseindex.jar ReverseIndex -Dwordcount.case.sensitive=false lab6_input lab6_out4 -skip pattern.txt
+$ hadoop fs -cat lab6_out4/part-00000
+}}}
+ * The reference result should look like the following (the path portion of each file name may take any form):
+{{{
+and     input2
+cloud   input1,input2
+course  input1,input2,input2
+enjoy   input2
+i       input1,input2
+like    input1,input2
+nctu    input1,input2
+this    input2
+we      input2
+}}}
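The grouping behind the reference result above can be sketched in plain Java, without any Hadoop dependency. This is only an illustration of the data flow, not the WordCount2.java-based solution. The class `ReverseIndexSketch`, its methods, and the two sample sentences are assumptions, chosen so that the output matches the reference result; the actual contents of lab6_input are not shown on this page.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, Hadoop-free sketch of the reverse index data flow.
class ReverseIndexSketch {

    // "Map" step: emit one (word, fileName) pair per token, after removing
    // periods and commas and lower-casing the line.
    static List<String[]> map(String fileName, String line) {
        List<String[]> pairs = new ArrayList<>();
        String cleaned = line.replaceAll("[.,]", "").toLowerCase();
        for (String word : cleaned.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new String[] {word, fileName});
            }
        }
        return pairs;
    }

    // "Shuffle" and "reduce" steps: group file names by word (sorted by
    // word), then join each group with commas.
    static Map<String, String> buildIndex(Map<String, String> files) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (Map.Entry<String, String> file : files.entrySet()) {
            for (String[] pair : map(file.getKey(), file.getValue())) {
                grouped.computeIfAbsent(pair[0], k -> new ArrayList<>()).add(pair[1]);
            }
        }
        Map<String, String> index = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : grouped.entrySet()) {
            index.put(e.getKey(), String.join(",", e.getValue()));
        }
        return index;
    }

    public static void main(String[] args) {
        // Assumed sample inputs; the real lab6_input files may differ.
        Map<String, String> files = new LinkedHashMap<>();
        files.put("input1", "I like NCTU Cloud Course.");
        files.put("input2", "I like NCTU Cloud Course, and we enjoy this course.");
        for (Map.Entry<String, String> e : buildIndex(files).entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

Because every occurrence of a word emits its own pair, a word that appears twice in one file lists that file twice, as in the "course" line of the reference result. In a real MapReduce job the order in which values reach the reducer is not guaranteed, so the order of file names within a line may vary.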
+ * Due date: 11:59 AM, Monday, June 13, 2011
+ * Submission: e-mail the Java source code and the report as attachments to jazz _AT_ nchc _DOT_ org _DOT_ tw: (1) one copy of the source code, compressed and named ${student ID}.zip; (2) one report (doc or PDF), named ${student ID}.
+ * Hint:
+  * Change the (Key,Value) types of the Mapper output and of the Reducer input and output from (Text, !IntWritable) to (Text, Text).
+ * Extra credit:
+  * Add the number of occurrences in each file to the result, so that the reference result becomes:
+{{{
+and     input2(1)
+cloud   input1(1),input2(1)
+course  input1(1),input2(2)
+enjoy   input2(1)
+i       input1(1),input2(1)
+like    input1(1),input2(1)
+nctu    input1(1),input2(1)
+this    input2(1)
+we      input2(1)
+}}}
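One way to picture the extra-credit variant is to count occurrences per (word, file) before formatting each line. The sketch below is plain Java under the same assumptions as the task description (periods and commas removed, case ignored). The class `CountedReverseIndexSketch` and the sample sentences are hypothetical and stand in for the real Hadoop job and lab6_input files.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeMap;

// Hypothetical, Hadoop-free sketch of a reverse index with per-file counts.
class CountedReverseIndexSketch {

    static Map<String, String> buildIndex(Map<String, String> files) {
        // word -> (fileName -> number of occurrences), both levels sorted.
        Map<String, Map<String, Integer>> counts = new TreeMap<>();
        for (Map.Entry<String, String> file : files.entrySet()) {
            String cleaned = file.getValue().replaceAll("[.,]", "").toLowerCase();
            for (String word : cleaned.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue;
                }
                counts.computeIfAbsent(word, k -> new TreeMap<>())
                      .merge(file.getKey(), 1, Integer::sum);
            }
        }
        // Format each word's entries as "file(count)" joined by commas.
        Map<String, String> index = new TreeMap<>();
        for (Map.Entry<String, Map<String, Integer>> e : counts.entrySet()) {
            StringJoiner joined = new StringJoiner(",");
            for (Map.Entry<String, Integer> f : e.getValue().entrySet()) {
                joined.add(f.getKey() + "(" + f.getValue() + ")");
            }
            index.put(e.getKey(), joined.toString());
        }
        return index;
    }

    public static void main(String[] args) {
        // Assumed sample inputs; the real lab6_input files may differ.
        Map<String, String> files = new LinkedHashMap<>();
        files.put("input1", "I like NCTU Cloud Course.");
        files.put("input2", "I like NCTU Cloud Course, and we enjoy this course.");
        for (Map.Entry<String, String> e : buildIndex(files).entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

In an actual job the counting would most likely happen in the reducer, which sees all file names emitted for one keyword; that placement is a design suggestion, not something this page prescribes.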
+ * Grading:
+  * Standard task source code: 60%
+  * Report: 20%
+    * The report should include the following items:
+    * Cover page: your name and student ID
+    * Screenshot of your program running on hadoop.nchc.org.tw
+    * The result of your program
+  * Extra credit: 20%