{{{
#!html
<div style="text-align: center;"><big
style="font-weight: bold;"><big><big>Hadoop Streaming</big></big></big></div>
}}}
 * Hadoop Streaming is a utility that ships with Hadoop. It lets users create and run a special kind of map/reduce job in which an executable or a script acts as the mapper or the reducer.

= Implementing MapReduce with the shell =
 * The simplest streaming map/reduce job, run from the shell:
{{{
$ bin/hadoop jar hadoop-0.18.3-streaming.jar -input input -output stream-output1 -mapper /bin/cat -reducer /usr/bin/wc
}}}
 * The output is shown below (line count, word count, and byte count, in the order that wc prints them):
{{{
#!sh
2910628 24507806 143451003
}}}
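To make the three numbers easier to interpret, the job above can be approximated in a single local process. This is only a sketch: `identity_mapper`, `wc_reducer`, and `run_local` are hypothetical names introduced here, the sort is a rough stand-in for Hadoop's shuffle phase, and a single reducer is assumed. It does not reproduce how Hadoop Streaming splits each line into a key and a value.

```python
def identity_mapper(lines):
    """Stand-in for /bin/cat: emit every input line unchanged."""
    for line in lines:
        yield line


def wc_reducer(lines):
    """Rough stand-in for /usr/bin/wc: count lines, words, and bytes."""
    n_lines = n_words = n_bytes = 0
    for line in lines:
        n_lines += 1                          # assumes each line ends with a newline
        n_words += len(line.split())          # whitespace-separated words
        n_bytes += len(line.encode("utf-8"))  # bytes, not characters
    return n_lines, n_words, n_bytes


def run_local(lines):
    """Map, sort (approximating the shuffle), then reduce, all in one process."""
    mapped = sorted(identity_mapper(lines))
    return wc_reducer(mapped)


if __name__ == "__main__":
    sample = ["hello hadoop\n", "hello streaming world\n"]
    print(*run_local(sample))
```

Because the mapper passes every line through untouched, the reducer simply sees the whole input, so the result is what wc would report for the input as a whole.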
= Implementing MapReduce with PHP =
 * [http://www.hadoop.tw/2008/09/php-hadoop.html Developing Hadoop programs on a "single machine" with "PHP"] from Hadoop Taiwan User Group

= Implementing MapReduce with Python =
 * [http://www.cs.brandeis.edu/~cs147a/lab/hadoop-example/ Hadoop Example Program] from Brandeis University
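As a minimal illustration of what a Python streaming script can look like, the sketch below counts words. It is not taken from the linked example. It relies on the streaming convention that a mapper reads lines from standard input and writes `key<TAB>value` lines, and that the reducer receives those lines sorted by key. The file name `wordcount.py`, the `map`/`reduce` command-line argument, and the function names are assumptions made for this example, and Python 3 is assumed.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit "word<TAB>1" for every whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts of consecutive lines that share a key.

    Relies on the input being sorted by key, as the shuffle phase provides.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical convention: the first argument selects the role.
    step = {"map": map_lines, "reduce": reduce_lines}.get(sys.argv[1])
    if step is not None:
        for out in step(sys.stdin):
            print(out)
```

The script can be tried without a cluster by piping through sort, for example `cat input/* | python3 wordcount.py map | sort | python3 wordcount.py reduce`. On a cluster, the two roles would presumably be passed to the streaming jar through `-mapper` and `-reducer`, in the same way as `/bin/cat` and `/usr/bin/wc` above, with the script made available on the worker nodes.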