{{{
#!html
<div style="text-align: center;"><big
style="font-weight: bold;"><big><big>Lab 1: Hadoop Streaming Hands-on Practice</big></big></big></div>
}}}
[[PageOutline]]
[wiki:NCHCCloudCourse110721 Back to the course outline]
== Preface ==

 * This lab is based on Ubuntu 10.04 LTS (Lucid).
 * '''White-on-black text is a command or console output.''' Copy only the part after the prompt symbol "$" (regular user) or "#" (root administrator). For example, given:
{{{
/home/DIR$ Copy_Command From To ...
}}}
copy the command ''' Copy_Command From To ... ''' and paste it into your console to run it. (/home/DIR stands for the current working directory.)
 * '''Black-on-white text is the content of a file''', e.g.
{{{
#!sh
I am context.
}}}
If you are familiar with an editor such as vi, nano, or joe, you can copy this content and paste it into the file (although the commands on this page have already been simplified).
| 22 | |
== Hadoop Streaming with commands ==

 * Prepare the input files
{{{
hadoop@lucid:~$ cd /opt/hadoop
hadoop@lucid:/opt/hadoop$ mkdir ./input; cp README.txt ./input/;
}}}
| 30 | |
 * Example 1: use cat as the mapper and wc as the reducer
{{{
hadoop@lucid:/opt/hadoop$ bin/hadoop jar ./contrib/streaming/hadoop-0.20.2-streaming.jar -input input -output output -mapper /bin/cat -reducer /usr/bin/wc
hadoop@lucid:/opt/hadoop$ cat output/part-00000
}}}
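Conceptually, a streaming job behaves like the shell pipeline `cat <input> | mapper | sort | reducer`: Hadoop feeds the input to the mapper on stdin, sorts the intermediate records, and pipes them into the reducer. The dry run below (a local sketch with made-up sample input, not part of the lab itself) mimics what Example 1 does to the data:

```shell
# Simulate the streaming data flow of Example 1 locally.
# map:     /bin/cat passes each input line through unchanged
# shuffle: sort orders the intermediate records
# reduce:  /usr/bin/wc reports line, word, and byte counts
printf 'hello world\nhello hadoop\n' | /bin/cat | sort | /usr/bin/wc
```

Here wc sees the whole (sorted) input as a single stream, which is why the job's output in part-00000 is one summary line of counts.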
 * Example 2: use Bash shell scripts as the mapper and the reducer
{{{
hadoop@lucid:/opt/hadoop$ echo "sed -e \"s/ /\n/g\" | grep ." > streamingMapper.sh
hadoop@lucid:/opt/hadoop$ echo "uniq -c | awk '{print \$2 \"\t\" \$1}'" > streamingReducer.sh
hadoop@lucid:/opt/hadoop$ chmod a+x streamingMapper.sh
hadoop@lucid:/opt/hadoop$ chmod a+x streamingReducer.sh
hadoop@lucid:/opt/hadoop$ bin/hadoop jar ./contrib/streaming/hadoop-0.20.2-streaming.jar -input input -output output -mapper streamingMapper.sh -reducer streamingReducer.sh -file streamingMapper.sh -file streamingReducer.sh
}}}
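The two scripts together implement a word count: the mapper puts each word on its own line (dropping empty lines), and the reducer counts how many times each word appears. You can verify the logic locally before submitting the job; the sample input below is illustrative only:

```shell
# Local dry run of the word-count pipeline built by the two scripts.
# mapper:  sed splits each line into one word per line; grep . drops empty lines
# shuffle: sort groups identical words together
# reducer: uniq -c counts each group; awk reorders the output to "word<TAB>count"
printf 'hello hadoop hello\n' | sed -e "s/ /\n/g" | grep . | sort | uniq -c | awk '{print $2 "\t" $1}'
```

This prints `hadoop` with count 1 and `hello` with count 2, matching what the streaming job writes to part-00000 for the same input.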
 * View the results
{{{
hadoop@lucid:/opt/hadoop$ bin/hadoop fs -cat output/part-00000
}}}