◢ <[wiki:Hinet120814/Lab8 Lab 8]> | <[wiki:Hinet120814 Back to Course Outline]> ▲ | <[wiki:Hinet120814/Lab10 Lab 10]> ◣
= Lab 9 =
[[PageOutline]]
{{{
#!html
Hadoop Streaming Exercises in Different Languages
}}}
{{{
#!text
For the following exercises, connect to hadoop.nchc.org.tw. In the commands below, hXXXX stands for your user name.
}}}
== Using Existing Binaries ==
{{{
~$ hadoop fs -put /etc/hadoop/conf lab9_input
~$ hadoop jar hadoop-streaming.jar -input lab9_input -output lab9_out1 -mapper /bin/cat -reducer /usr/bin/wc
~$ hadoop fs -cat lab9_out1/part-00000
}}}
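The job above pipes each input line through `/bin/cat`, sorts the mapper output, and hands it to `/usr/bin/wc`. A rough local approximation of that data flow is sketched below. The function name `simulate_streaming` and the sample input are hypothetical, and the sketch ignores partitioning across reducers and Hadoop's key/value tab handling, so counts on the cluster may differ.

```shell
# Hypothetical helper: approximate a streaming job on local data.
# Usage: simulate_streaming MAPPER_COMMAND REDUCER_COMMAND < input
simulate_streaming() {
  local mapper="$1" reducer="$2"
  # mapper reads stdin, its output is sorted, then fed to the reducer
  sh -c "$mapper" | LC_ALL=C sort | sh -c "$reducer"
}

# Same mapper/reducer pair as the job above, on two made-up lines
printf 'hello hadoop\nhello streaming\n' | simulate_streaming cat wc
```

`wc` prints the line, word, and byte counts of whatever reaches the reducer, which is why `part-00000` contains three numbers rather than the original text.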
== Using Bash Shell Scripts ==
{{{
~$ echo "sed -e \"s/ /\n/g\" | grep ." > streamingMapper.sh
~$ echo "uniq -c | awk '{print \$2 \"\t\" \$1}'" > streamingReducer.sh
~$ chmod a+x streamingMapper.sh
~$ chmod a+x streamingReducer.sh
~$ hadoop jar hadoop-streaming.jar -input lab9_input -output lab9_out2 -mapper streamingMapper.sh -reducer streamingReducer.sh -file streamingMapper.sh -file streamingReducer.sh
~$ hadoop fs -cat lab9_out2/part-00000
}}}
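Before submitting a job, the mapper and reducer logic can be tried locally by placing `sort` between them, since the reducer relies on `uniq -c` seeing identical words on adjacent lines. The sketch below is an assumption-laden stand-in: `wc_mapper` and `wc_reducer` are hypothetical names, and `tr` replaces the `sed` expression so the example does not depend on GNU `sed` interpreting `\n`.

```shell
# Hypothetical local equivalents of streamingMapper.sh and streamingReducer.sh
wc_mapper() {
  # one word per line, dropping empty lines
  tr ' ' '\n' | grep .
}

wc_reducer() {
  # count adjacent duplicates, then print "word<TAB>count"
  uniq -c | awk '{print $2 "\t" $1}'
}

# sort stands in for the shuffle phase between map and reduce
printf 'b a\na c a\n' | wc_mapper | LC_ALL=C sort | wc_reducer
```

Without the `sort` step, `uniq -c` would report the same word several times, once per run of adjacent occurrences.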
== Using PHP Scripts ==
* Create the PHP mapper script
{{{
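~$ cat > mapper.php << EOF
#!/usr/bin/php
<?php
\$word2count = array();
// Read STDIN line by line and split each line into lowercase words
while ((\$line = fgets(STDIN)) !== false) {
  \$words = preg_split('/\W/', strtolower(trim(\$line)), 0, PREG_SPLIT_NO_EMPTY);
  foreach (\$words as \$word) {
    \$word2count[\$word] = (isset(\$word2count[\$word]) ? \$word2count[\$word] : 0) + 1;
  }
}
foreach (\$word2count as \$word => \$count) {
  // Print [word, tab character, count, end-of-line]
  echo \$word, chr(9), \$count, PHP_EOL;
}
?>
EOF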
}}}
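The backslashes before each `$` in the here-document above are needed because an unquoted `EOF` delimiter lets the shell expand variables before the text is written to the file. The sketch below illustrates the difference with a made-up variable `count`; quoting the delimiter is an alternative, not what this lab uses.

```shell
count=5

# Unquoted delimiter: $count is expanded unless escaped with a backslash
unquoted=$(cat << EOF
\$count is $count
EOF
)

# Quoted delimiter: the body is written literally, no escaping needed
quoted=$(cat << 'EOF'
$count stays literal
EOF
)

echo "$unquoted"
echo "$quoted"
```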
* Create the PHP reducer script
{{{
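~$ cat > reducer.php << EOF
#!/usr/bin/php
<?php
\$word2count = array();
// Read the mapper output from STDIN (standard input)
while ((\$line = fgets(STDIN)) !== false) {
  // Split each line into word and count at the tab; pad if the count is missing
  list(\$word, \$count) = array_pad(explode(chr(9), trim(\$line), 2), 2, '0');
  // Convert the count from string to int
  \$count = intval(\$count);
  // Accumulate the total, initializing the counter on first use
  if (\$count > 0) \$word2count[\$word] = (isset(\$word2count[\$word]) ? \$word2count[\$word] : 0) + \$count;
}
// Optional: sort by word so the output is easier to read
ksort(\$word2count);
// Write the results to STDOUT (standard output)
foreach (\$word2count as \$word => \$count) {
  echo \$word, chr(9), \$count, PHP_EOL;
}
?>
EOF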
}}}
* Make the scripts executable
{{{
~$ chmod a+x *.php
}}}
* Test the scripts locally
{{{
~$ echo "i love hadoop, hadoop love u" | ./mapper.php | ./reducer.php
}}}
* Run the streaming job
{{{
~$ hadoop jar hadoop-streaming.jar -mapper mapper.php -reducer reducer.php -input lab9_input -output lab9_out3 -file mapper.php -file reducer.php
}}}
* Check the results
{{{
~$ hadoop fs -cat lab9_out3/part-00000
}}}
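The output is one `word<TAB>count` pair per line, sorted by word. To see the most frequent words first, the lines can be re-sorted by the count column on the client side. The helper name `top_words` and the sample data below are hypothetical; this is only a sketch of one way to post-process the result.

```shell
# Hypothetical helper: print the N most frequent words from "word<TAB>count" lines
top_words() {
  n="${1:-10}"
  tab="$(printf '\t')"
  # sort by count (descending, numeric), then by word to break ties
  LC_ALL=C sort -t "$tab" -k2,2nr -k1,1 | head -n "$n"
}

# Made-up sample in the same format as part-00000
printf 'hadoop\t2\ni\t1\nlove\t2\nu\t1\n' | top_words 2
```

On the cluster, the same function could be fed from `hadoop fs -cat lab9_out3/part-00000 | top_words 10`.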