Version 4 (modified by waue, 13 years ago)
HBase Advanced Course
1. Streaming
Stream Examples
Hadoop Streaming with commands
- Example 1: use cat as the mapper and wc as the reducer
  $ cd /opt/hadoop
  $ bin/start-all.sh
  $ bin/hadoop fs -put conf input
  $ bin/hadoop jar ./contrib/streaming/hadoop-0.20.2-streaming.jar -input input -output output -mapper /bin/cat -reducer /usr/bin/wc
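Before submitting the job, it can help to approximate Example 1 locally with a plain pipeline. Streaming feeds each input line to the mapper on stdin, sorts the mapper output by key, and feeds the sorted lines to the reducer on stdin, so `mapper | sort | reducer` is a rough stand-in. The sketch below is only an illustration: the scratch directory and `sample.txt` are hypothetical and are not part of the course material.

```shell
# Rough local stand-in for the streaming job: mapper | sort | reducer.
# The scratch directory and sample.txt are hypothetical, created only for this sketch.
workdir=$(mktemp -d)
printf 'hello hadoop\nhello streaming\n' > "$workdir/sample.txt"

mapper=/bin/cat        # same mapper as in Example 1
reducer=/usr/bin/wc    # same reducer as in Example 1

# LC_ALL=C gives a plain byte-order sort, standing in for the shuffle/sort phase.
result=$("$mapper" < "$workdir/sample.txt" | LC_ALL=C sort | "$reducer")

# wc prints the line, word, and byte counts of everything it received.
echo "$result"
```

For this two-line sample, wc reports 2 lines, 4 words, and 29 bytes. The counts from the cluster job will differ, since its input is the whole conf directory and Streaming handles key/value separators itself.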
- Example 2: use Bash shell scripts as the mapper and reducer
  $ echo "sed -e \"s/ /\n/g\" | grep ." > streamingMapper.sh
  $ echo "uniq -c | awk '{print \$2 \"\t\" \$1}'" > streamingReducer.sh
  $ chmod 755 streaming*.sh
  $ bin/hadoop fs -rmr input output
  $ bin/hadoop fs -put conf input
  $ bin/hadoop jar ./contrib/streaming/hadoop-0.20.2-streaming.jar -input input -output output -mapper streamingMapper.sh -reducer streamingReducer.sh -file streamingMapper.sh -file streamingReducer.sh
- Result
  $ bin/hadoop dfs -cat output/part-00000
  restriction,	1
  rights	1
  sell	1
  shall	2
  so,	1
  software	3
  source	3
  subject	1
  sublicense,	1
  substantial	1
  the	10
  this	4
  to	7
  use,	1
  used	1
  whom	1
  without	2
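The two scripts from Example 2 can also be checked without a cluster. The sketch below recreates them in a scratch directory and runs them as `mapper | sort | reducer`, which approximates what Streaming does between the map and reduce phases. It makes a few assumptions that are not in the original: the scripts are written with quoted here-documents instead of `echo`, so the backslashes survive in shells whose `echo` interprets `\n`; GNU sed is assumed, since it treats `\n` in the replacement as a newline; and `sample.txt` is a hypothetical input.

```shell
# Work in a scratch directory so nothing in the Hadoop tree is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Mapper: put every word on its own line and drop empty lines.
# (Assumes GNU sed, which reads "\n" in the replacement as a newline.)
cat > streamingMapper.sh <<'EOF'
sed -e "s/ /\n/g" | grep .
EOF

# Reducer: count runs of identical adjacent lines, then print "word<TAB>count".
# uniq -c only works here because the input arrives sorted.
cat > streamingReducer.sh <<'EOF'
uniq -c | awk '{print $2 "\t" $1}'
EOF

# Hypothetical sample input, used only for this dry run.
printf 'to be or not to be\n' > sample.txt

# sort stands in for the shuffle/sort phase between map and reduce.
result=$(sh streamingMapper.sh < sample.txt | LC_ALL=C sort | sh streamingReducer.sh)
echo "$result"
```

The dry run prints one tab-separated `word count` pair per line, in the same shape as the part-00000 listing above. Because the mapper splits on spaces only, punctuation stays attached to words, which is why entries such as `so,` and `use,` appear in the job output.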