Version 1 (modified by jazz, 15 years ago)
2010-08-04
Hadoop : Streaming
- [Example] Using gzip as the input format
- Only three compression formats are currently supported; see org.apache.hadoop.io.compress.CompressionCodec
hadoop dfs -rmr $4
hadoop jar /usr/local/share/hadoop/contrib/streaming/hadoop-*-streaming.jar \
  -mapper $1 -reducer $2 -input $3/* -output $4 -file $1 -file $2 \
  -jobconf mapred.job.name="$5" \
  -jobconf stream.recordreader.compression=gzip \
  -jobconf mapred.output.compress=true \
  -jobconf mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec
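Before submitting to the cluster, the mapper and reducer can be checked locally against a gzip file. The sketch below only imitates the streaming data flow (decompress, map, sort, reduce) with ordinary pipes; it does not call Hadoop. The scripts mapper.sh and reducer.sh and the sample input are hypothetical stand-ins for the $1, $2 and $3 arguments of the command above, and the word count logic is just an illustration.

```shell
#!/bin/sh
# Local dry run of a streaming-style job on gzip input (no cluster needed).
# mapper.sh, reducer.sh and input.gz are hypothetical stand-ins for $1, $2, $3.
set -e
work=$(mktemp -d)

# Mapper: emit "word<TAB>1" for every whitespace-separated word.
cat > "$work/mapper.sh" <<'EOF'
#!/bin/sh
tr -s ' \t' '\n' | awk 'NF { print $1 "\t1" }'
EOF

# Reducer: sum the counts per key, then sort the keys for stable output.
cat > "$work/reducer.sh" <<'EOF'
#!/bin/sh
awk -F '\t' '{ n[$1] += $2 } END { for (k in n) print k "\t" n[k] }' | sort
EOF
chmod +x "$work/mapper.sh" "$work/reducer.sh"

# Sample gzip-compressed input.
printf 'foo bar\nbar baz bar\n' | gzip -c > "$work/input.gz"

# Decompress, map, shuffle (sort), reduce.
result=$(gzip -dc "$work/input.gz" | "$work/mapper.sh" | sort | "$work/reducer.sh")
printf '%s\n' "$result"

rm -rf "$work"
```

If the local pipeline gives the expected counts, the same two scripts can be passed as the -mapper and -reducer arguments, with the .gz files uploaded to the HDFS input directory.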