
Changes between Initial Version and Version 1 of jazz/10-08-04


Timestamp: Aug 4, 2010, 11:26:31 AM
Author: jazz

  • jazz/10-08-04

= 2010-08-04 =

== Hadoop: Streaming ==

 * [Example] [http://www.mail-archive.com/common-user@hadoop.apache.org/msg00422.html Using gzip as the input format]
   * Only three compression formats are currently supported; see [http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/compress/CompressionCodec.html org.apache.hadoop.io.compress.CompressionCodec]
{{{
#!sh
# Arguments: $1 = mapper, $2 = reducer, $3 = input dir, $4 = output dir, $5 = job name
# Remove any previous output directory; the job fails if it already exists.
hadoop dfs -rmr "$4"
hadoop jar /usr/local/share/hadoop/contrib/streaming/hadoop-*-streaming.jar \
  -mapper "$1" -reducer "$2" -input "$3/*" -output "$4" \
  -file "$1" -file "$2" \
  -jobconf mapred.job.name="$5" \
  -jobconf stream.recordreader.compression=gzip \
  -jobconf mapred.output.compress=true \
  -jobconf mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec
}}}
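
Before submitting the job, it can help to rehearse the same data flow locally. The following is only a rough sketch, not part of the original post: it assumes the mapper and reducer are ordinary commands that read lines on stdin and write to stdout, and the `mapper` and `reducer` functions and the sample data are hypothetical stand-ins. A plain `sort` approximates the shuffle step, so this does not reproduce Hadoop's partitioning or splitting.
{{{
#!sh
# Local dry run of a streaming job with gzip input and output (no Hadoop needed).
set -eu

workdir=$(mktemp -d)

# Hypothetical sample input, compressed like the files under the job's input directory.
printf 'b\na\nb\n' | gzip > "$workdir/part-00000.gz"

# Stand-in mapper: passes each line through unchanged.
mapper() { cat; }

# Stand-in reducer: counts consecutive identical lines (input must be sorted).
reducer() {
  awk 'NR > 1 && $0 != prev { print prev "\t" n; n = 0 }
       { prev = $0; n++ }
       END { if (NR > 0) print prev "\t" n }'
}

# Decompress, map, sort by key, reduce, then compress the result.
gzip -dc "$workdir"/*.gz | mapper | sort | reducer | gzip > "$workdir/out.gz"

# Inspect the compressed output.
result=$(gzip -dc "$workdir/out.gz")
echo "$result"

rm -rf "$workdir"
}}}
Replacing the two functions with calls to the real mapper and reducer scripts gives a quick check that both handle decompressed lines correctly before any cluster time is spent.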