Changes between Initial Version and Version 1 of III140412/Lab8


Timestamp:
Apr 11, 2014, 11:23:19 PM
Author:
jazz

◢ <[wiki:III140412/Lab7 Lab 7]> | <[wiki:III140412 Back to course outline]> ▲ | <[wiki:III140412/Lab9 Lab 9]> ◣

= Lab 8 =

{{{
#!html
<p style="text-align: center;"><big style="font-weight: bold;"><big>Basic Commands of Hadoop MapReduce</big></big></p>
}}}

[[PageOutline]]

{{{
#!text
First connect to nodeN.3du.me, where N is your registration number.
}}}

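== Sample 1: WordCount ==

 * As its name suggests, the WordCount example counts how many times each word appears in the input documents and lists the words in sorted order. Upload the input, remove any old output directory (the job fails if the output directory already exists), and run the job: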
{{{
~$ hadoop fs -put ${HOME}/hadoop/conf lab11_input
~$ hadoop fs -rmr lab11_out2
~$ hadoop jar ${HOME}/hadoop/hadoop-examples-*.jar wordcount lab11_input lab11_out2
}}}
 * Check the result of '''wordcount''' on HDFS the same way as in the earlier labs:
{{{
~$ hadoop fs -ls lab11_out2
~$ hadoop fs -cat lab11_out2/part-r-00000
}}}
 * You should see results like this:
{{{
"".     4
"*"     9
"127.0.0.1"     3
"AS     2
"License");     2
"_logs/history/"        1
"alice,bob      9

( ... skip ... )
}}}

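To see what the job computes without a cluster, the same counting can be sketched with standard Unix tools on local text. This is only a rough illustration: `local_wordcount` is a hypothetical helper name (not part of Hadoop), and the sketch assumes that words are whitespace-separated tokens and that a byte-order sort approximates the ordering of the job's output keys.

```shell
#!/bin/sh
# Hypothetical local stand-in for the wordcount example (not part of Hadoop).
# Reads text on stdin and prints "<word><TAB><count>", sorted by word.
local_wordcount() {
  tr -s '[:space:]' '\n' |       # put each whitespace-separated token on its own line
    sed '/^$/d' |                # drop the empty line left by leading whitespace
    LC_ALL=C sort |              # byte-order sort groups identical tokens together
    uniq -c |                    # prefix each distinct token with its count
    awk '{ print $2 "\t" $1 }'   # reorder to "token<TAB>count"
}

# Example with a tiny made-up input
printf 'b a\na c\n' | local_wordcount
```

Because tokens are split on whitespace only, punctuation stays attached to the words, which is why the cluster output above contains keys such as `"License");`.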
== Sample 2: grep ==

 * The grep command extracts text that matches a pattern. The Hadoop grep example extracts every string in the input documents that matches a given regular expression and counts how many times each matched string occurs:
{{{
~$ hadoop fs -ls lab11_input
~$ hadoop jar ${HOME}/hadoop/hadoop-examples-*.jar grep lab11_input lab11_out3 'dfs[a-z.]+'
}}}
 * While the job runs, it prints progress messages like this:
{{{
11/04/19 10:00:20 INFO mapred.FileInputFormat: Total input paths to process : 25
11/04/19 10:00:20 INFO mapred.JobClient: Running job: job_201104120101_0645
11/04/19 10:00:21 INFO mapred.JobClient:  map 0% reduce 0%
( ... skip ... )
}}}
 * Next, check the result of '''grep''' on HDFS:
{{{
~$ hadoop fs -ls lab11_out3
~$ hadoop fs -cat lab11_out3/part-00000
}}}
 * You should see results like this:
{{{
4       dfs.permissions
4       dfs.replication
4       dfs.name.dir
3       dfs.namenode.decommission.interval.
3       dfs.namenode.decommission.nodes.per.interval
3       dfs.
( ... skip ... )
}}}

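The match-and-count behavior described above can be approximated locally with ordinary `grep`. The sketch below is an assumption-laden illustration, not the Hadoop implementation: `local_grep_count` is a hypothetical helper name, it uses POSIX extended regular expressions (the Hadoop example uses Java regular expressions, which behave the same for a simple pattern like `dfs[a-z.]+`), and it relies on `grep -o`, which GNU and BSD grep provide.

```shell
#!/bin/sh
# Hypothetical local stand-in for the grep example (not part of Hadoop).
# $1 is an extended regular expression; input is read from stdin.
# Prints "<count><TAB><match>", most frequent match first.
local_grep_count() {
  grep -oE "$1" |                # print every match on its own line
    LC_ALL=C sort |              # group identical matches together
    uniq -c |                    # count each distinct match
    sort -k1,1nr |               # order by count, largest first
    awk '{ print $1 "\t" $2 }'   # assumes matches contain no whitespace
}

# Example with a tiny made-up input
printf 'dfs.replication=3 dfs.name.dir\ndfs.replication\n' | local_grep_count 'dfs[a-z.]+'
```

As in the cluster output, each line shows the count first and the matched string second.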
== More Examples ==

 Here is a list of the available Hadoop example programs:

 || aggregatewordcount || An Aggregate-based map/reduce program that counts the words in the input files. ||
 || aggregatewordhist || An Aggregate-based map/reduce program that computes the histogram of the words in the input files. ||
 || grep || A map/reduce program that counts the matches of a regex in the input. ||
 || join || A job that effects a join over sorted, equally partitioned datasets. ||
 || multifilewc || A job that counts words from several files. ||
 || pentomino || A map/reduce tile-laying program to find solutions to pentomino problems. ||
 || pi || A map/reduce program that estimates Pi using the Monte Carlo method. ||
 || randomtextwriter || A map/reduce program that writes 10GB of random textual data per node. ||
 || randomwriter || A map/reduce program that writes 10GB of random data per node. ||
 || sleep || A job that sleeps at each map and reduce task. ||
 || sort || A map/reduce program that sorts the data written by the random writer. ||
 || sudoku || A sudoku solver. ||
 || wordcount || A map/reduce program that counts the words in the input files. ||

For more details, see [http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/examples/package-summary.html org.apache.hadoop.examples].
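As a local illustration of the idea behind the `pi` entry in the table, the following sketch estimates Pi by plain Monte Carlo sampling on a single machine. It is not the Hadoop program: `estimate_pi` and its two arguments are hypothetical, and the Hadoop example spreads its sampling across map tasks and may choose its sample points differently.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): estimate Pi by Monte Carlo sampling.
# $1 = number of random points, $2 = random seed.
estimate_pi() {
  awk -v n="$1" -v seed="$2" 'BEGIN {
    srand(seed)
    hits = 0
    for (i = 0; i < n; i++) {
      x = rand(); y = rand()          # random point in the unit square
      if (x * x + y * y <= 1) hits++  # point falls inside the quarter circle
    }
    printf "%.4f\n", 4 * hits / n     # the area ratio is Pi/4
  }'
}

# Example: 100000 points with an arbitrary seed
estimate_pi 100000 42
```

The fraction of points that land inside the quarter circle approaches Pi/4, so the estimate improves as the number of points grows.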