Version 5 (modified by waue, 16 years ago) (diff) |
---|
實作二: HDFS 指令操作練習
前言
- 此部份接續實做一
Content 1. 基本操作
1.1 瀏覽你HDFS目錄
1.2 上傳資料到HDFS目錄
1.3 下載HDFS的資料到本地目錄
1.4 更多指令操作
Content 2. Hadoop 運算命令
2.1 Hadoop運算命令 grep
2.2 Hadoop運算命令 WordCount
2.3 更多運算命令
可執行的指令一覽表:
aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
grep: A map/reduce program that counts the matches of a regex in the input.
join: A job that effects a join over sorted, equally partitioned datasets
multifilewc: A job that counts words from several files.
pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
pi: A map/reduce program that estimates Pi using monte-carlo method.
randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
randomwriter: A map/reduce program that writes 10GB of random data per node.
sleep: A job that sleeps at each map and reduce task.
sort: A map/reduce program that sorts the data written by the random writer.
sudoku: A sudoku solver.
wordcount: A map/reduce program that counts the words in the input files.
Class Summary | |
---|---|
AggregateWordCount | This is an example Aggregated Hadoop Map/Reduce application. It reads the text input files, breaks each line into words and counts them. The output is a locally sorted list of words and the count of how often they occurred. To run: bin/hadoop jar hadoop-*-examples.jar aggregatewordcount in-dir out-dir numOfReducers textinputformat |
AggregateWordHistogram | This is an example Aggregated Hadoop Map/Reduce application. Computes the histogram of the words in the input texts. To run: bin/hadoop jar hadoop-*-examples.jar aggregatewordhist in-dir out-dir numOfReducers textinputformat |
ExampleDriver | A description of an example program based on its class and a human-readable description. |
Grep | |
Join | This is the trivial map/reduce program that does absolutely nothing other than use the framework to fragment and sort the input values. |
MultiFileWordCount | MultiFileWordCount is an example to demonstrate the usage of MultiFileInputFormat. |
MultiFileWordCount.MapClass | This Mapper is similar to the one in WordCount.MapClass . |
MultiFileWordCount.MultiFileLineRecordReader | RecordReader is responsible from extracting records from the InputSplit. |
MultiFileWordCount.MyInputFormat | To use MultiFileInputFormat , one should extend it, to return a
(custom) RecordReader . |
MultiFileWordCount.WordOffset | This record keeps <filename,offset> pairs. |
PiEstimator | A Map-reduce program to estimaate the valu eof Pi using monte-carlo method. |
PiEstimator.PiMapper | Mappper class for Pi estimation. |
RandomTextWriter | This program uses map/reduce to just run a distributed job where there is no interaction between the tasks and each task writes a large unsorted random sequence of words. |
RandomWriter | This program uses map/reduce to just run a distributed job where there is no interaction between the tasks and each task write a large unsorted random binary sequence file of BytesWritable. |
SleepJob | Dummy class for testing MR framefork. |
Sort<K,V> | This is the trivial map/reduce program that does absolutely nothing other than use the framework to fragment and sort the input values. |
WordCount | This is an example Hadoop Map/Reduce application. |
WordCount.MapClass | Counts the words in each line. |
WordCount.Reduce | A reducer class that just emits the sum of the input values. |
Content 3. 使用網頁Gui瀏覽訊息
練習
Attachments (1)
- 2009-03-24-135001_872x741_scrot.png (59.1 KB) - added by waue 16 years ago.
Download all attachments as: .zip