
Version 1 (modified by waue, 16 years ago)


Lab 2: HDFS Command Practice

Introduction

  • This part continues from Lab 1.

Content 1. Basic Operations

1.1 Browse your HDFS directory

1.2 Upload data to an HDFS directory

1.3 Download HDFS data to a local directory

Content 2. Hadoop Computation Commands

2.1 Hadoop computation command: grep
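
The Hadoop grep example counts how often each match of a regular expression occurs in the input. As a rough illustration of that idea only, here is a minimal local sketch in plain Python. It does not use Hadoop, and the function name `grep_count` and the sample lines are hypothetical, chosen just for this example.

```python
import re
from collections import Counter


def grep_count(lines, pattern):
    """Count how often each regex match occurs across all input lines."""
    regex = re.compile(pattern)
    counts = Counter()
    for line in lines:
        # "Map" side: find every match in the line.
        for match in regex.finditer(line):
            # "Reduce" side: accumulate a total per matched string.
            counts[match.group(0)] += 1
    # Sort by descending count, breaking ties alphabetically.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical sample input, not taken from the lab material.
sample = [
    "dfs.replication dfs.name.dir",
    "mapred.job.tracker",
    "dfs.replication",
]
for text, count in grep_count(sample, r"dfs\.[a-z.]+"):
    print(count, text)
```

On a real cluster the same counting is spread across many map and reduce tasks; the sketch only shows what is being counted.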

2.2 Hadoop computation command: WordCount
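
WordCount has a map step that counts the words in each line and a reduce step that sums the values emitted for each word. The following is a minimal local sketch of those two steps in plain Python, not the Hadoop Java implementation. The names `map_words`, `reduce_counts`, and `word_count` are hypothetical, and splitting on whitespace is an assumption about tokenization.

```python
from collections import defaultdict


def map_words(line):
    """Map step: emit a (word, 1) pair for every whitespace-separated token."""
    return [(word, 1) for word in line.split()]


def reduce_counts(pairs):
    """Reduce step: sum the values emitted for each key."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def word_count(lines):
    """Run the map step over every line, then reduce all emitted pairs."""
    pairs = [pair for line in lines for pair in map_words(line)]
    return reduce_counts(pairs)


# Hypothetical sample input.
print(word_count(["hello hadoop", "hello hdfs"]))
```

In Hadoop the framework groups the emitted pairs by key between the two steps; here a single dictionary stands in for that shuffle.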

2.3 More computation commands

See org.apache.hadoop.examples:

AggregateWordCount This is an example Aggregated Hadoop Map/Reduce application.
AggregateWordCount.WordCountPlugInClass
AggregateWordHistogram This is an example Aggregated Hadoop Map/Reduce application.
AggregateWordHistogram.AggregateWordHistogramPlugin
DBCountPageView This is a demonstrative program, which uses DBInputFormat for reading the input data from a database, and DBOutputFormat for writing the data to the database.
ExampleDriver A description of an example program based on its class and a human-readable description.
Grep
Join This is the trivial map/reduce program that does absolutely nothing other than use the framework to fragment and sort the input values.
MultiFileWordCount MultiFileWordCount is an example to demonstrate the usage of MultiFileInputFormat.
MultiFileWordCount.MapClass This Mapper is similar to the one in WordCount.MapClass.
MultiFileWordCount.MultiFileLineRecordReader RecordReader is responsible for extracting records from the InputSplit.
MultiFileWordCount.MyInputFormat To use MultiFileInputFormat, one should extend it to return a (custom) RecordReader.
MultiFileWordCount.WordOffset This record keeps <filename,offset> pairs.
PiEstimator A map/reduce program to estimate the value of Pi using the Monte Carlo method.
PiEstimator.PiMapper Mapper class for Pi estimation.
PiEstimator.PiReducer
RandomTextWriter This program uses map/reduce to just run a distributed job where there is no interaction between the tasks and each task writes a large unsorted random sequence of words.
RandomWriter This program uses map/reduce to just run a distributed job where there is no interaction between the tasks and each task writes a large unsorted random binary sequence file of BytesWritable.
SleepJob Dummy class for testing the MR framework.
Sort<K,V> This is the trivial map/reduce program that does absolutely nothing other than use the framework to fragment and sort the input values.
WordCount This is an example Hadoop Map/Reduce application.
WordCount.MapClass Counts the words in each line.
WordCount.Reduce A reducer class that just emits the sum of the input values.
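
To make the PiEstimator entry in the list above more concrete, here is a minimal sketch of a Monte Carlo estimate of Pi, split into a map-like and a reduce-like step. It is plain Python, not the Hadoop class. The function names, the use of pseudo-random points in the unit square, and the fixed seeds are all assumptions made for illustration.

```python
import random


def map_task(num_samples, seed):
    """Map-like step: throw random points into the unit square and
    count how many land inside the quarter circle of radius 1."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return inside, num_samples - inside


def reduce_task(results):
    """Reduce-like step: sum the inside/outside counts of all map tasks
    and convert the ratio into an estimate of Pi."""
    total_inside = sum(inside for inside, _ in results)
    total = sum(inside + outside for inside, outside in results)
    # Area of quarter circle / area of unit square = Pi / 4.
    return 4.0 * total_inside / total


def estimate_pi(num_maps=10, samples_per_map=10000):
    """Run several independent map tasks, then combine them."""
    results = [map_task(samples_per_map, seed) for seed in range(num_maps)]
    return reduce_task(results)


print(estimate_pi())
```

The map tasks never communicate with each other, which is why this kind of job parallelizes easily; only the final sum needs all results.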

Content 6. Using the Web GUI

Exercises
