Changes between Initial Version and Version 1 of NTUT160220/Lab9


Timestamp: Feb 19, 2016, 10:01:15 PM
Author: jazz

[[PageOutline]]

◢ <[wiki:NTUT160220/Lab8 Lab 8]> | <[wiki:NTUT160220 Back to course outline]> ▲ | <[wiki:NTUT160220/Lab10 Lab 10]> ◣

= Lab 9 =

{{{
#!html
<div style="text-align: center;"><big style="font-weight: bold;"><big>Pig Latin in Practice</big></big></div>
}}}

{{{
#!text
For the following exercises, connect to one of hdp01.3du.me, hdp02.3du.me, hdp03.3du.me, or hdp04.3du.me.
Below, userXX stands for your user name.
}}}

== Aggregation (Local Mode) ==

{{{
~$ wget http://www.hadoop.tw/excite-small.log
~$ pig -x local
grunt> log = LOAD 'excite-small.log' AS (user, timestamp, query);
grunt> grpd = GROUP log BY user;
grunt> cntd = FOREACH grpd GENERATE group, COUNT(log);
grunt> STORE cntd INTO 'lab8_out1';
grunt> quit
~$ head lab8_out1/part-*
}}}
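
The per-user counts that Pig stores can be cross-checked with ordinary shell tools. The sketch below assumes the log is tab-separated with the user ID in the first field (Pig's default loader splits on tabs). The function name `count_per_user` and the three-line sample file are hypothetical stand-ins, not part of the lab.

```shell
# Count records per user in a tab-separated log whose first field is the user ID.
# Prints "user<TAB>count", sorted by user so the output is deterministic.
count_per_user() {
  awk -F'\t' '{ c[$1]++ } END { for (u in c) print u "\t" c[u] }' "$1" | sort
}

# Hypothetical three-line stand-in for excite-small.log.
sample=$(mktemp)
printf 'A\t970916\tfoo\nB\t970916\tbar\nA\t970917\tbaz\n' > "$sample"

count_per_user "$sample"
```

Running `count_per_user excite-small.log | head` on the real log should list the same user/count pairs as the Pig output, though Pig does not promise any particular row order.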

== Filter (Local Mode) ==

{{{
~$ pig -x local
grunt> log = LOAD 'excite-small.log' AS (user, timestamp, query);
grunt> grpd = GROUP log BY user;
grunt> cntd = FOREACH grpd GENERATE group, COUNT(log) AS cnt;
grunt> fltrd = FILTER cntd BY cnt > 50;
grunt> STORE fltrd INTO 'lab8_out2';
grunt> quit
~$ head lab8_out2/part-*
}}}
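
The `FILTER ... BY cnt > 50` step keeps only users with more than 50 records. As a rough shell equivalent, assuming a tab-separated log with the user ID in the first field, the sketch below takes the threshold as a parameter. The name `users_over` and the sample data are hypothetical, and the sample uses a threshold of 1 so that a six-line file is enough to show the effect.

```shell
# Print "user<TAB>count" for users with strictly more than $2 records in log $1.
users_over() {
  awk -F'\t' -v min="$2" '
    { c[$1]++ }
    END { for (u in c) if (c[u] > min) print u "\t" c[u] }
  ' "$1" | sort
}

# Hypothetical sample: A has 3 records, B has 2, C has 1.
sample=$(mktemp)
printf 'A\t1\tq1\nA\t2\tq2\nA\t3\tq3\nB\t1\tq4\nB\t2\tq5\nC\t1\tq6\n' > "$sample"

users_over "$sample" 1   # C is dropped because 1 is not greater than 1
```

For the lab data the matching call would be `users_over excite-small.log 50`.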

== Sorting (Local Mode) ==

{{{
~$ pig -x local
grunt> log = LOAD 'excite-small.log' AS (user, timestamp, query);
grunt> grpd = GROUP log BY user;
grunt> cntd = FOREACH grpd GENERATE group, COUNT(log) AS cnt;
grunt> fltrd = FILTER cntd BY cnt > 50;
grunt> srtd = ORDER fltrd BY cnt;
grunt> STORE srtd INTO 'lab8_out3';
grunt> quit
~$ head lab8_out3/part-*
}}}
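
`ORDER ... BY cnt` sorts ascending by default, so the users with the fewest records come first. A shell sketch of the whole count, filter, and sort pipeline follows, again assuming a tab-separated log with the user ID in the first field. The name `users_over_sorted` and the sample data are hypothetical, and ties are broken by user ID here only to make the output deterministic.

```shell
# Count per user, keep users with more than $2 records, sort by count ascending.
users_over_sorted() {
  awk -F'\t' -v min="$2" '
    { c[$1]++ }
    END { for (u in c) if (c[u] > min) print u "\t" c[u] }
  ' "$1" | sort -t "$(printf '\t')" -k2,2n -k1,1
}

# Hypothetical sample: A has 3 records, B has 2, C has 1.
sample=$(mktemp)
printf 'A\t1\tq1\nA\t2\tq2\nA\t3\tq3\nB\t1\tq4\nB\t2\tq5\nC\t1\tq6\n' > "$sample"

users_over_sorted "$sample" 1   # B (2 records) is listed before A (3 records)
```

The `-k2,2n` key makes `sort` compare the counts as numbers; a plain text sort would place 100 before 51.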

== Connect Pig to Hadoop (Fully Distributed Mode) ==

{{{
~$ hadoop fs -put excite-small.log .
~$ pig
grunt> log = LOAD 'excite-small.log' AS (user, timestamp, query);
grunt> grpd = GROUP log BY user;
grunt> cntd = FOREACH grpd GENERATE group, COUNT(log);
grunt> STORE cntd INTO 'lab8_out1';
grunt> quit
~$ hadoop fs -cat 'lab8_out1/part-*'
}}}
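
The distributed run executes the same script as the local aggregation, so both should produce the same rows, possibly in a different order. One way to check this, sketched below, is to save each result to a file and compare them after sorting. The helper name `same_rows` and the two small files are hypothetical; in practice the files could come from `cat lab8_out1/part-*` and `hadoop fs -cat 'lab8_out1/part-*'` redirected to disk.

```shell
# Succeed if two files contain the same lines, ignoring line order.
same_rows() {
  a=$(mktemp)
  b=$(mktemp)
  sort "$1" > "$a"
  sort "$2" > "$b"
  cmp -s "$a" "$b"
  rc=$?
  rm -f "$a" "$b"
  return $rc
}

# Hypothetical results: identical rows, written in a different order.
local_out=$(mktemp)
hdfs_out=$(mktemp)
printf 'A\t2\nB\t1\n' > "$local_out"
printf 'B\t1\nA\t2\n' > "$hdfs_out"

if same_rows "$local_out" "$hdfs_out"; then
  echo "outputs match"
else
  echo "outputs differ"
fi
```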