[[PageOutline]]
◢ <[wiki:NTUT160220/Lab8 Lab 8]> | <[wiki:NTUT160220 Back to course outline]> ▲ | <[wiki:NTUT160220/Lab10 Lab 10]> ◣
= Lab 9 =
{{{
#!html
Pig Latin in Practice
}}}
{{{
#!text
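For the exercises below, connect to hdp01.3du.me. In the commands, userXX stands for your user name.
For the exercises below, connect to hdp02.3du.me. In the commands, userXX stands for your user name.
For the exercises below, connect to hdp03.3du.me. In the commands, userXX stands for your user name.
For the exercises below, connect to hdp04.3du.me. In the commands, userXX stands for your user name.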
}}}
== Aggregation (Local Mode) ==
{{{
~$ wget http://www.hadoop.tw/excite-small.log
~$ pig -x local
grunt> log = LOAD 'excite-small.log' AS (user, timestamp, query);
grunt> grpd = GROUP log BY user;
grunt> cntd = FOREACH grpd GENERATE group, COUNT(log);
grunt> STORE cntd INTO 'lab8_out1';
grunt> quit
~$ head lab8_out1/part-*
}}}
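To sanity-check the Pig result, the same per-user count can be approximated with standard shell tools. This is only a sketch: `count_by_user` is a hypothetical helper name, and it assumes the log is tab-separated with the user ID in the first field, which is what the default `LOAD` above expects. The sample rows are made up.

```shell
#!/bin/sh
# count_by_user is a hypothetical helper, not part of Pig or Hadoop.
# It counts records per user in a tab-separated log (user is field 1),
# skipping rows with an empty user, similar to how COUNT skips nulls.
count_by_user() {
    awk -F '\t' '$1 != "" { n[$1]++ }
        END { for (u in n) printf "%s\t%d\n", u, n[u] }' "$1" | sort
}

# Made-up three-line sample (not real Excite data).
sample=$(mktemp)
printf 'u1\t100\tfoo\nu2\t101\tbar\nu1\t102\tbaz\n' > "$sample"
count_by_user "$sample"   # prints "u1<TAB>2" then "u2<TAB>1"
rm -f "$sample"
```

Running `count_by_user excite-small.log | head` and comparing a few users against `lab8_out1/part-*` should give matching counts, although the row order may differ.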
== Filter (Local Mode) ==
{{{
~$ pig -x local
grunt> log = LOAD 'excite-small.log' AS (user, timestamp, query);
grunt> grpd = GROUP log BY user;
grunt> cntd = FOREACH grpd GENERATE group, COUNT(log) AS cnt;
grunt> fltrd = FILTER cntd BY cnt > 50;
grunt> STORE fltrd INTO 'lab8_out2';
grunt> quit
~$ head lab8_out2/part-*
}}}
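The `FILTER cntd BY cnt > 50` step keeps only users with more than 50 records. A rough shell equivalent, useful for cross-checking, is sketched below. `users_over` is a hypothetical helper name, the threshold is passed as a parameter, and the input is assumed to be tab-separated with the user ID in the first field.

```shell
#!/bin/sh
# users_over is a hypothetical helper, not part of Pig or Hadoop.
# $1 = tab-separated log file, $2 = threshold.
# Prints "user<TAB>count" for users with strictly more than $2 records.
users_over() {
    awk -F '\t' -v min="$2" '$1 != "" { n[$1]++ }
        END { for (u in n) if (n[u] > min) printf "%s\t%d\n", u, n[u] }' "$1" | sort
}

# Made-up sample: u1 has 3 records, u2 has 1.
sample=$(mktemp)
printf 'u1\t1\ta\nu1\t2\tb\nu2\t3\tc\nu1\t4\td\n' > "$sample"
users_over "$sample" 2    # prints "u1<TAB>3"
rm -f "$sample"
```

With the lab data, `users_over excite-small.log 50` should list the same users as `lab8_out2/part-*`.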
== Sorting (Local Mode) ==
{{{
~$ pig -x local
grunt> log = LOAD 'excite-small.log' AS (user, timestamp, query);
grunt> grpd = GROUP log BY user;
grunt> cntd = FOREACH grpd GENERATE group, COUNT(log) AS cnt;
grunt> fltrd = FILTER cntd BY cnt > 50;
grunt> srtd = ORDER fltrd BY cnt;
grunt> STORE srtd INTO 'lab8_out3';
grunt> quit
~$ head lab8_out3/part-*
}}}
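`ORDER fltrd BY cnt` sorts the filtered rows by count in ascending order. The sketch below does the same outside Pig, under the same assumptions as before (tab-separated input, user ID in the first field). `users_over_sorted` is a hypothetical helper name, and the tie-break on user ID is an addition of this sketch.

```shell
#!/bin/sh
# users_over_sorted is a hypothetical helper, not part of Pig or Hadoop.
# $1 = tab-separated log file, $2 = threshold.
# Prints "user<TAB>count" for users with more than $2 records,
# ordered by count ascending; ties are broken by user ID.
users_over_sorted() {
    tab=$(printf '\t')
    awk -F '\t' -v min="$2" '$1 != "" { n[$1]++ }
        END { for (u in n) if (n[u] > min) printf "%s\t%d\n", u, n[u] }' "$1" |
        sort -t "$tab" -k2,2n -k1,1
}

# Made-up sample: u1 has 3 records, u2 has 2, u3 has 1.
sample=$(mktemp)
printf 'u1\t1\ta\nu2\t2\tb\nu1\t3\tc\nu3\t4\td\nu2\t5\te\nu1\t6\tf\n' > "$sample"
users_over_sorted "$sample" 1   # prints "u2<TAB>2" then "u1<TAB>3"
rm -f "$sample"
```

With the lab data, `users_over_sorted excite-small.log 50` should match `lab8_out3/part-*` apart from the order of users that share the same count.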
== Connect Pig to Hadoop (Full Distributed Mode) ==
{{{
~$ hadoop fs -put excite-small.log .
~$ pig
grunt> log = LOAD 'excite-small.log' AS (user, timestamp, query);
grunt> grpd = GROUP log BY user;
grunt> cntd = FOREACH grpd GENERATE group, COUNT(log);
grunt> STORE cntd INTO 'lab8_out1';
grunt> quit
~$ hadoop fs -cat 'lab8_out1/part-*'
}}}