[[PageOutline]]
◢ <[wiki:NTUT160220/Lab8 Lab 8]> | <[wiki:NTUT160220 Back to Course Outline]> ▲ | <[wiki:NTUT160220/Lab10 Lab 10]> ◣

= Lab 9 =

{{{
#!html
Pig Latin in Practice
}}}

{{{
#!text
For the following exercises, connect to hdp01.3du.me. The userXX below stands for your user name.
For the following exercises, connect to hdp02.3du.me. The userXX below stands for your user name.
For the following exercises, connect to hdp03.3du.me. The userXX below stands for your user name.
For the following exercises, connect to hdp04.3du.me. The userXX below stands for your user name.
}}}

== Aggregation (Local Mode) ==

{{{
~$ wget http://www.hadoop.tw/excite-small.log
~$ pig -x local
grunt> log = LOAD 'excite-small.log' AS (user, timestamp, query);
grunt> grpd = GROUP log BY user;
grunt> cntd = FOREACH grpd GENERATE group, COUNT(log);
grunt> STORE cntd INTO 'lab8_out1';
grunt> quit
~$ head lab8_out1/part-*
}}}

== Filter (Local Mode) ==

{{{
~$ pig -x local
grunt> log = LOAD 'excite-small.log' AS (user, timestamp, query);
grunt> grpd = GROUP log BY user;
grunt> cntd = FOREACH grpd GENERATE group, COUNT(log) AS cnt;
grunt> fltrd = FILTER cntd BY cnt > 50;
grunt> STORE fltrd INTO 'lab8_out2';
grunt> quit
~$ head lab8_out2/part-*
}}}

== Sorting (Local Mode) ==

{{{
~$ pig -x local
grunt> log = LOAD 'excite-small.log' AS (user, timestamp, query);
grunt> grpd = GROUP log BY user;
grunt> cntd = FOREACH grpd GENERATE group, COUNT(log) AS cnt;
grunt> fltrd = FILTER cntd BY cnt > 50;
grunt> srtd = ORDER fltrd BY cnt;
grunt> STORE srtd INTO 'lab8_out3';
grunt> quit
~$ head lab8_out3/part-*
}}}

== Connect Pig to Hadoop (Fully Distributed Mode) ==

{{{
~$ hadoop fs -put excite-small.log .
~$ pig
grunt> log = LOAD 'excite-small.log' AS (user, timestamp, query);
grunt> grpd = GROUP log BY user;
grunt> cntd = FOREACH grpd GENERATE group, COUNT(log);
grunt> STORE cntd INTO 'lab8_out1';
grunt> quit
~$ hadoop fs -cat lab8_out1/part-*
}}}
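
As a sanity check on the Sorting exercise above, the same group, count, filter, and sort steps can be approximated with standard Unix tools. This is only a minimal sketch, not part of the original lab. It assumes that excite-small.log is tab-separated with the user ID in the first column and that user IDs contain no spaces. The function name `count_by_user` is hypothetical.

```shell
#!/bin/sh
# Rough equivalent of the Pig pipeline: GROUP BY user, COUNT, FILTER, ORDER.
# Assumption: the input is tab-separated and column 1 holds the user ID.

# count_by_user FILE [MIN]
# Prints "user<TAB>count" for every user with more than MIN records
# (default 50, matching FILTER cntd BY cnt > 50), sorted by ascending
# count like ORDER fltrd BY cnt.
count_by_user() {
    file=$1
    min=${2:-50}
    cut -f1 "$file" |
        sort |
        uniq -c |
        awk -v min="$min" '$1 > min { print $2 "\t" $1 }' |
        sort -t "$(printf '\t')" -k2,2n
}
```

Running `count_by_user excite-small.log | head` should list the same users and counts as `head lab8_out3/part-*`, although users with equal counts may appear in a different order.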