Case Study
TCRC Restaurant
Preparation
- Prepare the following file first
- /tmp/TCRC/store.txt
T01;GunLong; 01;20;40;30;50
T02;Esing; 02;50
T03;SunDon; 03;40;30
T04;StarBucks; 04;50;50;20
- Prepare the following files, then upload the income folder to HDFS
- /tmp/TCRC/income/0202.txt
waue:T01:P1:xxxxx
jazz:T01:P2:xxxxx
lia:T01:P3:xxxxx
hung:T02:P1:xxxxx
lia:T04:P1:xxxxx
lia:T04:P1:xxxxx
hung:T04:P3:xxxxx
hung:T04:P2:xxxxx
- /tmp/TCRC/income/0203.txt
xxx:T01:P4:xxxxx
ooo:T02:P1:xxxxx
oo:T03:P1:xxxxx
xxx:T03:P1:xxxxx
aaa:T03:P1:xxxxx
- Upload to HDFS
$ /opt/hadoop/bin/hadoop fs -put /tmp/TCRC/income/ income
- Install the tableindexed library
$ cp /opt/hbase/contrib/transactional/hbase-*-transactional.jar /opt/hbase/lib/
- In /opt/hbase/conf/hbase-site.xml, add the following lines inside <configuration> .... </configuration>
<property>
  <name>hbase.regionserver.class</name>
  <value>org.apache.hadoop.hbase.ipc.IndexedRegionInterface</value>
</property>
<property>
  <name>hbase.regionserver.impl</name>
  <value>org.apache.hadoop.hbase.regionserver.tableindexed.IndexedRegionServer</value>
</property>
- Restart HBase
$ /opt/hbase/bin/stop-hbase.sh
$ /opt/hbase/bin/start-hbase.sh
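Assumptions:
Four stores currently operate in the TCRC restaurant:
GunLong in zone 1, with 4 items priced <20,40,30,50>
Esing in zone 2, with 1 item priced <50>
SunDon in zone 3, with 2 items priced <40,30>
StarBucks in zone 4, with 3 items priced <50,50,20>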
1. Create the store data
[ TCRC1LoadFile.java ]
       Detail             Products          Turnover
Row    Name       Locate  P1  P2  P3  P4
T01    GunLong    01      20  40  30  50
T02    Esing      02      50
T03    SunDon     03      40  30
T04    StarBucks  04      50  50  20
$ /opt/hadoop/bin/hadoop jar TCRCHBase_100204.jar TCRC1LoadFile
create new table: TCRC
Put data :"GunLong" to Table: TCRC's Detail:Name
Put data :"01" to Table: TCRC's Detail:Locate
Put data :"20" to Table: TCRC's Products:P1
Put data :"40" to Table: TCRC's Products:P2
Put data :"30" to Table: TCRC's Products:P3
Put data :"50" to Table: TCRC's Products:P4
Put data :"Esing" to Table: TCRC's Detail:Name
Put data :"02" to Table: TCRC's Detail:Locate
Put data :"50" to Table: TCRC's Products:P1
Put data :"SunDon" to Table: TCRC's Detail:Name
Put data :"03" to Table: TCRC's Detail:Locate
Put data :"40" to Table: TCRC's Products:P1
Put data :"30" to Table: TCRC's Products:P2
Put data :"StarBucks" to Table: TCRC's Detail:Name
Put data :"04" to Table: TCRC's Detail:Locate
Put data :"50" to Table: TCRC's Products:P1
Put data :"50" to Table: TCRC's Products:P2
Put data :"20" to Table: TCRC's Products:P3
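The mapping from one store.txt line to the cells listed above can be sketched in plain Java. This is only an illustration and not the code in TCRC1LoadFile.java: the class name StoreLineSketch and the method toCells are hypothetical, and the sketch assumes the fields are row key, name, locate, then one price per product, with stray blanks after a semicolon ignored. It builds strings instead of calling HBase.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not taken from TCRC1LoadFile.java.
class StoreLineSketch {

    /**
     * Splits one store.txt line on ';' and returns the row key followed by
     * "family:qualifier=value" strings, one per cell.
     */
    static List<String> toCells(String line) {
        String[] f = line.split(";");
        List<String> cells = new ArrayList<>();
        cells.add(f[0].trim());                     // row key, e.g. T01
        cells.add("Detail:Name=" + f[1].trim());
        cells.add("Detail:Locate=" + f[2].trim());  // trim() drops the blank after ';'
        // Remaining fields are prices; field 3 becomes P1, field 4 becomes P2, ...
        for (int i = 3; i < f.length; i++) {
            cells.add("Products:P" + (i - 2) + "=" + f[i].trim());
        }
        return cells;
    }

    public static void main(String[] args) {
        for (String cell : toCells("T04;StarBucks; 04;50;50;20")) {
            System.out.println(cell);
        }
    }
}
```

Each returned string after the row key corresponds to one "Put data" line in the job output.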
2. Count how many times each item was purchased in the month
[ TCRC2Count.java ]
$ /opt/hadoop/bin/hadoop jar TCRCHBase_100204.jar TCRC2Count
       Detail             Products          Turnover
Row    Name       Locate  P1  P2  P3  P4    P1  P2  P3  P4
T01    GunLong    01      20  40  30  50    1   1   1   1
T02    Esing      02      50                2
T03    SunDon     03      40  30            3
T04    StarBucks  04      50  50  20        2   1   1
> scan 'TCRC'
ROW COLUMN+CELL
T01 column=Detail:Locate, timestamp=1265184360616, value=01
T01 column=Detail:Name, timestamp=1265184360548, value=GunLong
T01 column=Products:P1, timestamp=1265184360694, value=20
T01 column=Products:P2, timestamp=1265184360758, value=40
T01 column=Products:P3, timestamp=1265184360815, value=30
T01 column=Products:P4, timestamp=1265184360866, value=50
T01 column=Turnover:P1, timestamp=1265187021528, value=1
T01 column=Turnover:P2, timestamp=1265187021528, value=1
T01 column=Turnover:P3, timestamp=1265187021528, value=1
T01 column=Turnover:P4, timestamp=1265187021528, value=1
T02 column=Detail:Locate, timestamp=1265184360951, value=02
T02 column=Detail:Name, timestamp=1265184360910, value=Esing
T02 column=Products:P1, timestamp=1265184361051, value=50
T02 column=Turnover:P1, timestamp=1265187021528, value=2
T03 column=Detail:Locate, timestamp=1265184361124, value=03
T03 column=Detail:Name, timestamp=1265184361098, value=SunDon
T03 column=Products:P1, timestamp=1265184361189, value=40
T03 column=Products:P2, timestamp=1265184361259, value=30
T03 column=Turnover:P1, timestamp=1265187021529, value=3
T04 column=Detail:Locate, timestamp=1265184361311, value=04
T04 column=Detail:Name, timestamp=1265184361287, value=StarBucks
T04 column=Products:P1, timestamp=1265184361343, value=50
T04 column=Products:P2, timestamp=1265184361386, value=50
T04 column=Products:P3, timestamp=1265184361422, value=20
T04 column=Turnover:P1, timestamp=1265187021529, value=2
T04 column=Turnover:P2, timestamp=1265187021529, value=1
T04 column=Turnover:P3, timestamp=1265187021529, value=1
4 row(s) in 0.0310 seconds
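The counting step can be pictured as a word-count over the income records. The following is a minimal sketch rather than the actual TCRC2Count.java job: the class PurchaseCountSketch and its count method are hypothetical, it runs in memory instead of as MapReduce, and it assumes each record reads customer:store:product:payload, so that the second and third fields select the row and the Turnover column.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for the counting job; not TCRC2Count.java itself.
class PurchaseCountSketch {

    /**
     * Counts records per store and product. Keys look like "T04/Turnover:P1",
     * meaning row T04, column Turnover:P1.
     */
    static Map<String, Integer> count(String[] records) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String record : records) {
            String[] f = record.split(":");  // assumed layout: customer:store:product:payload
            String key = f[1] + "/Turnover:" + f[2];
            counts.merge(key, 1, Integer::sum);  // add 1, starting from 1 if the key is new
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] records = {
            "lia:T04:P1:xxxxx", "lia:T04:P1:xxxxx",
            "hung:T04:P3:xxxxx", "hung:T04:P2:xxxxx"
        };
        count(records).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Fed with all thirteen records from 0202.txt and 0203.txt, the same grouping yields the Turnover values shown in the scan output.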
3. Calculate the day's turnover
[ TCRC3CalculateMR.java ]
$ /opt/hadoop/bin/hadoop jar TCRCHBase_100204.jar TCRC3CalculateMR
       Detail             Products          Turnover
Row    Name       Locate  P1  P2  P3  P4    P1  P2  P3  P4  Sum
T01    GunLong    01      20  40  30  50    1   1   1   1   140
T02    Esing      02      50                2               100
T03    SunDon     03      40  30            3               120
T04    StarBucks  04      50  50  20        2   1   1       170
> scan 'TCRC'
ROW COLUMN+CELL
T01 column=Detail:Locate, timestamp=1265184360616, value=01
T01 column=Detail:Name, timestamp=1265184360548, value=GunLong
T01 column=Products:P1, timestamp=1265184360694, value=20
T01 column=Products:P2, timestamp=1265184360758, value=40
T01 column=Products:P3, timestamp=1265184360815, value=30
T01 column=Products:P4, timestamp=1265184360866, value=50
T01 column=Turnover:P1, timestamp=1265187021528, value=1
T01 column=Turnover:P2, timestamp=1265187021528, value=1
T01 column=Turnover:P3, timestamp=1265187021528, value=1
T01 column=Turnover:P4, timestamp=1265187021528, value=1
T01 column=Turnover:sum, timestamp=1265190421993, value=140
T02 column=Detail:Locate, timestamp=1265184360951, value=02
T02 column=Detail:Name, timestamp=1265184360910, value=Esing
T02 column=Products:P1, timestamp=1265184361051, value=50
T02 column=Turnover:P1, timestamp=1265187021528, value=2
T02 column=Turnover:sum, timestamp=1265190421993, value=100
T03 column=Detail:Locate, timestamp=1265184361124, value=03
T03 column=Detail:Name, timestamp=1265184361098, value=SunDon
T03 column=Products:P1, timestamp=1265184361189, value=40
T03 column=Products:P2, timestamp=1265184361259, value=30
T03 column=Turnover:P1, timestamp=1265187021529, value=3
T03 column=Turnover:sum, timestamp=1265190421993, value=120
T04 column=Detail:Locate, timestamp=1265184361311, value=04
T04 column=Detail:Name, timestamp=1265184361287, value=StarBucks
T04 column=Products:P1, timestamp=1265184361343, value=50
T04 column=Products:P2, timestamp=1265184361386, value=50
T04 column=Products:P3, timestamp=1265184361422, value=20
T04 column=Turnover:P1, timestamp=1265187021529, value=2
T04 column=Turnover:P2, timestamp=1265187021529, value=1
T04 column=Turnover:P3, timestamp=1265187021529, value=1
T04 column=Turnover:sum, timestamp=1265190421993, value=170
4 row(s) in 0.0460 seconds
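The sum written to each row is consistent with multiplying every product price by its purchase count and adding the results, for example 50×2 + 50×1 + 20×1 = 170 for T04. A minimal sketch of that arithmetic follows; the class TurnoverSketch and its turnover method are hypothetical names, and the real TCRC3CalculateMR.java may organize the computation differently.

```java
// Hypothetical illustration of the per-row arithmetic; not TCRC3CalculateMR.java itself.
class TurnoverSketch {

    /**
     * Sums price * count over the products of one store.
     * prices[i] is Products:P(i+1); counts[i] is Turnover:P(i+1), 0 if never bought.
     */
    static int turnover(int[] prices, int[] counts) {
        int sum = 0;
        for (int i = 0; i < prices.length && i < counts.length; i++) {
            sum += prices[i] * counts[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        // Prices and counts copied from rows T01 and T04 of the scan output
        System.out.println("T01: " + turnover(new int[] {20, 40, 30, 50}, new int[] {1, 1, 1, 1}));
        System.out.println("T04: " + turnover(new int[] {50, 50, 20}, new int[] {2, 1, 1}));
    }
}
```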
4. Build the index table
[ TCRC4SortTurnover.java ]
$ /opt/hadoop/bin/hadoop jar TCRCHBase_100204.jar TCRC4SortTurnover
> scan 'TCRC-Sum'
ROW COLUMN+CELL
100T02 column=Turnover:Sum, timestamp=1265190782127, value=100
100T02 column=__INDEX__:ROW, timestamp=1265190782127, value=T02
120T03 column=Turnover:Sum, timestamp=1265190782128, value=120
120T03 column=__INDEX__:ROW, timestamp=1265190782128, value=T03
140T01 column=Turnover:Sum, timestamp=1265190782126, value=140
140T01 column=__INDEX__:ROW, timestamp=1265190782126, value=T01
170T04 column=Turnover:Sum, timestamp=1265190782129, value=170
170T04 column=__INDEX__:ROW, timestamp=1265190782129, value=T04
4 row(s) in 0.0140 seconds
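The index rows in this scan are keyed by the turnover followed by the original row key, so scanning the index returns stores ordered by turnover. The sketch below imitates that with a sorted map. It is an assumption-laden illustration, not TCRC4SortTurnover.java: TurnoverIndexSketch, indexKey and buildIndex are hypothetical names, and a TreeMap stands in for the HBase index table.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the TCRC-Sum index table; not TCRC4SortTurnover.java itself.
class TurnoverIndexSketch {

    /** Index row key = turnover followed by the base row key, e.g. "140T01". */
    static String indexKey(int sum, String baseRow) {
        return sum + baseRow;
    }

    /** Maps each index key back to its base row, kept in sorted key order. */
    static TreeMap<String, String> buildIndex(Map<String, Integer> sums) {
        TreeMap<String, String> index = new TreeMap<>();
        for (Map.Entry<String, Integer> e : sums.entrySet()) {
            index.put(indexKey(e.getValue(), e.getKey()), e.getKey());
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, Integer> sums = new HashMap<>();
        sums.put("T01", 140);
        sums.put("T02", 100);
        sums.put("T03", 120);
        sums.put("T04", 170);
        // Iteration follows key order, i.e. ascending turnover for this data
        buildIndex(sums).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

Keys compare as text, so this ordering matches numeric order only while all sums have the same number of digits. A sum of 90 would sort after 170 unless the numbers were zero-padded.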
4.b Generate the final report
$ /opt/hadoop/bin/hadoop jar TCRCHBase_100204.jar TCRC5ShowReport 130
GunLong 's turnover is 140 $.
StarBucks 's turnover is 170 $.
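Judging from this output, the argument 130 appears to act as a minimum turnover: only the stores whose sum exceeds it are listed. The sketch below reproduces that filter under this assumption. ReportSketch and report are hypothetical names, and whether the real TCRC5ShowReport uses a strict or an inclusive comparison cannot be told from this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the report filter; not the TCRC5ShowReport source.
class ReportSketch {

    /**
     * Returns one report line per store whose turnover is above the threshold.
     * names[i] and sums[i] describe the same store, listed in index order.
     */
    static List<String> report(String[] names, int[] sums, int threshold) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < names.length; i++) {
            if (sums[i] > threshold) {  // assumption: strict comparison
                lines.add(names[i] + " 's turnover is " + sums[i] + " $.");
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Stores in ascending turnover order, as in the TCRC-Sum index
        String[] names = {"Esing", "SunDon", "GunLong", "StarBucks"};
        int[] sums = {100, 120, 140, 170};
        for (String line : report(names, sums, 130)) {
            System.out.println(line);
        }
    }
}
```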
Last modified on Jul 19, 2011, 3:57:47 PM