wiki:waue/2009/0629

ICAS rewrite progress

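  • no errors, no warnings
  • but HBase contains no results either
    • reduce error: the for loop bound has to be changed to less than 19, otherwise an ArrayIndexOutOfBoundsException is thrown
    • the HBase table is empty: it looks like the HBase calls in the code are written incorrectly
  • (Solution 1) rewrite the whole program in a simplified form, then add the full functionality back afterwards
  • (Solution 2) the database part could perhaps be switched to Hive
    • (pros) Cloudera supports Hive
    • (pros) Hive can pull data with database query syntax, and the output is a plain text file, which is easy to turn into charts
    • (cons) sample code for Hadoop & Hive still has to be found
    • (cons) the database schema has to be redesigned
    • (cons) a lot of code has to change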

nutch on hadoop.nchc.org.tw

  • Try installing nutch on Cloudera's Hadoop 0.18.3 distribution

Steps

My VM runs Ubuntu 8.04, so Distro=hardy

$ sudo su -
# echo "deb http://archive.cloudera.com/debian hardy contrib" > /etc/apt/sources.list.d/cloudera.list
# echo "deb-src http://archive.cloudera.com/debian hardy contrib" >> /etc/apt/sources.list.d/cloudera.list
# curl -s http://archive.cloudera.com/debian/archive.key | apt-key add -
# apt-get update
# apt-cache search hadoop
# exit
$ sudo apt-get -y install hadoop hadoop-conf-pseudo hadoop-namenode hadoop-secondarynamenode hadoop-datanode hadoop-jobtracker hadoop-tasktracker
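
The repository lines above hard-code the distribution name. As a minimal sketch, the same two lines can be generated for any codename. The function name cloudera_sources_lines is hypothetical, and the usage comment assumes lsb_release is available on the VM.

```shell
#!/bin/sh
# Print the deb and deb-src lines for Cloudera's archive for a given
# distribution codename (for example "hardy").
# cloudera_sources_lines is a hypothetical helper name, not part of any package.
cloudera_sources_lines() {
    distro="$1"
    printf 'deb http://archive.cloudera.com/debian %s contrib\n' "$distro"
    printf 'deb-src http://archive.cloudera.com/debian %s contrib\n' "$distro"
}

# Usage (as root), assuming lsb_release is installed; "lsb_release -cs"
# prints the codename of the running release. Redirect the output to a
# .list file under /etc/apt/sources.list.d/:
#   cloudera_sources_lines "$(lsb_release -cs)"
```

Passing the codename as an argument keeps the function independent of the machine it runs on, so it can also be called directly with hardy.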

  • Installation directories
     Hadoop wrapper script                  /usr/bin/hadoop
     Hadoop config script                   /etc/default/hadoop
     Hadoop Configuration Files             /etc/hadoop/conf
     Hadoop Jar and Library Files           /usr/lib/hadoop
     Hadoop Log Files                       /var/log/hadoop
     Hadoop Man pages                       /usr/share/man
     Hadoop service scripts                 /etc/init.d/hadoop-*
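
One way to confirm that the packages landed in the directories listed above is a small check script. This is only a sketch: check_hadoop_layout and its optional root-prefix argument are hypothetical, and /usr/share/man and the /etc/init.d/hadoop-* glob are left out to keep the check simple.

```shell
#!/bin/sh
# Report which of the expected Hadoop paths exist.
# $1 is an optional root prefix (hypothetical, useful for trying the
# function against a scratch directory); empty means the real filesystem.
check_hadoop_layout() {
    root="${1:-}"
    missing=0
    for path in \
        /usr/bin/hadoop \
        /etc/default/hadoop \
        /etc/hadoop/conf \
        /usr/lib/hadoop \
        /var/log/hadoop
    do
        if [ -e "$root$path" ]; then
            echo "ok      $path"
        else
            echo "missing $path"
            missing=$((missing + 1))
        fi
    done
    # Return a non-zero status when at least one path is missing.
    [ "$missing" -eq 0 ]
}
```

Calling check_hadoop_layout with no argument checks the live system. The exit status makes it usable in a larger install script.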
    
Last modified on Jun 29, 2009, 6:42:42 PM