= ICAS Rewrite Progress =
 * No errors, no warnings, but no results show up in HBase either
   * reduce error: the for loop bound has to be changed to less than 19, otherwise an ArrayIndexOutOfBoundsException is thrown
   * The HBase table is empty: it looks like the HBase calls in the code are wrong
     * (Option 1) Rewrite the whole program in a simplified form, then add the full functionality back afterwards
     * (Option 2) The database part could perhaps be switched to Hive
       * (pros) Cloudera supports Hive
       * (pros) Hive can pull data with database query syntax, and the files are plain text, which is convenient for turning into charts
       * (cons) Need to find sample code for Hadoop & Hive
       * (cons) The database schema has to be redesigned
       * (cons) Many code changes

= nutch on hadoop.nchc.org.tw =
 * Try installing nutch on hadoop.cloudera.0.18.3

== Steps ==
My VM runs Ubuntu 8.04, so Distro=hardy
{{{
$ sudo su -
# echo "deb http://archive.cloudera.com/debian hardy contrib" > /etc/apt/sources.list.d/cloudera.list
# echo "deb-src http://archive.cloudera.com/debian hardy contrib" >> /etc/apt/sources.list.d/cloudera.list
# curl -s http://archive.cloudera.com/debian/archive.key | apt-key add -
# apt-get update
# apt-cache search hadoop
# exit
$ sudo apt-get -y install hadoop hadoop-conf-pseudo hadoop-namenode hadoop-secondarynamenode hadoop-datanode hadoop-jobtracker hadoop-tasktracker
}}}
 * Install directories
{{{
Hadoop wrapper script            /usr/bin/hadoop
Hadoop config script             /etc/default/hadoop
Hadoop Configuration Files       /etc/hadoop/conf
Hadoop Jar and Library Files     /usr/lib/hadoop
Hadoop Log Files                 /var/log/hadoop
Hadoop Man pages                 /usr/share/man
Hadoop service scripts           /etc/init.d/hadoop-*
}}}
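
 * A quick way to confirm that the packages landed where the list above says is to test each path. The following is only a minimal sketch: the function name `check_hadoop_paths` and its optional root-prefix argument are hypothetical helpers made up for this note, not part of the Cloudera packages. The paths are copied from the list above.

```shell
#!/bin/sh
# check_hadoop_paths [ROOT]
# Hypothetical helper: reports whether each install path from the list
# above exists. ROOT is an optional prefix (an assumption, useful for
# trying the function against a scratch directory); leave it empty to
# check the real system.
check_hadoop_paths() {
    root="${1:-}"
    missing=0
    for p in /usr/bin/hadoop /etc/default/hadoop /etc/hadoop/conf \
             /usr/lib/hadoop /var/log/hadoop /usr/share/man; do
        if [ -e "$root$p" ]; then
            echo "ok $p"
        else
            echo "missing $p"
            missing=$((missing + 1))
        fi
    done
    # The service scripts are a glob; if nothing matches, the pattern
    # stays literal and the -e test below fails.
    set -- "$root"/etc/init.d/hadoop-*
    if [ -e "$1" ]; then
        echo "ok /etc/init.d/hadoop-*"
    else
        echo "missing /etc/init.d/hadoop-*"
        missing=$((missing + 1))
    fi
    # Return success only when nothing is missing.
    [ "$missing" -eq 0 ]
}
```

 Running `check_hadoop_paths` with no argument checks the live system. A non-zero exit status means at least one path from the list was not found.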