Changes between Version 9 and Version 10 of waue/2009/nutch_install


Timestamp: Apr 24, 2009, 6:12:05 PM
Author: waue
Comment: --

  • waue/2009/nutch_install

    v9 v10  
    47 47  == 2.2 Deploy the hadoop and nutch directory structure ==
    48 48  {{{
    49     $ cp -rf hadoop/* nutch
       49  $ cp -rf /opt/hadoop/* /opt/nutch
       50  }}}
       51
       52  == 2.3 Copy the library files ==
       53  {{{
    50 54  $ cd nutch
       55  $ cp -rf *.jar lib/
    51 56  }}}
    52 57
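The two v10 steps above (2.2 and 2.3) can be sketched as a single shell function. This is only an illustration, not part of the wiki page: the function name `deploy_nutch`, its two arguments, the existence checks, and the `mkdir -p lib` line are all assumptions added for safety.

```shell
# deploy_nutch HADOOP_DIR NUTCH_DIR
# Hypothetical helper (not from the wiki page) wrapping steps 2.2 and 2.3.
deploy_nutch() {
    hadoop_dir=$1
    nutch_dir=$2

    # Fail early if either tree is missing, instead of copying into the wrong place.
    [ -d "$hadoop_dir" ] || { echo "missing: $hadoop_dir" >&2; return 1; }
    [ -d "$nutch_dir" ]  || { echo "missing: $nutch_dir" >&2; return 1; }

    # Step 2.2: overlay the hadoop tree onto the nutch tree.
    cp -rf "$hadoop_dir"/* "$nutch_dir" || return 1

    # Step 2.3: copy the top-level jars into lib/. A subshell keeps the
    # caller's working directory unchanged.
    (
        cd "$nutch_dir" || exit 1
        mkdir -p lib          # assumption: harmless if lib/ already exists
        cp -rf ./*.jar lib/
    )
}
```

With the paths used in v10, the call would be `deploy_nutch /opt/hadoop /opt/nutch`.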
     
    73 78
    74 79
    75     == 3.3 conf/nutch-site.xml ==
       80  == 3.2 conf/nutch-site.xml ==
    76 81   * An important configuration file; the required settings have been added to it. For details on the other parameters, see nutch-default.xml.
    77 82  {{{
     
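    150 155  }}}
    151 156
    152      == 3.5 crawl-urlfilter.txt ==
        157  == 3.3 crawl-urlfilter.txt ==
    153 158   * Re-edit the crawl rules. This file matters because if it is misconfigured, the crawl returns almost nothing, which means your search engine will end up finding no data.
    154 159  {{{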