Version 1 (modified by waue, 17 years ago)

The Complete Nutch Guide

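Preface

  • I have tested Nutch before, and many people online have shared their own successful setups. This article, however, focuses on the following:
    • Installing Nutch completely and fixing the garbled Chinese text problem
    • Setting up Nutch from a Hadoop point of view
    • Building a search engine that not only finds content inside web pages but also crawls the files linked from them (such as PDF and MS Word)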

Environment