NutchEZ Installation Procedure
Assumptions
JAVA_HOME | /usr/lib/jvm/java-6-sun |
User | nutchuser |
NutchEZ install path | /opt/nutchEZ |
Tomcat install path | /opt/nutchEZ/tomcat |
Source file path | /opt/ |
Three required files | install shell script, nutch-1.0.tar.gz, apache-tomcat-6.0.18.tar.gz |
Start the Installation
Prompt for user information and other settings
- Admin e-mail
- DNS name
- Master IP (set by the program)
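The prompts above could be collected with a small helper like the one below. This is a minimal sketch, not the installer's actual code: the function name `ask` and the default values in the usage comments are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print a question on stderr, read one line from stdin,
# and fall back to a default when the answer is empty.
ask() {
    prompt=$1
    default=$2
    printf '%s [%s]: ' "$prompt" "$default" >&2
    IFS= read -r answer || answer=""
    printf '%s\n' "${answer:-$default}"
}

# Example use inside an installer (commented out so that sourcing this
# file does not block waiting for input):
# ADMIN_EMAIL=$(ask "Admin e-mail" "admin@example.com")
# MASTER_DNS=$(ask "DNS name" "$(hostname)")
```

Writing the prompt to stderr keeps stdout clean, so the answer can be captured with command substitution.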
Install Nutch
Extract the archive, rename the directory, and change the owner
- tar zxvf nutch-1.0.tar.gz
- mv nutch-1.0 nutchEZ
- chown -R nutchuser:nutchuser /opt/nutchEZ
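The extract, rename, and chown steps can be wrapped in one reusable function with basic error checks. The sketch below is an illustration only; the name `unpack_as` and its argument order are hypothetical, and the chown step needs root privileges.

```shell
#!/bin/sh
# Hypothetical helper: unpack <archive> into <parent>, rename the top-level
# directory <old> to <new>, and optionally change its owner.
unpack_as() {
    archive=$1
    parent=$2
    old=$3
    new=$4
    owner=${5:-}
    [ -f "$archive" ] || { echo "missing archive: $archive" >&2; return 1; }
    [ ! -e "$parent/$new" ] || { echo "already exists: $parent/$new" >&2; return 1; }
    tar -xzf "$archive" -C "$parent" || return 1
    mv "$parent/$old" "$parent/$new" || return 1
    # chown is skipped when no owner is given (for example in a dry run).
    if [ -n "$owner" ]; then
        chown -R "$owner" "$parent/$new" || return 1
    fi
}

# Usage matching the steps above (run as root from /opt):
# unpack_as nutch-1.0.tar.gz /opt nutch-1.0 nutchEZ nutchuser:nutchuser
```

Refusing to run when the target directory already exists avoids `mv` silently nesting the new tree inside an earlier install.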
Write the settings to the configuration files
- hadoop-env.sh
- hadoop-site.xml($MasterDNS)
- nutch-site.xml($Admin)
- slaves (the cluster client_install must modify this file)
- crawl-urlfilter.txt (crawl rules)
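As an illustration of how `$MasterDNS` might be written into `hadoop-site.xml` and `slaves`, here is a minimal sketch. The source does not list the properties the installer sets; `fs.default.name` and `mapred.job.tracker` are standard Hadoop properties of that era, while the function name and the ports 9000 and 9001 are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: write a single-node hadoop-site.xml and slaves file
# into <conf_dir>, pointing both daemons at <master_dns>.
write_cluster_conf() {
    conf_dir=$1
    master_dns=$2
    # Ports 9000/9001 are illustrative choices, not taken from the source.
    cat > "$conf_dir/hadoop-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://$master_dns:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>$master_dns:9001</value>
  </property>
</configuration>
EOF
    # On a fresh single-node install the master is also the only slave.
    printf '%s\n' "$master_dns" > "$conf_dir/slaves"
}

# Usage (path assumes the layout described in the assumptions table):
# write_cluster_conf /opt/nutchEZ/conf "$MasterDNS"
```

Both files are overwritten, so this only suits a first-time install.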
Start Nutch
- Format HDFS
- Start up Nutch
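The two startup steps can be sketched as follows, assuming the Hadoop scripts bundled in the Nutch 1.0 `bin/` directory (`bin/hadoop` and `bin/start-all.sh`). The wrapper name `start_nutch` is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: format HDFS, then start the Hadoop daemons that
# Nutch runs on. The body runs in a subshell so the caller's working
# directory is left unchanged.
start_nutch() (
    nutch_home=$1
    cd "$nutch_home" || exit 1
    [ -x bin/hadoop ] || { echo "bin/hadoop not found in $nutch_home" >&2; exit 1; }
    # Formatting wipes HDFS metadata, so this belongs to a first install only.
    bin/hadoop namenode -format || exit 1
    bin/start-all.sh
)

# Usage (run as nutchuser):
# start_nutch /opt/nutchEZ
```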
Install Tomcat
Extract the archive, rename the directory, and change the owner
- tar zxvf apache-tomcat-6.0.18.tar.gz -C /opt/nutchEZ/
- mv /opt/nutchEZ/apache-tomcat-6.0.18 /opt/nutchEZ/tomcat
- chown -R nutchuser:nutchuser /opt/nutchEZ/
Environment setup
$ cd /opt/nutchEZ
$ mkdir web
$ cd web
$ jar -xvf ../nutch-1.0.war
$ rm ../nutch-1.0.war
$ mv /opt/nutchEZ/tomcat/webapps/ROOT /opt/nutchEZ/tomcat/webapps/ROOT-ori
$ cd /opt/nutchEZ
$ mv /opt/nutchEZ/web /opt/nutchEZ/tomcat/webapps/ROOT
$ mkdir /opt/nutchEZ/search
Edit the configuration files
/opt/nutchEZ/tomcat/conf/server.xml
/opt/nutchEZ/tomcat/webapps/ROOT/WEB-INF/classes/nutch-site.xml
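The source does not say what is changed in these files. One plausible edit, shown here purely as an assumption, is pointing the web app's `searcher.dir` property (a standard Nutch setting) at the search directory created earlier. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write a nutch-site.xml for the web app that tells
# the Nutch searcher where to find the crawl results.
# Note: this overwrites any existing nutch-site.xml in <classes_dir>.
write_search_conf() {
    classes_dir=$1
    search_dir=$2
    cat > "$classes_dir/nutch-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>searcher.dir</name>
    <value>$search_dir</value>
  </property>
</configuration>
EOF
}

# Usage (paths follow the layout described in this page):
# write_search_conf /opt/nutchEZ/tomcat/webapps/ROOT/WEB-INF/classes /opt/nutchEZ/search
```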
Start Tomcat
Run phase
- Crawl
- Move the files
- Restart Tomcat
Testing
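The three run-phase steps might look like the sketch below. It assumes the Nutch 1.0 `bin/nutch crawl` command, that "moving the files" means copying the crawl output from HDFS to a local directory with `bin/hadoop dfs -get`, and Tomcat's stock `shutdown.sh` and `startup.sh` scripts. The function name, the HDFS output name `crawl`, and the default depth are hypothetical, and the source does not state exactly where the files must land.

```shell
#!/bin/sh
# Hypothetical wrapper for the run phase. The body runs in a subshell so
# the caller's working directory is left unchanged.
crawl_and_publish() (
    nutch_home=$1   # e.g. /opt/nutchEZ
    url_dir=$2      # HDFS directory holding the seed URL list
    dest=$3         # local directory the web app reads from
    depth=${4:-3}   # crawl depth; 3 is an arbitrary default
    cd "$nutch_home" || exit 1
    # 1. Crawl: results are written to the HDFS directory "crawl".
    bin/nutch crawl "$url_dir" -dir crawl -depth "$depth" || exit 1
    # 2. Move the files: copy the crawl output out of HDFS.
    bin/hadoop dfs -get crawl "$dest" || exit 1
    # 3. Restart Tomcat; a failed shutdown (not running) is tolerated.
    tomcat/bin/shutdown.sh || true
    tomcat/bin/startup.sh
)

# Usage (run as nutchuser):
# crawl_and_publish /opt/nutchEZ urls /opt/nutchEZ/search
```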
WikiInclude(shunfa/2010/0524_NutchEZ_InstallTest)?
Last modified on May 28, 2010, 1:32:36 PM