Description | Path | Owner
nutchez home directory | /opt/nutchez/ | nutchuser
nutch home directory | /opt/nutchez/nutch | nutchuser
nutch working directory | /var/nutchez/nutch-nutchuser | nutchuser
nutch log files | /var/nutchez/logs | nutchuser
nutch configuration files | /opt/nutchez/nutch/conf | nutchuser
tomcat home directory | /opt/nutchez/tomcat | nutchuser
nutchez user directory | /home/nutchuser/nutchez/ | nutchuser
nutchez index database | /home/nutchuser/nutchez/search/ | created by nutch once the crawl completes
- Edit hadoop-site.xml in /opt/nutchez/nutch/conf/
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://secuse.nchc.org.tw:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>secuse.nchc.org.tw:9001</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/var/nutchez/nutch-nutchuser</value>
  </property>
</configuration>
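A quick way to double-check which values a Hadoop-style configuration file sets is to parse it and print the name/value pairs. The following is only a minimal sketch in Python, assuming the `<configuration>`/`<property>` layout shown in these notes. `SAMPLE` and `read_properties` are illustrative names and are not part of nutchez, nutch, or Hadoop.

```python
import xml.etree.ElementTree as ET

# Copy of the hadoop-site.xml content from these notes, used as sample input.
SAMPLE = """<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://secuse.nchc.org.tw:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>secuse.nchc.org.tw:9001</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/var/nutchez/nutch-nutchuser</value>
  </property>
</configuration>"""


def read_properties(xml_text):
    """Return {name: value} for every <property> directly under the root."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            # A missing <value> is treated as an empty string.
            props[name.strip()] = (value or "").strip()
    return props


if __name__ == "__main__":
    for name, value in read_properties(SAMPLE).items():
        print(f"{name} = {value}")
```

To check a real file, read its text with `open(path).read()` and pass it to `read_properties`. The same function works for any file with this layout, nutch-site.xml included.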
- Change the tomcat port => server.xml in /opt/nutchez/tomcat/conf/
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443" URIEncoding="UTF-8"
           useBodyEncodingForURI="true" />
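After editing the connector, it can be useful to confirm which HTTP port server.xml now declares. The sketch below is an illustration only. It assumes the usual Tomcat nesting of `<Connector>` inside `<Service>` inside `<Server>`, which these notes do not show, and `SERVER_XML` and `http_connector_ports` are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Assumed wrapper elements around the Connector from these notes.
SERVER_XML = """<Server port="8005" shutdown="SHUTDOWN">
  <Service name="Catalina">
    <Connector port="8080" protocol="HTTP/1.1"
               connectionTimeout="20000"
               redirectPort="8443" URIEncoding="UTF-8"
               useBodyEncodingForURI="true" />
  </Service>
</Server>"""


def http_connector_ports(xml_text):
    """Return the ports of all HTTP/1.1 connectors found in the document."""
    root = ET.fromstring(xml_text)
    ports = []
    for conn in root.iter("Connector"):
        # Connectors without a protocol attribute are treated as HTTP/1.1 here.
        if conn.get("protocol", "HTTP/1.1") == "HTTP/1.1" and conn.get("port"):
            ports.append(int(conn.get("port")))
    return ports


if __name__ == "__main__":
    print(http_connector_ports(SERVER_XML))
```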
- Where the final search results are read from => nutch-site.xml in /opt/nutchez/tomcat/webapps/ROOT/WEB-INF/classes/
<configuration>
  <property>
    <name>searcher.dir</name>
    <value>/home/nutchuser/nutchez/search</value>
  </property>
</configuration>
- The /opt/nutchez/nutch/bin/nutch launcher script has been modified
NUTCH_HOME=/opt/nutchez/nutch
NUTCH_CONF_DIR=/opt/nutchez/nutch/conf
NUTCH_LOG_DIR=/var/nutchez/logs
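
Because the launcher script and the configuration files all depend on the directory layout listed in these notes, a small check that each directory exists can catch a broken install early. This is a rough sketch, not part of nutchez. `LAYOUT` copies the paths from the table, and `missing_dirs` is a hypothetical helper.

```python
import os

# Directories taken from the table in these notes. The search index is left
# out because it only appears after a crawl has finished.
LAYOUT = {
    "nutchez home": "/opt/nutchez/",
    "nutch home (NUTCH_HOME)": "/opt/nutchez/nutch",
    "nutch conf (NUTCH_CONF_DIR)": "/opt/nutchez/nutch/conf",
    "nutch logs (NUTCH_LOG_DIR)": "/var/nutchez/logs",
    "nutch working dir": "/var/nutchez/nutch-nutchuser",
    "tomcat home": "/opt/nutchez/tomcat",
    "nutchez user dir": "/home/nutchuser/nutchez/",
}


def missing_dirs(layout):
    """Return the labels whose path is not an existing directory."""
    return [label for label, path in layout.items() if not os.path.isdir(path)]


if __name__ == "__main__":
    missing = missing_dirs(LAYOUT)
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("all directories present")
```

The check only looks at existence. Ownership by nutchuser would need a separate test, for example with `os.stat`.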