- Timestamp: May 24, 2010, 11:38:40 AM (16 years ago)
- Author: waue
- Comment: --
| v15 | v16 | |
| 26 | 26 | |
| 27 | 27 | == 5th (5/28) == |
| 28 | | * Test the Master & Slave installers |
| 29 | | |
| | 28 | === rock === |
| | 29 | Slave installer demo |
| | 30 | === fafa === |
| | 31 | Master installer demo |
| | 32 | === waue === |
| | 33 | * Directory structure |
| 30 | 34 | || Description || Path || Owner || |
| 31 | 35 | || nutchez home directory || /opt/nutchez/ || nutchuser || |
| … | … | |
| 37 | 41 | || nutchez user directory || /home/nutchuser/nutchez/ || nutchuser || |
| 38 | 42 | || nutchez index database || /home/nutchuser/nutchez/search/ || generated by nutch after the crawl completes || |
| 39 | | |
| 40 | | * Modify hadoop-site.xml in /opt/nutchez/nutch/conf/ |
| | 43 | * [http://trac.nchc.org.tw/cloud/export/124/nutchez-0.2/package/nutchez-0.2-20100524.tar.gz Download the archive of the modified cluster version of nutch] |
| | 44 | * Configure the cluster operation mode (hadoop-site.xml in /opt/nutchez/nutch/conf/) |
| 41 | 45 | {{{ |
| 42 | 46 | #!xml |
| … | … | |
| 44 | 48 | <property> |
| 45 | 49 | <name>fs.default.name</name> |
| 46 | | <value>hdfs://secuse.nchc.org.tw:9000</value> |
| | 50 | <value>hdfs://localhost:9000</value> |
| 47 | 51 | </property> |
| 48 | 52 | <property> |
| 49 | 53 | <name>mapred.job.tracker</name> |
| 50 | | <value>secuse.nchc.org.tw:9001</value> |
| | 54 | <value>localhost:9001</value> |
| 51 | 55 | </property> |
| 52 | 56 | <property> |
| … | … | |
| 56 | 60 | </configuration> |
| 57 | 61 | }}} |
| 58 | | |
| 59 | | * Change the tomcat port => server.xml in /opt/nutchez/tomcat/conf/ |
| 60 | | |
| 61 | | {{{ |
| 62 | | #!xml |
| 63 | | <Connector port="8080" protocol="HTTP/1.1" |
| 64 | | connectionTimeout="20000" |
| 65 | | redirectPort="8443" URIEncoding="UTF-8" |
| 66 | | useBodyEncodingForURI="true" /> |
| 67 | | }}} |
| 68 | | |
| 69 | | * Final search results => nutch-site.xml in /opt/nutchez/tomcat/webapps/ROOT/WEB-INF/classes/ |
| 70 | | |
| | 62 | * Configure the final search results (nutch-site.xml in /opt/nutchez/tomcat/webapps/ROOT/WEB-INF/classes/) |
| 71 | 63 | {{{ |
| 72 | 64 | #!xml |
| … | … | |
| 78 | 70 | </configuration> |
| 79 | 71 | }}} |
| 80 | | |
| 81 | | * The /opt/nutchez/nutch/bin/nutch executable has been modified |
| 82 | | |
| | 72 | * Make the main nutch program import the environment variables (edit /opt/nutchez/nutch/bin/nutch) |
| 83 | 73 | {{{ |
| 84 | 74 | #!sh |
| … | … | |
| 87 | 77 | NUTCH_LOG_DIR=/var/nutchez/logs |
| 88 | 78 | }}} |
| 89 | | |
| 90 | | * With the modified nutchez, hadoop still needs format and start-all.sh to be run |
| 91 | | |
| 92 | | |
| | 79 | * Before using the cluster version of nutch, first run hadoop format and start-all.sh |
| 93 | 80 | |
| 94 | 81 | = [wiki:waue/2010/nutchez2_archi 2. System Architecture (edit)] = |