- Timestamp: Jun 9, 2009, 1:24:29 PM
- Author: waue
- Comment: --
v1 | v2 | |
5 | 5 | - After reading through the official Nutch website in full, I hope to find better ideas and ways to improve. |
6 | 6 | |
| 7 | == More commands == |
| 8 | |
| 9 | === readdb === |
| 10 | |
| 11 | {{{ |
| 12 | $ nutch readdb /tmp/search/crawldb -stats |
| 13 | |
| 14 | 09/06/09 12:18:13 INFO mapred.MapTask: data buffer = 79691776/99614720 |
| 15 | |
| 16 | 09/06/09 12:18:13 INFO mapred.MapTask: record buffer = 262144/327680 |
| 17 | |
| 18 | 09/06/09 12:18:14 INFO crawl.CrawlDbReader: TOTAL urls: 1072 |
| 19 | 09/06/09 12:18:14 INFO crawl.CrawlDbReader: status 1 (db_unfetched): 1002 |
| 20 | |
| 21 | 09/06/09 12:18:14 INFO crawl.CrawlDbReader: status 2 (db_fetched): 68 |
| 22 | |
| 23 | }}} |
| 24 | === convdb === |
| 25 | |
| 26 | === === |
| 27 | |
| 28 | === === |
| 29 | |
| 30 | === === |
| 31 | |
| 32 | === === |
| 33 | |
| 34 | === === |
| 35 | |
| 36 | === === |
| 37 | |
7 | 38 | == Notes == |
8 | 39 | |
9 | | - Crawlers other than Nutch
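
The `readdb -stats` output shown in the diff above can be turned into numbers for later comparison between crawls. The following is only a minimal sketch, assuming the log lines keep the `crawl.CrawlDbReader: ...` layout seen in that output; the function name `parse_readdb_stats` and the `SAMPLE` constant are hypothetical names introduced for illustration and are not part of Nutch.

```python
import re

# Assumed line layout, copied from the readdb output in the diff, e.g.:
#   "... INFO crawl.CrawlDbReader: status 1 (db_unfetched): 1002"
STATUS_RE = re.compile(r"CrawlDbReader: status \d+ \((\w+)\):\s*(\d+)")
TOTAL_RE = re.compile(r"CrawlDbReader: TOTAL urls:\s*(\d+)")


def parse_readdb_stats(log_text):
    """Collect the URL counts reported by `nutch readdb ... -stats`.

    Hypothetical helper: it only reads text, it does not call Nutch.
    """
    stats = {}
    for line in log_text.splitlines():
        total = TOTAL_RE.search(line)
        if total:
            stats["total"] = int(total.group(1))
            continue
        status = STATUS_RE.search(line)
        if status:
            # Key by the status name in parentheses, e.g. "db_fetched".
            stats[status.group(1)] = int(status.group(2))
    return stats


# Sample lines taken from the readdb output recorded in the diff.
SAMPLE = """\
09/06/09 12:18:14 INFO crawl.CrawlDbReader: TOTAL urls: 1072
09/06/09 12:18:14 INFO crawl.CrawlDbReader: status 1 (db_unfetched): 1002
09/06/09 12:18:14 INFO crawl.CrawlDbReader: status 2 (db_fetched): 68
"""

if __name__ == "__main__":
    print(parse_readdb_stats(SAMPLE))
```

In practice the text would come from the captured console output of the `nutch readdb` run rather than from a hard-coded string. Lines that do not match either pattern, such as the `mapred.MapTask` buffer messages, are simply ignored.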