- Timestamp: Jun 9, 2009, 5:01:16 PM (15 years ago)
- Author: waue
- Comment: --
v7 | v8 | |
38 | 38 | Signature: ce0202bbd593b09b86ce8a9aa991b321 |
39 | 39 | Metadata: _pst_: success(1), lastModified=0 |
| 40 | |
| 41 | $ nutch readdb /tmp/search/crawldb/ -url http://www.nchc.org.tw |
| 42 | URL: http://www.nchc.org.tw |
| 43 | |
| 44 | not found |
| 45 | |
40 | 46 | }}} |
41 | 47 | - -topN <nnnn> <out_dir> [<min>] dump top <nnnn> urls sorted by score to <out_dir> |
… | … | |
56 | 62 | - Usage: !SegmentReader (-dump ... | -list ... | -get ...) [general options] |
57 | 63 | - !SegmentReader -dump <segment_dir> <output> [general options] |
| 64 | {{{ |
| 65 | $ nutch readseg -dump /tmp/search/segments/20090609143444/ ./dump/ |
| 66 | $ vim ./dump/dump |
| 67 | }}} |
58 | 68 | - !SegmentReader -list (<segment_dir1> ... | -dir <segments>) [general options] |
| 69 | {{{ |
| 70 | $ nutch readseg -list /tmp/search/segments/20090609143444/ |
| 71 | |
| 72 | NAME GENERATED FETCHER START FETCHER END FETCHED PARSED |
| 73 | |
| 74 | 20090609143444 1 2009-06-09T14:34:48 2009-06-09T14:34:48 1 1 |
| 75 | |
| 76 | }}} |
59 | 77 | - !SegmentReader -get <segment_dir> <keyValue> [general options] |
60 | 78 | {{{ |
61 | | |
| 79 | $ nutch readseg -get /tmp/search/segments/20090609143444/ http://bioinfo.nchc.org.tw/ |
62 | 80 | }}} |
63 | 81 | === updatedb === |