Changes between Version 3 and Version 4 of waue/2010/1029


Timestamp: Oct 29, 2010, 3:26:50 PM
Author: waue

  • waue/2010/1029

    v3  v4
    32  32  {{{
    33  33  cd /opt/crawlzilla/nutch
    34
    35  34  }}}
    36  35
    ...
    66  65   * dedup 2: content by hash     100.00%
    67  66   * dedup 3: delete from index(es)
        67
        68  {{{
        69  #java
        70  Usage: DeleteDuplicates <indexes> ...
        71  }}}
        72
    68  73  {{{
    69  74  /opt/crawlzilla/nutch/bin/nutch dedup /user/crawler/cw_yahoo_5/index