Changes between Version 5 and Version 6 of jazz/crawlzilla-dev


Timestamp:
Sep 14, 2012, 11:24:41 PM (12 years ago)
Author:
jazz
Comment:

--

  • jazz/crawlzilla-dev

    v5 v6  
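     5  5    * Upgrade/uninstall -> how should existing data be preserved or migrated so that it carries over?! (Stateless)
     6  6    * While a recrawl is running, the original CrawlDB must be kept; overwrite it only after the recrawl completes.
        7    * Does the Fix Job flow forget to delete the crawldb on HDFS?
        8 {{{
        9 crawler@CrawlzillaServ:~$ /opt/crawlzilla/nutch/bin/hadoop fs -lsr jazz
       10 drwxr-xr-x   - crawler supergroup          0 2012-09-14 21:59 /user/crawler/jazz
       11 drwxr-xr-x   - crawler supergroup          0 2012-09-14 21:59 /user/crawler/jazz/wang
       12 drwxr-xr-x   - crawler supergroup          0 2012-09-14 21:59 /user/crawler/jazz/wang/crawldb
       13 }}}
     7 14  * Ideas:
     8 15    * Packaging (separate the Nutch, Lucene, and Hadoop parts) - default to the standalone (single-machine) version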
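
The recrawl note above (keep the original CrawlDB and overwrite it only once the recrawl has finished) could be handled roughly as follows. This is only a sketch, not Crawlzilla's actual job flow: the function name `swap_crawldb` and the `crawldb.new` / `crawldb.old` directory names are hypothetical, and it assumes the recrawl writes its output to `crawldb.new` next to the live `crawldb`. It relies only on the `hadoop fs -test`, `-rmr`, and `-mv` shell commands.

```shell
#!/bin/sh
# Path to the hadoop launcher seen in the transcript above; override via $HADOOP.
HADOOP="${HADOOP:-/opt/crawlzilla/nutch/bin/hadoop}"

# swap_crawldb BASE
#   BASE is an HDFS directory such as jazz/wang.
#   Assumption (hypothetical layout): the finished recrawl left its
#   result in BASE/crawldb.new while BASE/crawldb stayed untouched.
swap_crawldb() {
    base="$1"

    # Refuse to touch the live crawldb unless the new one exists.
    "$HADOOP" fs -test -d "$base/crawldb.new" || return 1

    # Drop any backup left over from a previous swap (ignore "not found").
    "$HADOOP" fs -rmr "$base/crawldb.old" 2>/dev/null

    # Move the live crawldb aside, then promote the new one.
    # If the second move fails, the previous data is still in crawldb.old.
    "$HADOOP" fs -mv "$base/crawldb" "$base/crawldb.old" &&
    "$HADOOP" fs -mv "$base/crawldb.new" "$base/crawldb"
}
```

For example, `swap_crawldb jazz/wang` would be called only after the recrawl job reports success. Keeping `crawldb.old` until the next swap also leaves one copy to fall back on.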