Changes between Version 5 and Version 6 of jazz/crawlzilla-dev
Timestamp: Sep 14, 2012, 11:24:41 PM
jazz/crawlzilla-dev
v5  v6
5   5   * Upgrade/uninstall -> how should existing data be preserved or migrated forward?! (Stateless)
6   6   * While a recrawl is running, the original CrawlDB must be kept; overwrite it only after the recrawl completes.
    7   * Did the Fix Job flow forget to delete the crawldb on HDFS?
    8   {{{
    9   crawler@CrawlzillaServ:~$ /opt/crawlzilla/nutch/bin/hadoop fs -lsr jazz
    10  drwxr-xr-x - crawler supergroup 0 2012-09-14 21:59 /user/crawler/jazz
    11  drwxr-xr-x - crawler supergroup 0 2012-09-14 21:59 /user/crawler/jazz/wang
    12  drwxr-xr-x - crawler supergroup 0 2012-09-14 21:59 /user/crawler/jazz/wang/crawldb
    13  }}}
7   14  * Ideas:
8   15  * Packaging (separate out the Nutch, Lucene, and Hadoop parts) - default to the standalone (single-machine) version
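
The recrawl note above (keep the original CrawlDB until the recrawl completes, then overwrite it) could be sketched roughly as follows. This is only an illustration, not Crawlzilla's actual job flow: the function name promote_crawldb and the crawldb.new / crawldb.old directory names are hypothetical, and it assumes the recrawl wrote its result next to the live crawldb on HDFS. It uses only the hadoop fs subcommands -test, -mv, and -rmr, which matches the older Hadoop release implied by the -lsr call in the listing.

```shell
#!/bin/sh
# Sketch only: promote a freshly built CrawlDB after a recrawl has finished.
# HADOOP defaults to the binary path used in the listing above.
# The ".new" and ".old" suffixes are hypothetical naming choices.
HADOOP="${HADOOP:-/opt/crawlzilla/nutch/bin/hadoop}"

# promote_crawldb DIR
#   DIR/crawldb.new : output of the finished recrawl (assumed location)
#   DIR/crawldb     : live CrawlDB, left untouched until the swap
promote_crawldb() {
    dir="$1"
    had_old=0

    # Without a finished recrawl result, keep the original CrawlDB as-is.
    if ! $HADOOP fs -test -e "$dir/crawldb.new"; then
        echo "no $dir/crawldb.new found; keeping the original crawldb" >&2
        return 1
    fi

    # Move the live CrawlDB aside rather than deleting it first.
    if $HADOOP fs -test -e "$dir/crawldb"; then
        $HADOOP fs -mv "$dir/crawldb" "$dir/crawldb.old" || return 1
        had_old=1
    fi

    # Put the new CrawlDB in place.
    $HADOOP fs -mv "$dir/crawldb.new" "$dir/crawldb" || return 1

    # Remove the backup only after the swap succeeded, so no stale
    # crawldb directory is left behind on HDFS.
    if [ "$had_old" -eq 1 ]; then
        $HADOOP fs -rmr "$dir/crawldb.old" || return 1
    fi
    return 0
}
```

For the directory in the listing, the call would be promote_crawldb jazz/wang, run as the crawler user. Moving the old CrawlDB aside instead of deleting it up front means a failed swap still leaves a recoverable copy; a leftover crawldb.old from an earlier failed run would need to be cleaned up by hand first.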