{{{ #!html
crawlzilla new version
v 1.0
}}}
[[PageOutline]]

= Goals =
 * Multi-user version
 * Updated web interface
 * New features such as scheduling
 * Upgrade nutch to version 1.2
 * Installation test mode from the svn repository
 * Slave installation can be guided through the web interface

= System Analysis =
== Directory Structure ==
 * /home/crawler/crawlzilla
|| Directory 1 || Directory 2 || Description ||
|| ./user/admin/ || ./IDB/XXX/meta || admin is the default folder; XXX is a newly added index database; meta holds the files related to each index database ||
|| || ./IDB/XXX/index~segments || index~segments are the five folders required by the lucene db ||
|| ./user/UUU/ || ./IDB/XXX/meta || UUU is the name of a newly added user; adding and removing users is managed from the console ||
|| || ./IDB/XXX/index~segments || Other settings are the same as for admin ||
|| ./workspace || || Working folder for hadoop computation ||
|| ./slave/ || || Files needed for slave installation ||
|| ./meta/ || || Intermediate files generated by dialog ||
|| ./meta/tmp/ || || Temporary files ||

 * /opt/crawlzilla/
|| Directory 1 || Directory 2 || Description ||
|| ./tomcat || ./webapps/UUU/XXX || Corresponds to index database XXX of user UUU ||
|| ./nutch || || nutch directory ||
|| ./main/ || || Holds the crawlzilla executables ||

 * /var/log/crawlzilla/
|| Directory 1 || Directory 2 || Description ||
|| ./hadoop-logs || || ||
|| ./hadoop-pids || || ||
|| ./shell-logs || || ||
|| ./tomcat-logs || || ||

== Old/New File and Directory Mapping ==
|| Old || ==> || New || Description ||
|| /home/crawler/crawlzilla/logs || ==> || Delete this link || ||
|| /home/crawler/crawlzilla/nutch || ==> || Delete this link || ||
|| /home/crawler/crawlzilla/source || ==> || /home/crawler/crawlzilla/slave || ||
|| /home/crawler/crawlzilla/archieve/_DBName_ || ==> || /home/crawler/crawlzilla/user/_DBName_ || ||
|| /home/crawler/crawlzilla/tmp || ==> || /home/crawler/crawlzilla/tmp || ||
|| /home/crawler/crawlzilla/urls || ==> || /home/crawler/crawlzilla/meta/urls || ||
|| /home/crawler/crawlzilla/.metadata/_DBName_ || ==> || /home/crawler/crawlzilla/user/_DBName_/metadata || ||
|| /home/crawler/crawlzilla/.menu_tmp || ==> || /home/crawler/crawlzilla/meta/.menu_tmp || ||
|| /home/crawler/crawlzilla/system/(executables) || ==> || /opt/crawlzilla/main/(executables) || ||

 * /home/crawler/crawlzilla/system:
|| Old || ==> || New || Description ||
|| hosts || ==> || /home/crawler/crawlzilla/meta/ || ||
|| hosts.old || ==> || /home/crawler/crawlzilla/meta/ || ||
|| hosts.bak || ==> || /home/crawler/crawlzilla/meta/ || ||
|| version || ==> || /opt/crawlzilla/version || ||
|| crawl_nodes || ==> || /home/crawler/crawlzilla/meta/ || ||
|| crawl_nodes.bak || ==> || /home/crawler/crawlzilla/meta/ || ||
|| crawl_nodes.old || ==> || /home/crawler/crawlzilla/meta/ || ||

== Environment Parameters ==
(The following are the old values.)
 * Crawlzilla_Install_PATH="/opt/crawlzilla"
 * Tomcat_HOME="/opt/crawlzilla/tomcat"
 * Crawlzilla_HOME="/home/crawler/crawlzilla"
 * Work_Path=$Crawlzilla_HOME/system
 * Manu_Tmp_Path="/home/crawler/crawlzilla/meta"
 * Hadoop_Daemon="/opt/crawlzilla/nutch/bin/hadoop-daemon.sh"
 * PID_Dir="/var/log/crawlzilla/hadoop-pids"
 * Crawl_Nodes=$Crawlzilla_HOME/meta/crawl_nodes

= Features =
== shell ==
 * Status
 * Computation settings
 * Quick setup
 * Web server settings
 * Multi-user account management
 * Language switching
 * Slave installation prompts
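
As an illustration of the old-to-new mapping tables in the System Analysis section, the following is a minimal sketch of how an existing installation might be moved to the new layout. The function name `migrate_layout` and its two arguments are hypothetical and are not part of crawlzilla; the defaults are taken from the Crawlzilla_HOME and Crawlzilla_Install_PATH values listed under Environment Parameters. Only rows with an unambiguous target are covered: the per-database rows (`_DBName_`) and the executables under system/ are left out because the page does not list them individually.

```shell
#!/bin/sh
# Hypothetical helper (not part of crawlzilla): move an old layout to the new one.
#   $1  Crawlzilla_HOME          (default /home/crawler/crawlzilla)
#   $2  Crawlzilla_Install_PATH  (default /opt/crawlzilla)
migrate_layout() {
    home=${1:-/home/crawler/crawlzilla}
    inst=${2:-/opt/crawlzilla}

    mkdir -p "$home/meta" "$inst"

    # logs and nutch were symbolic links in the old layout: delete them
    for link in logs nutch; do
        if [ -L "$home/$link" ]; then
            rm "$home/$link"
        fi
    done

    # source ==> slave (files needed for slave installation)
    if [ -d "$home/source" ] && [ ! -e "$home/slave" ]; then
        mv "$home/source" "$home/slave"
    fi

    # urls and .menu_tmp move under meta/
    for name in urls .menu_tmp; do
        if [ -e "$home/$name" ] && [ ! -e "$home/meta/$name" ]; then
            mv "$home/$name" "$home/meta/$name"
        fi
    done

    # host and node lists move from system/ to meta/
    for name in hosts hosts.old hosts.bak \
                crawl_nodes crawl_nodes.bak crawl_nodes.old; do
        if [ -f "$home/system/$name" ]; then
            mv "$home/system/$name" "$home/meta/$name"
        fi
    done

    # version moves to the install path
    if [ -f "$home/system/version" ]; then
        mv "$home/system/version" "$inst/version"
    fi
}
```

Passing both roots as arguments lets the sketch be tried on a scratch copy first, for example `migrate_layout /tmp/cz-test/home /tmp/cz-test/opt`, before running it with no arguments against the real paths.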