| 60 | |
| 61 | |
| 62 | = Ubuntu 10.04 (1 CPU, 1 GB RAM) in VirtualBox = |
| 63 | |
| 64 | == Crawl tests at depths 3 through 10: results == |
| 65 | |
| 66 | Sites crawled: |
| 67 | * http://www.nchc.org.tw/tw/ |
| 68 | * http://www.nchc.org.tw/en/ |
| 69 | |
| 70 | ||Depth||Elapsed time||Result|| |
| 71 | ||3||1h:31m:00s||Finished|| |
| 72 | ||4||2h:48m:22s||Finished|| |
| 73 | ||5||more than 12 hours||Unfinished|| |
| 74 | |
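| 75 | Execution order: |
| 76 | * After crawling to depth 3, ran "echo 1>/proc/sys/vm/drop_caches" to clear cached memory |
| 77 | * After crawling to depth 4, ran "echo 1>/proc/sys/vm/drop_caches" to clear cached memory |
| 78 | * While crawling to depth 5, the system stalled at the progress shown below for more than 12 hours. There were no failed jobs, and the log files contained no other error messages. Job status: |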
| 79 | ||Jobid||Priority||User||Name||Map % Complete||Reduce % Complete|| |
| 80 | ||job_201009231019_0059||NORMAL||crawler||fetch NCHC_5/segments/20100923162900||100.00%||0.00%|| |
| 81 | * go.sh was still running: |
| 82 | * crawler 30226 0.0 0.1 4332 1124 ? S Sep23 0:00 /bin/bash /home/crawler/crawlzilla/system/go.sh 5 NCHC_5 |
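
The cache-clearing step between crawls can be sketched as a small shell helper. This is a minimal sketch, not part of crawlzilla: the names `drop_page_cache` and `DROP_CACHES_PATH` are hypothetical, and the helper must run as root to write to the real /proc file. One detail worth checking when reproducing the test: written without a space, `echo 1>/proc/sys/vm/drop_caches` is parsed by the shell as `echo` with stdout (file descriptor 1) redirected, so only an empty line is written. The sketch uses the spaced form, which writes the value 1 (free the page cache).

```shell
#!/bin/bash
# Sketch only: DROP_CACHES_PATH and drop_page_cache are hypothetical names,
# introduced here for illustration. The path can be overridden for testing.
DROP_CACHES_PATH="${DROP_CACHES_PATH:-/proc/sys/vm/drop_caches}"

drop_page_cache() {
    # Flush dirty pages to disk first so that more of the cache is clean
    # and can actually be released.
    sync
    # The space after "1" matters: "echo 1>file" redirects fd 1 and writes
    # only an empty line, while "echo 1 > file" writes the value "1".
    echo 1 > "$DROP_CACHES_PATH"
}
```

Calling `drop_page_cache` after each go.sh run finishes would match the execution order listed above, assuming the crawls are started one after another from the same root shell.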