Ubuntu 10.04 (CPU 1, RAM 2G) in VM
- Ran 7 crawl jobs concurrently (memory usage 1.2G, CPU usage 80~85%)
- NCHC official site, tw and en sections (depth 6)
- NCHC intranet (depth 3)
- Crawlzilla official site on Google (depth 8)
- Crawlzilla official site on SourceForge (depth 8)
- trac grid (depth 8)
- trac cloud (depth 10)
- hadoop forum (depth 10)
Jazz suggested this is probably because Hadoop's default heap is 1G, so with 2G of RAM it runs normally.
Subsequent tests therefore used less RAM together with a modified Hadoop heap parameter, sketched below.
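A minimal sketch of where that heap parameter lives, assuming the stock conf/hadoop-env.sh of a Hadoop 0.20-era install (exact paths and the restart step may differ per setup):
{{{
# conf/hadoop-env.sh
# HADOOP_HEAPSIZE is the maximum heap, in MB, passed to each Hadoop daemon
# as -Xmx; it defaults to 1000 (the ~1G default Jazz mentioned) when unset.
export HADOOP_HEAPSIZE=512    # 512 or 256 for the low-RAM tests below

# Restart the daemons so the new heap size takes effect.
bin/stop-all.sh && bin/start-all.sh
}}}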
- Ubuntu 10.04 (CPU 1, RAM 512M) (Hadoop heap 512M) in VM
- Running 3 of the crawl jobs above concurrently still produced the out-of-memory problem
- Ubuntu 10.04 (CPU 1, RAM 512M) (Hadoop heap 256M) in VM
- Running 3 of the crawl jobs above concurrently still produced the out-of-memory problem
- Error message (syslog):
{{{
Sep 9 09:59:28 ubuntu-186 kernel: [ 3708.133724] Out of memory: kill process 3843 (go.sh) score 1775788 or a child
Sep 9 09:59:28 ubuntu-186 kernel: [ 3708.133791] Killed process 4205 (counter.sh)
Sep 9 09:59:28 ubuntu-186 kernel: [ 3709.200789] counter.sh invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0
Sep 9 09:59:28 ubuntu-186 kernel: [ 3709.200978] counter.sh cpuset=/ mems_allowed=0
Sep 9 09:59:28 ubuntu-186 kernel: [ 3709.201021] Pid: 11384, comm: counter.sh Not tainted 2.6.32-24-generic #39-Ubuntu
Sep 9 09:59:28 ubuntu-186 kernel: [ 3709.201061] Call Trace:
Sep 9 09:59:28 ubuntu-186 kernel: [ 3709.201133] [<c01cd1f4>] oom_kill_process+0xa4/0x2b0
Sep 9 09:59:28 ubuntu-186 kernel: [ 3709.201201] [<c01cd869>] ? select_bad_process+0xa9/0xe0
Sep 9 09:59:28 ubuntu-186 kernel: [ 3709.201267] [<c01cd8f1>] __out_of_memory+0x51/0xa0
Sep 9 09:59:28 ubuntu-186 kernel: [ 3709.201333] [<c01cd998>] out_of_memory+0x58/0xb0
Sep 9 09:59:28 ubuntu-186 kernel: [ 3709.201399] [<c01d01a7>] __alloc_pages_slowpath+0x407/0x4a0
Sep 9 09:59:28 ubuntu-186 kernel: [ 3709.201466] [<c01d037a>] __alloc_pages_nodemask+0x13a/0x170
Sep 9 09:59:28 ubuntu-186 kernel: [ 3709.201533] [<c01e6049>] do_wp_page+0x1b9/0x820
Sep 9 09:59:28 ubuntu-186 kernel: [ 3709.201600] [<c013052c>] ? kmap_atomic_prot+0x4c/0xf0
Sep 9 09:59:28 ubuntu-186 kernel: [ 3709.201666] [<c01e6e0c>] handle_mm_fault+0x2fc/0x390
Sep 9 09:59:28 ubuntu-186 kernel: [ 3709.201733] [<c058f1bd>] do_page_fault+0x10d/0x3a0
Sep 9 09:59:28 ubuntu-186 kernel: [ 3709.201799] [<c058f0b0>] ? do_page_fault+0x0/0x3a0
Sep 9 09:59:28 ubuntu-186 kernel: [ 3709.201865] [<c058d0b3>] error_code+0x73/0x80
}}}
- Error message (hadoop-crawler-jobtracker-ubuntu-186.log):
{{{
2010-09-09 10:04:16,052 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201009090900_0026_m_000000_0: java.io.IOException: Cannot run program "bash": java.io.IOException: error=12, Cannot allocate memory
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:149)
	at org.apache.hadoop.util.Shell.run(Shell.java:134)
	at org.apache.hadoop.fs.DF.getAvailable(DF.java:73)
	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:321)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
	at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:930)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:842)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.Child.main(Child.java:158)
Caused by: java.io.IOException: java.io.IOException: error=12, Cannot allocate memory
	at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
	at java.lang.ProcessImpl.start(ProcessImpl.java:65)
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
	... 10 more
}}}
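Note the `error=12, Cannot allocate memory`: per the stack trace above, it is fork() failing when the map task's JVM shells out to `bash` (Hadoop runs `df` via Shell to check local disk space), because duplicating a large JVM address space exceeds what the kernel will commit on a 512M machine. Two common general-purpose workarounds, sketched here as assumptions rather than steps verified in this test:
{{{
# Let the kernel overcommit, so fork() from a large JVM can succeed:
sysctl -w vm.overcommit_memory=1

# Or add some swap so there is headroom for the short-lived child process:
dd if=/dev/zero of=/swapfile bs=1M count=512
mkswap /swapfile && swapon /swapfile
}}}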
Ubuntu 10.04 (CPU 1, RAM 1G, Disk 8G) in VBox
Tested crawl depths from 3 to 10 separately; results as follows.
Crawled site:
Depth | Elapsed Time | Result |
3 | 1h:31m:00s | Finished |
4 | 2h:48m:22s | Finished |
5 | more than 12 hours | Unfinished |
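Between runs the page cache was dropped so each depth test started from a comparable memory footprint; a minimal sketch of that cleanup step, referenced in the execution log below (note the space before `>`: without it the shell parses `1>` as a file-descriptor redirect and writes nothing):
{{{
sync                               # flush dirty pages first
echo 1 > /proc/sys/vm/drop_caches  # 1 = page cache only; 3 would also drop dentries and inodes
}}}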
Execution order and status:
- After the depth-3 crawl, ran "echo 1 > /proc/sys/vm/drop_caches" to free cached memory
- After the depth-4 crawl, ran "echo 1 > /proc/sys/vm/drop_caches" to free cached memory
- During the depth-5 crawl, the job progress and the related error message were as follows:
Jobid | Priority | User | Name | Map % Complete | Reduce % Complete |
job_201009231019_0059 | NORMAL | crawler | fetch NCHC_5/segments/20100923162900 | 100.00% | 0.00% |
- Error message:
{{{
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/crawler/NCHC_5/segments/20100923162900/crawl_fetch/part-00000/index could only be replicated to 0 nodes, instead of 1
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1280)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
	at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894)
	at org.apache.hadoop.ipc.Client.call(Client.java:697)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
	at $Proxy1.addBlock(Unknown Source)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
	at $Proxy1.addBlock(Unknown Source)
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2814)
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2696)
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1996)
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183)
}}}
- After dynamically adding one compute node, the system continued the unfinished job
- Summary:
- 1. Besides RAM, the disk must also be large enough to hold the intermediate files produced during computation.
- 2. When this kind of error appears (DataNode out of capacity), dynamically adding and starting another compute node is enough to let the system continue; there is no need to kill the stuck job (see the sketch after this summary).
- 3. This test still does not establish the crawl-depth limit for 1G of RAM.
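A hedged sketch of what "dynamically adding a compute node" amounts to on a Hadoop 0.20-era cluster (hostnames and paths are illustrative, and the new node is assumed to carry the same conf/ as the rest of the cluster):
{{{
# On the new node: join HDFS and MapReduce without restarting the cluster.
bin/hadoop-daemon.sh start datanode      # adds HDFS capacity
bin/hadoop-daemon.sh start tasktracker   # adds map/reduce slots

# On the master: confirm the node registered and check remaining capacity.
bin/hadoop dfsadmin -report
}}}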
Last modified on Sep 24, 2010, 10:57:49 AM
Attachments (4)
- syslog (178.1 KB) - added by rock 14 years ago.
- kern.log (425.4 KB) - added by rock 14 years ago.
- dmesg (97.8 KB) - added by rock 14 years ago.
- hadoop-crawler-jobtracker-ubuntu-186.log (202.5 KB) - added by rock 14 years ago.