wiki:crawlzilla/stress_testing

Version 7 (modified by shunfa, 14 years ago)

--

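jazz suggested that the cause may be Hadoop's default heap size of 1 GB, which would explain why everything runs normally with 2 GB of RAM.
The tests below therefore use less RAM and a modified Hadoop heap parameter.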
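
As a rough illustration of the kind of heap change described here: the page does not say which setting was actually edited, so the following is an assumption. Hadoop releases of that period read the daemon heap size, in MB, from HADOOP_HEAPSIZE in conf/hadoop-env.sh:

```shell
# conf/hadoop-env.sh (sketch; the value is an example, not necessarily the one used in these tests)
# Maximum heap for each Hadoop daemon, in MB. Hadoop falls back to 1000 when this is unset.
export HADOOP_HEAPSIZE=256
```

This value applies to the Hadoop daemons. The heap of the map and reduce task JVMs is configured separately, through mapred.child.java.opts.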


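  • Ubuntu 10.04 (CPU 1, RAM 512M) (Hadoop heap 512M) in VM
    • Running the three crawl jobs above at the same time still causes an out-of-memory problem
  • Ubuntu 10.04 (CPU 1, RAM 512M) (Hadoop heap 256M) in VM
    • Running the three crawl jobs above at the same time likewise causes an out-of-memory problem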
    • error message (syslog)
      Sep  9 09:59:28 ubuntu-186 kernel: [ 3708.133724] Out of memory: kill process 3843 (go.sh) score 1775788 or a child
      Sep  9 09:59:28 ubuntu-186 kernel: [ 3708.133791] Killed process 4205 (counter.sh)
      Sep  9 09:59:28 ubuntu-186 kernel: [ 3709.200789] counter.sh invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0
      Sep  9 09:59:28 ubuntu-186 kernel: [ 3709.200978] counter.sh cpuset=/ mems_allowed=0
      Sep  9 09:59:28 ubuntu-186 kernel: [ 3709.201021] Pid: 11384, comm: counter.sh Not tainted 2.6.32-24-generic #39-Ubuntu
      Sep  9 09:59:28 ubuntu-186 kernel: [ 3709.201061] Call Trace:
      Sep  9 09:59:28 ubuntu-186 kernel: [ 3709.201133]  [<c01cd1f4>] oom_kill_process+0xa4/0x2b0
      Sep  9 09:59:28 ubuntu-186 kernel: [ 3709.201201]  [<c01cd869>] ? select_bad_process+0xa9/0xe0
      Sep  9 09:59:28 ubuntu-186 kernel: [ 3709.201267]  [<c01cd8f1>] __out_of_memory+0x51/0xa0
      Sep  9 09:59:28 ubuntu-186 kernel: [ 3709.201333]  [<c01cd998>] out_of_memory+0x58/0xb0
      Sep  9 09:59:28 ubuntu-186 kernel: [ 3709.201399]  [<c01d01a7>] __alloc_pages_slowpath+0x407/0x4a0
      Sep  9 09:59:28 ubuntu-186 kernel: [ 3709.201466]  [<c01d037a>] __alloc_pages_nodemask+0x13a/0x170
      Sep  9 09:59:28 ubuntu-186 kernel: [ 3709.201533]  [<c01e6049>] do_wp_page+0x1b9/0x820
      Sep  9 09:59:28 ubuntu-186 kernel: [ 3709.201600]  [<c013052c>] ? kmap_atomic_prot+0x4c/0xf0
      Sep  9 09:59:28 ubuntu-186 kernel: [ 3709.201666]  [<c01e6e0c>] handle_mm_fault+0x2fc/0x390
      Sep  9 09:59:28 ubuntu-186 kernel: [ 3709.201733]  [<c058f1bd>] do_page_fault+0x10d/0x3a0
      Sep  9 09:59:28 ubuntu-186 kernel: [ 3709.201799]  [<c058f0b0>] ? do_page_fault+0x0/0x3a0
      Sep  9 09:59:28 ubuntu-186 kernel: [ 3709.201865]  [<c058d0b3>] error_code+0x73/0x80
      
    • error message (hadoop-crawler-jobtracker-ubuntu-186.log)
      2010-09-09 10:04:16,052 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201009090900_0026_m_000000_0: java.io.IOException: Cannot run program "bash": java.io.IOException: error=12, Cannot allocate memory
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:149)
        at org.apache.hadoop.util.Shell.run(Shell.java:134)
        at org.apache.hadoop.fs.DF.getAvailable(DF.java:73)
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:321)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
        at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:930)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:842)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
        at org.apache.hadoop.mapred.Child.main(Child.java:158)
      Caused by: java.io.IOException: java.io.IOException: error=12, Cannot allocate memory
        at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
        at java.lang.ProcessImpl.start(ProcessImpl.java:65)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
        ... 10 more
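
To check quickly whether a test run hit either of the two failures logged above, something like the following sketch can be used. The function names and the paths in the usage comments are illustrative assumptions; only the matched message texts come from the logs above.

```shell
# Count OOM-killer victims per command name in a kernel log or syslog file.
oom_victims() {
    # Matches lines such as "... Killed process 4205 (counter.sh)"
    # and keeps only the name in parentheses.
    sed -n -E 's/.*Killed process [0-9]+ \(([^)]*)\).*/\1/p' "$1" \
        | sort | uniq -c | sort -rn
}

# Count task attempts that failed because the JVM could not start a child process.
failed_forks() {
    grep -c 'Error from attempt_.*error=12, Cannot allocate memory' "$1"
}

# Usage (paths are examples):
#   oom_victims /var/log/syslog
#   failed_forks hadoop-crawler-jobtracker-ubuntu-186.log
```

The first function prints one line per killed command with its count; the second prints the number of failed task attempts.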
      

Ubuntu 10.04 (CPU 1, RAM 1G) in VBox

Crawl depths from 3 to 10 levels were tested separately; the results are as follows.

Crawled site:

Depth  Time taken          Result
3      1h:31m:00s          Finished
4      2h:48m:22s          Finished
5      more than 12 hours  Unfinished
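
To compare the runs, the two finished durations can be converted to seconds. The helper below is only a sketch (the name to_seconds is made up for this page):

```shell
# Convert a duration written as "1h:31m:00s" into seconds.
to_seconds() {
    # Strip the h/m/s suffixes, then split the remaining "H:M:S" on ":".
    echo "$1" | tr -d 'hms' | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

d3=$(to_seconds "1h:31m:00s")
d4=$(to_seconds "2h:48m:22s")
awk -v a="$d3" -v b="$d4" \
    'BEGIN { printf "depth 3: %ds, depth 4: %ds, ratio: %.2f\n", a, b, b / a }'
# prints: depth 3: 5460s, depth 4: 10102s, ratio: 1.85
```

Twelve hours is 43200 seconds, so the unfinished depth-5 run had already taken more than four times as long as the depth-4 run when it was observed.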

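Execution order:

  • After the depth-3 crawl, ran "echo 1 > /proc/sys/vm/drop_caches" to free cached memory
  • After the depth-4 crawl, ran "echo 1 > /proc/sys/vm/drop_caches" to free cached memory
  • During the depth-5 crawl, the system stayed at the progress shown below for more than 12 hours, with no failed jobs and no other error messages in the log files: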
    Jobid                  Priority  User     Name                                  Map % Complete  Reduce % Complete
    job_201009231019_0059  NORMAL    crawler  fetch NCHC_5/segments/20100923162900  100.00%         0.00%
  • go.sh is still running
    • crawler 30226 0.0 0.1 4332 1124 ? S Sep23 0:00 /bin/bash /home/crawler/crawlzilla/system/go.sh 5 NCHC_5
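
The sequence above can be sketched as a small script. It assumes that go.sh takes the depth and a crawl name as arguments (as in the ps output above), that it blocks until the crawl finishes, and that the earlier crawls were named NCHC_3 and NCHC_4; none of this is confirmed by the page. Writing to drop_caches requires root.

```shell
# Paths: go.sh as seen in the ps output; both can be overridden for a dry run.
GO_SH="${GO_SH:-/home/crawler/crawlzilla/system/go.sh}"
DROP_CACHES="${DROP_CACHES:-/proc/sys/vm/drop_caches}"

# Run one crawl, then free the page cache before the next one starts.
run_crawl_then_drop_caches() {
    local depth="$1" name="$2"
    "$GO_SH" "$depth" "$name" || return 1
    sync                      # flush dirty pages so that more cache can be freed
    echo 1 > "$DROP_CACHES"   # keep the space: "echo 1>file" writes only an empty line
}

# Example (as root; the crawl names are assumptions):
#   for d in 3 4 5; do run_crawl_then_drop_caches "$d" "NCHC_$d"; done
```

Setting GO_SH and DROP_CACHES to a stub script and a scratch file makes it possible to try the loop without root and without starting a real crawl.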
