[[PageOutline]]

= demo.crawlzilla.info =

== 2013-04-03 ==

 * Recently, connections to demo.crawlzilla.info have often taken a long time to complete, yet Munin shows no obvious system-level cause.
 * According to top:
{{{
top - 23:16:21 up 15 days,  7:45,  2 users,  load average: 1.25, 1.37, 1.35
Tasks:   5 total,   0 running,   5 sleeping,   0 stopped,   0 zombie
Cpu0  :  2.3%us,  0.7%sy,  0.0%ni, 97.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  1.6%us,  0.0%sy,  0.0%ni, 96.4%id,  0.0%wa,  0.0%hi,  2.0%si,  0.0%st
Cpu2  : 43.3%us,  0.3%sy,  0.0%ni, 56.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  : 57.2%us,  0.7%sy,  0.0%ni, 42.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   8028708k total,  4986732k used,  3041976k free,   189008k buffers
Swap: 19803128k total,        0k used, 19803128k free,  3185932k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
15371 crawler   20   0 2472m 778m  14m S  102  9.9 516:36.60 java
15474 crawler   20   0 1375m 130m  11m S    7  1.7 135:58.46 java
15628 crawler   20   0 1366m  78m  11m S    0  1.0   1:16.75 java
15554 crawler   20   0 1345m  98m  11m S    0  1.3   1:23.92 java
15408 crawler   20   0 1308m  99m  11m S    0  1.3   0:40.57 java
}}}
 * The two processes using the most memory are Tomcat (PID 15371, listed by jps as Bootstrap) and !JobTracker (PID 15474):
{{{
jazz@CrawlzillaServ:~$ sudo jps
15371 Bootstrap
15554 TaskTracker
15628 DataNode
15408 NameNode
1529 Jps
15474 JobTracker
}}}
 * Tomcat also seems able to use only one CPU core: it sits at 102% CPU in the sample above, roughly one core's worth, while Cpu0 and Cpu1 are almost idle.
 * [Question] How can Tomcat be made to use multiple cores?
   * [Reference] http://www.mulesoft.com/tomcat-performance
   * [Reference] http://www.tomcatexpert.com/blog/2011/11/22/performance-tuning-jvm-running-tomcat
   * [Reference] [http://stackoverflow.com/questions/13631994/what-is-the-best-practice-for-tomcat-performance-tuning-in-amazon-ubuntu-instanc What is the best practice for tomcat performance tuning in Amazon ubuntu instance?]
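 * [Sketch] One possible way to check whether a single thread inside Tomcat accounts for the 102% CPU seen above is to list the threads with `top -H` and look the busiest one up in a jstack thread dump. The helper names `tid_to_nid` and `show_hot_thread` are hypothetical and not part of this setup; the sketch assumes `jstack` from the JDK is on the PATH and that the JVM runs as the `crawler` user shown in the top output.

```shell
#!/bin/sh
# Convert a decimal Linux thread ID (as listed by `top -H`) into the
# hexadecimal "nid" form that jstack prints for each thread.
tid_to_nid() {
    printf '0x%x\n' "$1"
}

# Hypothetical helper: print the jstack entry of one thread.
#   $1 = PID of the JVM, $2 = decimal thread ID taken from `top -H -p PID`
# Assumption: the JVM is owned by the "crawler" user, so jstack is run
# as that user in order to attach to it.
show_hot_thread() {
    nid=$(tid_to_nid "$2")
    sudo -u crawler jstack "$1" | grep -A 10 "nid=$nid "
}

# Usage (not executed here):
#   top -H -b -n 1 -p 15371 | head -n 20    # find the busiest thread ID
#   show_hot_thread 15371 <thread-id>       # inspect its stack trace
```

If one thread turns out to hold nearly all of the CPU time, the load is probably a single busy request or loop rather than a limit on how many cores Tomcat can use.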
 * In addition, the Apache2 '''Keep-Alive time turned out to be very long''', so the Apache2 mod_proxy settings need some adjustment.
   * [Reference] http://httpd.apache.org/docs/2.2/mod/mod_proxy.html#proxypass
{{{
ProxyPass / http://140.110.X.X:8080/ connectiontimeout=2 timeout=5 ttl=5
ProxyPassReverse / http://140.110.X.X:8080/
SetEnv proxy-nokeepalive 1
}}}
 * Likewise, we lowered Tomcat's own Keep-Alive time and shortened its timeout so that resources are released quickly.
   * [Reference] [http://stackoverflow.com/questions/1542502/java-server-cpu-usage-at-100-after-two-days-continous-running-with-about-110-us Java server cpu usage at 100% after two days continous running with about 110 users]
   * [Reference] [http://www.virtualzone.de/2010/11/tomcat-apache-high-cpu-usage.html Tomcat & Apache: High CPU Usage]
{{{
## Edit /opt/crawlzilla/tomcat/conf/server.xml
}}}

== 2013-05-11 ==

 * Perhaps because more users have registered, Tomcat has started running out of memory, so a few small adjustments were made:
 * Hadoop does not use many resources, so HADOOP_HEAPSIZE in hadoop-env.sh was lowered to 256 MB - /opt/crawlzilla/nutch/conf/hadoop-env.sh
{{{
# The maximum amount of heap to use, in MB. Default is 1000.
export HADOOP_HEAPSIZE=256
}}}
 * Next, change the maximum number of map and reduce tasks that each !TaskTracker runs simultaneously in mapred-site.xml - /opt/crawlzilla/nutch/conf/mapred-site.xml
{{{
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>6</value>
  <description>The maximum number of map tasks that will be run simultaneously by a task tracker.</description>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>2</value>
</property>
}}}
 * Increase Tomcat's heap size to 4 GB - /opt/crawlzilla/tomcat/bin/catalina.sh
 * Increase Tomcat's !MaxPermSize to 256 MB, because of the error message java.lang.OutOfMemoryError: PermGen space
{{{
case "`uname`" in
CYGWIN*) cygwin=true;;
OS400*) os400=true;;
Darwin*) darwin=true;;
esac

## Added by Jazz - 2013-05-11
export CATALINA_OPTS="-Xms4096m -Xmx4096m -XX:MaxPermSize=256m"

# resolve links - $0 may be a softlink
PRG="$0"
}}}
 * Lower Tomcat's Keep-Alive - /opt/crawlzilla/tomcat/conf/server.xml. Note that the directives recorded below (StartServers, MaxClients, and so on) are Apache httpd MPM directives rather than server.xml attributes.
{{{
## Edit /opt/crawlzilla/tomcat/conf/server.xml
StartServers 1
MinSpareThreads 1
MaxSpareThreads 1
ThreadLimit 1
ThreadsPerChild 1
MaxClients 2
MaxRequestsPerChild 20
}}}
 * Shorten the Timeout
{{{
#
# Timeout: The number of seconds before receives and sends time out.
#
Timeout 10

#
# KeepAlive: Whether or not to allow persistent connections (more than
# one request per connection). Set to "Off" to deactivate.
#
KeepAlive On

#
# MaxKeepAliveRequests: The maximum number of requests to allow
# during a persistent connection. Set to 0 to allow an unlimited amount.
# We recommend you leave this number high, for maximum performance.
#
MaxKeepAliveRequests 50

#
# KeepAliveTimeout: Number of seconds to wait for the next request from the
# same client on the same connection.
#
KeepAliveTimeout 2
}}}
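 * [Sketch] After editing the Apache settings above, a small helper like the following could confirm which values a config file actually contains. `apache_directive` is a hypothetical name, and the path in the usage comment is an assumption (the usual Debian/Ubuntu location), since these notes do not say which file was edited. The helper reads a single file only and does not follow Include directives.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the last uncommented occurrence
# of a directive in one Apache config file.
#   $1 = config file, $2 = directive name (matched case-insensitively)
apache_directive() {
    awk -v key="$2" '
        # Commented lines have "#" (or "#Name") as the first field,
        # so they never equal the directive name and are skipped.
        tolower($1) == tolower(key) { val = $2 }
        END { if (val != "") print val }
    ' "$1"
}

# Usage (the path is an assumption, adjust to the file that was edited):
#   for d in Timeout KeepAlive MaxKeepAliveRequests KeepAliveTimeout; do
#       printf '%s = %s\n' "$d" "$(apache_directive /etc/apache2/apache2.conf "$d")"
#   done
```

Taking the last occurrence mirrors how a later directive in the same file overrides an earlier one. Apache has to be reloaded before any edited value takes effect.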