
Changes between Initial Version and Version 1 of jazz/09-08-27


Timestamp:
Aug 27, 2009, 6:33:14 PM (15 years ago)
Author:
jazz
Comment:

--

  • jazz/09-08-27

    v1 v1  
     1= 2009-08-27 =
     2
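     3 * [Plan] Hadoop cluster maintenance
     4  * [Status] Found kernel panics on hadoop104 and hadoop106
     5  * [Status] Found that dfs.replication in /etc/hadoop/conf/hadoop-site.xml is set to 1, meaning HDFS keeps no backup copies of any block
     6  * [Fix]
     7   1. Change dfs.replication to 3
     8   2. Restart hadoop-namenode and hadoop-datanode
     9   3. Use hadoop fs -setrep to raise the replication factor of every file under /user, which is currently 1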
     10{{{
     11root@hadoop:~# su -s /bin/sh hadoop -c "hadoop fs -setrep -R 3 /user"
     12}}}
     13   4. Run hadoop balancer to test whether the replication mechanism gets triggered
     14{{{
     15root@hadoop:~# su -s /bin/sh hadoop -c "hadoop balancer"
     16}}}
     17   5. Run hadoop fsck to check whether the replication mechanism gets triggered
     18{{{
     19root@hadoop:~# su -s /bin/sh hadoop -c "hadoop fsck / -racks"
     20}}}
     21{{{
     22#!sh
     23### fsck reports that the current number of replicas is insufficient
     24/user/waue/input/1.txt:  Under replicated blk_-682447276956362627_16045. Target Replicas is 3 but found 1 replica(s).
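     25### Ever since the disk data on hadoop113 was deleted by mistake, the HDFS status has been CORRUPT; it looks like everyone will have to be asked to re-upload their files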
     26Status: CORRUPT
     27 Total size:    2514937876121 B
     28 Total dirs:    2800
     29 Total files:   14972
     30 Total blocks (validated):      51686 (avg. block size 48658009 B)
     31  ********************************
     32  CORRUPT FILES:        1921
     33  MISSING BLOCKS:       4972
     34  MISSING SIZE:         232737717270 B
     35  CORRUPT BLOCKS:       4972
     36  ********************************
     37 Minimally replicated blocks:   46714 (90.38037 %)
     38 Over-replicated blocks:        3 (0.00580428 %)
     39 Under-replicated blocks:       45388 (87.81488 %)
     40 Mis-replicated blocks:         0 (0.0 %)
     41 Default replication factor:    3
     42 Average block replication:     1.0597067
     43 Corrupt blocks:                4972
     44 Missing replicas:              89401 (163.2239 %)
     45 Number of data-nodes:          17
     46 Number of racks:               1
     47
     48The filesystem under path '/' is CORRUPT
     49}}}
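The fsck report above is long, so when re-running it to watch replication catch up it may help to reduce it to the few numbers that matter. The following is a minimal sketch, not part of the original notes: `fsck_summary` is a hypothetical helper name, and it assumes the summary lines keep the exact labels shown in the output above (`Status:`, `MISSING BLOCKS:`, `Under-replicated blocks:`).

```shell
#!/bin/sh
# Hypothetical helper (not from the original notes): condense `hadoop fsck`
# output read from stdin into one line with the overall status, the number of
# under-replicated blocks, and the number of missing blocks.
fsck_summary() {
    awk '
        /^ *Status:/               { status  = $2 }   # e.g. "Status: CORRUPT"
        /Under-replicated blocks:/ { under   = $3 }   # count is the 3rd field
        /MISSING BLOCKS:/          { missing = $3 }   # count is the 3rd field
        END {
            printf "status=%s under_replicated=%d missing=%d\n", status, under + 0, missing + 0
        }
    '
}
```

One possible way to use it, assuming the same `su` invocation as in the steps above: `su -s /bin/sh hadoop -c "hadoop fsck /" | fsck_summary`. If re-replication is working, the `under_replicated` count should shrink between runs; the `missing` count will not, because blocks whose only replica was lost cannot be re-created from other nodes.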
     50  * [Finding] Hadoop really goes to extremes to protect files under the HDFS directory /var/lib/hadoop/cache: they are stored with a replication factor of 10
     51{{{
     52-rw-r--r--  10 hadoop002 supergroup     108739 2009-08-24 22:55 /var/lib/hadoop/cache/hadoop/mapred/system/job_200908242228_0009/job.jar
     53}}}
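The second column of an HDFS listing line, as in the `job.jar` entry above, is the file's replication factor. A plausible explanation for the value 10, which the notes do not confirm, is the `mapred.submit.replication` setting that Hadoop applies to submitted job files and whose default is 10. To see which other files carry an unusually high factor, a listing can be filtered as in the sketch below; `high_replication` is a hypothetical helper name, and it assumes the eight-column layout shown above and paths without spaces.

```shell
#!/bin/sh
# Hypothetical helper (not from the original notes): read HDFS listing lines
# from stdin and print "replication path" for every file whose replication
# factor is greater than the threshold given as $1 (default 3).
# Directories show "-" in the replication column and are skipped.
high_replication() {
    awk -v max="${1:-3}" '$2 ~ /^[0-9]+$/ && $2 + 0 > max + 0 { print $2, $8 }'
}
```

For example, piping a recursive listing of /var/lib/hadoop/cache through `high_replication 3` would print only the entries stored with more than three replicas.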