Problem

Kernel version: 2.6.23 (enable -DNFS4_CLUSTER)

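  1. On the remote machines, mmstartup -a leaves the node stuck in the arbitrating state. We suspect a problem with the GPFS connection module (tracedev), or the kernel may be too new for the GPFS modules to work with. On the local node, GPFS works normally.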
gpfs02:~# tsstatus
The mmfsd daemon is not ready to handle commands yet. Waiting for quorum.
gpfs02:~# mmgetstate
 Node number  Node name        GPFS state
------------------------------------------
       3      gpfs02           arbitrating
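
To watch for a node that stays in arbitrating, the state column can be pulled out of the mmgetstate output. This is only a minimal sketch: gpfs_state_of is a hypothetical helper name, and it assumes the three-column layout shown in the output above.

```shell
# Hypothetical helper: read mmgetstate-style output on stdin and print
# the "GPFS state" column for the node name given as $1.
# Assumes the layout shown above: node number, node name, state.
gpfs_state_of() {
    awk -v node="$1" '$2 == node { print $3 }'
}

# Usage on a cluster node (needs GPFS installed):
#   mmgetstate -a | gpfs_state_of gpfs02
```

If the helper keeps printing arbitrating for the remote node, the node has not reached quorum yet.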


e1000 NIC module problem

  • Our NIC is an Intel e1000 Gigabit Ethernet adapter running under our 2.6.20 kernel. In our case the link speed with this NIC drops from 1000 Mb/s to 100 Mb/s. You can download a new driver from the Intel web site or use our compiled module:
    sudo wget http://trac.nchc.org.tw/grid/raw-attachment/wiki/Problem/e1000.ko
    sudo cp e1000.ko /lib/modules/`uname -r`/kernel/drivers/net/e1000/
    sudo depmod -a
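
To confirm whether the link really negotiated at 1000 Mb/s, the Speed line reported by ethtool can be checked. This is a rough sketch: link_speed is a hypothetical helper name, and the interface name eth0 is an assumption that may differ on your machine.

```shell
# Hypothetical helper: read `ethtool <iface>` output on stdin and print
# only the value of its "Speed:" line (for example 1000Mb/s or 100Mb/s).
link_speed() {
    sed -n 's/^[[:space:]]*Speed:[[:space:]]*//p'
}

# Usage (eth0 is an assumed interface name):
#   ethtool eth0 | link_speed
```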
    


mmadddisk

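  • Before running mmadddisk, make sure the node was already listed in the gpfs.nodes file passed to mmcrcluster -N gpfs.nodes when the cluster was created. If the node was added later with a command, the disks of that node cannot be added to the existing GPFS NSDs.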
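
A quick way to check the condition above before calling mmadddisk is to look for the node name in the gpfs.nodes file. This is a minimal sketch: node_in_file is a hypothetical helper, and it assumes the file has one colon-separated node descriptor per line with the node name as the first field.

```shell
# Hypothetical helper: succeed (exit 0) if node name $1 is the first
# colon-separated field of some line in node file $2, else exit 1.
# Assumes one node descriptor per line with the name in field 1.
node_in_file() {
    awk -F: -v node="$1" '$1 == node { found = 1 } END { exit !found }' "$2"
}

# Usage:
#   node_in_file gpfs02 gpfs.nodes && echo "gpfs02 was in the original node file"
```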


Use ramdisk

gpfs-server:/home/ram_config# mmcrnsd -F gpfs.disks  -v no
mmcrnsd: Processing disk ram0
mmcrnsd: ram0 was not found in /proc/partitions.
mmcrnsd:  Failed while processing disk descriptor /dev/ram0:gpfs00::dataAndMetadata:: on node gpfs00.
mmcrnsd: Processing disk ram1
mmcrnsd: ram1 was not found in /proc/partitions.
mmcrnsd:  Failed while processing disk descriptor /dev/ram1:gpfs00::dataAndMetadata:: on node gpfs00.
gpfs-server:/home/ram_config# mmcrfs /home/ram_mount/ gpfs9 -F gpfs.disks -B 1M -m 2 -r 2 
Unable to open disk 'gpfs2nsd'.
Unable to open disk 'gpfs1nsd'.
No such device
No such device
Error accessing disks.
mmcrfs: tscrfs failed.  Cannot create gpfs9
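
The mmcrnsd messages above report that ram0 and ram1 were not found in /proc/partitions. A small check like the following shows in advance whether a device name is listed there. It is only a sketch: in_partitions is a hypothetical helper, and the optional second argument exists only so the check can be tried against a copy of the file.

```shell
# Hypothetical helper: succeed (exit 0) if device name $1 appears in the
# "name" column (4th field) of /proc/partitions, else exit 1.
# $2 optionally names another file in the same format.
in_partitions() {
    awk -v dev="$1" '$4 == dev { found = 1 } END { exit !found }' "${2:-/proc/partitions}"
}

# Usage:
#   in_partitions ram0 || echo "ram0 is not listed in /proc/partitions"
```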
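  • GPFS on kernel 2.6.23: building the module fails at the function posix_lock_file. The known cause is that posix_lock_file takes 2 parameters in kernel 2.6.18, but 3 parameters in 2.6.22 and 2.6.23. The current plan is to understand and modify the GPFS use of this function so that it works with kernel 2.6.23.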
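
To see which form of posix_lock_file the installed kernel headers declare, the prototype can be searched for in include/linux/fs.h. This is a sketch under assumptions: show_posix_lock_file is a hypothetical helper, and the header path in the usage line assumes the kernel build tree is installed under /lib/modules/`uname -r`/build.

```shell
# Hypothetical helper: print the line(s) of header file $1 that mention
# posix_lock_file( itself, with line numbers. Related names such as
# posix_lock_file_wait( do not match the pattern.
show_posix_lock_file() {
    grep -n 'posix_lock_file *(' "$1"
}

# Usage (assumed header location):
#   show_posix_lock_file /lib/modules/`uname -r`/build/include/linux/fs.h
```

Counting the parameters in the printed prototype tells you whether the GPFS source has to pass two or three arguments.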



Last modified on Nov 2, 2009, 5:44:18 PM
