= 2011-09-27 =

 * Why did rock add bridge.ko to /usr/bin/mkpxeinitrd-net? The answer is clearly brctl: creating a bridge loads the bridge and stp modules.
{{{
root@debian:~# lsmod > before
root@debian:~# brctl addbr br0
root@debian:~# lsmod > after
root@debian:~# diff before after
1a2,3
> bridge                 39662  0
> stp                     1440  1 bridge
}}}
 * rock added every command that /bin/start_xen_bridge.sh needs to the initrd: brctl ip egrep awk gawk seq fgrep dirname expr bash
 * However, he left out mount umount mount.nfs umount.nfs
 * As for netloop.ko, it is used by xen_bridge.sh; it is not needed in the KVM case.
{{{
#!diff
--- /usr/bin/mkpxeinitrd-net.drbl-virt_bak	2011-09-27 02:48:49.000000000 +0800
+++ /usr/bin/mkpxeinitrd-net	2011-09-27 02:35:36.000000000 +0800
@@ -37,7 +37,8 @@
 # uncompress the compress kernel module. This is special for Mandriva case. It uses compressed kernel module, e.g. pcnet32.ko.gz. Before we used modprobe from busybox, we have to uncompress that. Now the modprobe program is from the OS (See variable include_bin_prog_from_server), so we should allow this type of kernel modules.
 use_compressed_kernel_module="yes"
 # Some required bin programs to be included in the PXE initrd, which are not provided by busybox or the one provided by busybox does not support the function we want. E.g. sleep (we need "sleep 0.1", while sleep from busybox does not support 0.1 secs).
-include_bin_prog_from_server="sleep lspci insmod modprobe rmmod lsmod pkill strings mount umount mount.nfs umount.nfs"
+# drbl-virt add
+include_bin_prog_from_server='sleep lspci insmod modprobe rmmod lsmod pkill strings brctl ip egrep awk gawk seq fgrep dirname expr bash'
 
 # No need to use sudo if we are root
 if [ $UID -eq 0 ]; then
@@ -303,6 +304,9 @@
   cp -a --parents $i $initrd/lib/modules/$kernel_ver/
 done
 
+# drbl-virt add
+cp -a --parents kernel/net/bridge/bridge.ko $initrd/lib/modules/$kernel_ver/
+cp -a --parents kernel/drivers/xen/netback/netloop.ko $initrd/lib/modules/$kernel_ver/
 # Deal with firmwares!
 # The following is borrowed from Debian's /usr/share/initramfs-tools/hook-functions
 if [ "$copy_all_firmwares" = "yes" ]; then
}}}
 * rock's hook point in init is rather unusual. I originally put the switch to br0 before udhcpc, so that br0 itself would obtain the IP address, but that triggers an error at line 175 of /sbin/init.
{{{
150 $echo "Bringing up loopback interface"
151 ifconfig lo 127.0.0.1 up
152 route add -net 127.0.0.0 netmask 255.0.0.0 lo
153 ### add /bin/sh here
}}}
 * PXE-boot the client. Once in the shell, run the following commands, then type exit so the boot continues:
{{{
brctl addbr br0
brctl addif br0 eth0
ifconfig br0 0.0.0.0   ### inside initrd.img, br0 has to be brought up this way
ifconfig eth0 0.0.0.0
}}}
 * [[Image(jazz/11-09-27:brctl_br0.png)]]
 * The error message from line 175 of /sbin/init then appears:
 * [[Image(jazz/11-09-27:init_175.png)]]
 * /sbin/init is actually the DRBL server's /tftpboot/node_root/sbin/init. The error at line 175 occurs because line 161 only looks for interfaces whose names start with eth (or tr), while the interface that got the IP is br0. IP and IP_prefix therefore stay empty, and the unquoted $IP_prefix leaves the test on line 175 without a right-hand operand.
{{{
#!sh
160 # find my IP address
161 NETDEVICES="$(cat /proc/net/dev | awk -F: '/eth.:|tr.:/{print $1}')"
162 for DEVICE in $NETDEVICES; do
163   IP_tmp="$(ifconfig $DEVICE | grep -A1 $DEVICE | grep -v $DEVICE | grep "inet addr" | sed -e 's/^.*inet addr:\([0-9\.]\+\).*$/\1/')"
164   if [ -n "$IP_tmp" ]; then
165     # Got the IP address, stop to get from other port, so break
166     IP=$IP_tmp
167     echo "My IP address is $IP ([$DEVICE])."
168     break
169   fi
170 done
171
172 IP_prefix="$(echo $IP | cut -d"." -f1-3)"
173 if [ -n "$(echo "$NFSSERVER_LIST" | grep -E "$IP_prefix.[0-9]+")" ]; then
174   for i in $NFSSERVER_LIST; do
175     if [ "$(echo $i | cut -d"." -f1-3)" = $IP_prefix ]; then
176       nfsserver=$i
177       break
178     fi
179   done
180 else
}}}
 * One workaround is therefore to name the bridge eth* instead of br0 (for example eth9). A quick test confirmed that the machine then PXE-boots into the DRBL client normally.
 * [[Image(jazz/11-09-27:eth9_bridge.png)]]
 * To keep the br* naming, the code in /sbin/init has to be fixed.
   * /tftpboot/node_root/sbin/init is copied from /opt/drbl/setup/files/misc/init.drbl, so the patch should be applied to /opt/drbl/setup/files/misc/init.drbl first. After that, every run of drblsrv or drblsrv-offline will take br* into account.
{{{
#!diff
--- /opt/drbl/setup/files/misc/init.drbl.org	2011-09-29 12:04:10.000000000 +0800
+++ /opt/drbl/setup/files/misc/init.drbl	2011-09-29 12:04:33.000000000 +0800
@@ -158,7 +158,7 @@
 create_dev
 
 # find my IP address
-NETDEVICES="$(cat /proc/net/dev | awk -F: '/eth.:|tr.:/{print $1}')"
+NETDEVICES="$(cat /proc/net/dev | awk -F: '/eth.:|tr.:|br.:/{print $1}')"
 for DEVICE in $NETDEVICES; do
   IP_tmp="$(ifconfig $DEVICE | grep -A1 $DEVICE | grep -v $DEVICE | grep "inet addr" | sed -e 's/^.*inet addr:\([0-9\.]\+\).*$/\1/')"
   if [ -n "$IP_tmp" ]; then
}}}
----
 * rock's approach is to wait until udhcpc has already obtained an IP on eth0, then take eth0 down, create br0, and add eth0 to br0. NFS can then be mounted as the ROOTFS normally. This really does require understanding a few concepts.
{{{
#!diff
--- /usr/lib/mkpxeinitrd-net/initrd-skel/linuxrc-or-init.drbl-virt_bak	2011-09-27 02:35:36.000000000 +0800
+++ /usr/lib/mkpxeinitrd-net/initrd-skel/linuxrc-or-init	2011-09-27 02:35:36.000000000 +0800
@@ -244,6 +244,8 @@
   done
 done
 
+# drbl-virt add
+bash /bin/start_xen_bridge.sh
 
 # clean the tag file
 [ -f "/dev/sname" ] && rm -f /dev/sname
}}}
 * start_xen_bridge.sh is actually a little complicated: it first counts the network cards, then selects one through the netdev parameter of /bin/network-bridge. The advantage of this approach is that, with multiple network cards, each card can be mapped to a bridge device.
{{{
# cat /usr/lib/mkpxeinitrd-net/initrd-skel/bin/start_xen_bridge.sh
#!/bin/bash
# drbl-virt add
NICs=$(/sbin/ifconfig | grep eth | awk '{print $1}')
declare -i NICs_nu=$(echo $NICs |wc -w)
NIC=""
if [ $NICs_nu -lt 1 ]; then
  for (( i=0 ; i<$NICs_nu ; i++ ))
  do
    NIC_IP=$(ifconfig eth${i} | grep "inet addr" | sed 's/inet addr://g' | sed 's/Bcast.*$//g' | sed 's/^[ ]*//')
    if [ -n $NIC_IP ]; then
      NIC="eth${i}"
      break
    fi
  done
else
  NIC=$NICS
fi
bash /bin/network-bridge start netdev=$NIC
}}}
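
As listed, the script seems to leave NIC empty in both branches: the `-lt 1` test is only true when there are no eth interfaces at all (so the loop body never runs), and the else branch reads `$NICS`, which differs in case from `$NICs`. The sketch below is one possible rework of the selection step, not rock's code. The helper name `pick_bridge_nic` is hypothetical, and it assumes the old net-tools `ifconfig` output format with `inet addr:` lines, the same format the listing greps for. It reads the `ifconfig` output on stdin so the logic can be checked without touching real interfaces.

```shell
#!/bin/bash
# Hypothetical helper, not part of drbl-virt.
# Reads old-style (net-tools) ifconfig output on stdin and prints the first
# eth* interface that has an IPv4 address. If no eth* interface has one,
# it falls back to the first eth* interface seen. Prints nothing if there
# is no eth* interface at all.
pick_bridge_nic() {
  awk '
    # Header line of an eth interface: remember it as the current one.
    /^eth[0-9]+/ { cur = $1; if (first == "") first = cur; next }
    # Header line of any other interface (lo, br0, ...): stop tracking.
    /^[^ ]/      { cur = ""; next }
    # Address line belonging to the current eth interface.
    /inet addr:/ { if (cur != "" && chosen == "") chosen = cur }
    END {
      if (chosen != "")     print chosen
      else if (first != "") print first
    }
  '
}

# Possible use inside start_xen_bridge.sh (assumption, left commented out):
#   NIC=$(/sbin/ifconfig | pick_bridge_nic)
#   [ -n "$NIC" ] && bash /bin/network-bridge start netdev="$NIC"
```

Matching on the interface header lines rather than on `eth${i}` indices also avoids assuming that the interfaces are numbered consecutively from eth0.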