Xen GPU Cluster
Part 1 Build essential environment
Part 2 Xen PCI Express configuration HowTo
Part 3 Running CUDA on Xen HowTo
Part 4 Running 3D-Applications on Xen by using Virtualizing OpenGL
Part 5 Comparing with Xen passthrough and VMGL

Xen GPU Cluster

Hardware

Machine	Dell OptiPlex 755
Node	9 nodes
CPU	Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
Memory	6GB/node
Storage	160GB/node
Video Card	NVIDIA GeForce 9800GT 1GB/node

Software

OS#1	Ubuntu 8.10 with Kernel: 2.6.28 x86_64 (non-xen-patched kernel)
OS#2	Ubuntu 8.10 with Kernel: 2.6.22-9 x86_64 (Xen-3.3.1+Lustre patched kernel)

System Architecture

Part 1 Build essential environment

1.1 - Basic Environment

# NVIDIA CUDA driver #
rock@cloud:~/nvidia/cuda$ wget http://developer.download.nvidia.com/compute/cuda/2_1/drivers/NVIDIA-Linux-x86_64-180.22-pkg2.run

# NVIDIA CUDA toolkit #
rock@cloud:~/nvidia/cuda$ wget http://developer.download.nvidia.com/compute/cuda/2_1/toolkit/cudatoolkit_2.1_linux64_ubuntu8.04.run

# NVIDIA CUDA SDK #
rock@cloud:~/nvidia/cuda$ wget http://developer.download.nvidia.com/compute/cuda/2_1/SDK/cuda-sdk-linux-2.10.1215.2015-3233425.run

~$ sudo apt-get install autoconf automake build-essential gcc make libtool initramfs-tools libxi6 libxi-dev libxmu6 libxmu-dev linux-kernel-devel linux-headers-2.6.27-11-server xserver-xorg-core xserver-xorg-dev

rock@cloud:~$ sudo ln -sf /usr/src/linux-2.6.22 /usr/src/linux
rock@cloud:~/nvidia/cuda$ sudo sh NVIDIA-Linux-x86_64-180.22-pkg2.run
rock@cloud:~/nvidia/cuda$ sudo sh cudatoolkit_2.1_linux64_ubuntu8.04.run

Enter install path (default /usr/local/cuda, '/cuda' will be appended): /usr/local/cuda

# Note:

* Please make sure your PATH includes /usr/local/cuda/bin
* Please make sure your LD_LIBRARY_PATH includes /usr/local/cuda/lib
*   or add /usr/local/cuda/lib to /etc/ld.so.conf and run ldconfig as root

* Please read the release notes in /usr/local/cuda/doc/

* To uninstall CUDA, delete /usr/local/cuda
* Installation Complete

rock@cloud:~/nvidia/cuda$ sudo sh cuda-sdk-linux-2.10.1215.2015-3233425.run

# Note:

Enter install path (default /usr/local/cuda, '/cuda' will be appended): /usr/local/NVIDIA_CUDA_SDK

Configuring SDK Makefile (/usr/local/NVIDIA_CUDA_SDK/common/common.mk)...

* Please make sure your PATH includes /usr/local/cuda/bin
* Please make sure your LD_LIBRARY_PATH includes /usr/local/cuda/lib

* To uninstall the NVIDIA CUDA SDK, please delete /usr/local/NVIDIA_CUDA_SDK

rock@cloud:~$ sudo vim /etc/profile

Add:
export PATH=$PATH:/usr/local/cuda/bin

rock@cloud:~$ source /etc/profile
rock@cloud:~$ sudo vim /etc/ld.so.conf

Add:
/usr/local/cuda/lib

rock@cloud:~$ sudo ldconfig
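
Editing /etc/profile and /etc/ld.so.conf by hand works for one machine; repeating it on every node is easier with a small helper that can be run more than once. The following is only a sketch under that assumption. The function name append_once is hypothetical and is not part of the CUDA installer.

```shell
#!/bin/sh
# append_once FILE LINE
# Appends LINE to FILE unless an identical line is already present.
# (Hypothetical helper name, introduced here for illustration only.)
append_once() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}
```

Run as root, it could be used as: append_once /etc/profile 'export PATH=$PATH:/usr/local/cuda/bin' and append_once /etc/ld.so.conf /usr/local/cuda/lib, followed by ldconfig.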

1.2 NVIDIA Driver HowTo on the Non-Xen Kernel

# According to Rock, a VGA device that lspci cannot identify is probably caused by an outdated pci.ids database.
# Solution 1: refresh the database with update-pciids (may fetch an older version).
rock@cloud:~$ sudo update-pciids
# Solution 2: download the latest pci.ids by hand and install it.
rock@cloud:~$ wget http://pciids.sourceforge.net/v2.2/pci.ids
rock@cloud:~$ sudo cp pci.ids /usr/share/misc/
rock@cloud:~$ sudo lspci -v -v

01:00.0 VGA compatible controller: nVidia Corporation GeForce 9800 GT (rev a2)
	Subsystem: ASUSTeK Computer Inc. Device 82a0
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 16
	Region 0: Memory at fd000000 (32-bit, non-prefetchable) [size=16M]
	Region 1: Memory at d0000000 (64-bit, prefetchable) [size=256M]
	Region 3: Memory at fa000000 (64-bit, non-prefetchable) [size=32M]
	Region 5: I/O ports at dc80 [size=128]
	[virtual] Expansion ROM at fea00000 [disabled] [size=128K]
	Capabilities: [60] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [68] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
		Address: 0000000000000000  Data: 0000
	Capabilities: [78] Express (v1) Endpoint, MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <4us
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported-
			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x16, ASPM L0s L1, Latency L0 <512ns, L1 <1us
			ClockPM- Suprise- LLActRep- BwNot-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
	Capabilities: [100] Virtual Channel <?>
	Capabilities: [128] Power Budgeting <?>
	Capabilities: [600] Vendor Specific Information <?>
	Kernel driver in use: nvidia
	Kernel modules: nvidia, nvidiafb

rock@cloud:~$ grep nVidia /var/log/Xorg.0.log

(--) PCI:*(0@1:0:0) nVidia Corporation GeForce 9800 GT rev 162, Mem @ 0xfd000000/16777216, 0xd0000000/268435456, 0xfa000000/33554432, I/O @ 0x0000dc80/128, BIOS @ 0x????????/131072

rock@cloud:~$ sudo vim /etc/X11/xorg.conf

# Set the BusID of the VGA device
Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    BusID          "PCI:1:0:0"
    VendorName     "NVIDIA Corporation"
    BoardName      "GeForce 9800 GT"
    Option         "RenderAccel" "True"
    Option         "UseEdidDpi" "False"
EndSection
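
lspci prints the bus, device and function numbers in hexadecimal, while the BusID option in xorg.conf expects them in decimal. For 01:00.0 both forms look alike, but on a higher bus number they differ. The helper below is a minimal sketch of the conversion; the function name bdf_to_busid is hypothetical, and it assumes the short lspci form without a PCI domain prefix.

```shell
#!/bin/sh
# bdf_to_busid BDF
# Converts a short lspci address such as "01:00.0" (hexadecimal) into the
# decimal "PCI:bus:device:function" form used by BusID in xorg.conf.
# Assumes no domain prefix (pass "01:00.0", not "0000:01:00.0").
bdf_to_busid() {
    bdf=$1
    bus=${bdf%%:*}      # text before the first ':'
    rest=${bdf#*:}      # "device.function"
    dev=${rest%%.*}
    fn=${rest#*.}
    printf 'PCI:%d:%d:%d\n' "$((0x$bus))" "$((0x$dev))" "$((0x$fn))"
}
```

For the card in this setup, bdf_to_busid 01:00.0 prints PCI:1:0:0, which matches the BusID line above.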

rider@cloud:~$ export DISPLAY=:0
rider@cloud:~$ glxinfo -display :0

# 3D acceleration appears to work without trouble: direct rendering is reported as "Yes".
name of display: :0.0
display: :0  screen: 0
direct rendering: Yes
server glx vendor string: NVIDIA Corporation
server glx version string: 1.4
server glx extensions:
    GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_SGIX_fbconfig, 
    GLX_SGIX_pbuffer, GLX_SGI_video_sync, GLX_SGI_swap_control, 
    GLX_EXT_texture_from_pixmap, GLX_ARB_create_context, GLX_ARB_multisample, 
    GLX_NV_float_buffer, GLX_ARB_fbconfig_float, GLX_EXT_framebuffer_sRGB
client glx vendor string: NVIDIA Corporation
client glx version string: 1.4
client glx extensions:
    GLX_ARB_get_proc_address, GLX_ARB_multisample, GLX_EXT_visual_info, 
    GLX_EXT_visual_rating, GLX_EXT_import_context, GLX_SGI_video_sync, 
    GLX_NV_swap_group, GLX_NV_video_out, GLX_SGIX_fbconfig, GLX_SGIX_pbuffer, 
    GLX_SGI_swap_control, GLX_ARB_create_context, GLX_NV_float_buffer, 
    GLX_ARB_fbconfig_float, GLX_EXT_fbconfig_packed_float, 
    GLX_EXT_texture_from_pixmap, GLX_EXT_framebuffer_sRGB, 
    GLX_NV_present_video, GLX_NV_multisample_coverage
GLX version: 1.3
GLX extensions:
    GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_SGIX_fbconfig, 
    GLX_SGIX_pbuffer, GLX_SGI_video_sync, GLX_SGI_swap_control, 
    GLX_EXT_texture_from_pixmap, GLX_ARB_create_context, GLX_ARB_multisample, 
    GLX_NV_float_buffer, GLX_ARB_fbconfig_float, GLX_EXT_framebuffer_sRGB, 
    GLX_ARB_get_proc_address
OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: GeForce 9800 GT/PCI/SSE2
OpenGL version string: 3.0.0 NVIDIA 180.44
OpenGL shading language version string: 1.30 NVIDIA via Cg compiler

1.3 NVIDIA Driver HowTo on the Xen Kernel

In this case, we use driver version 180.22 or 180.44 (x86_64) with the Xen + Lustre kernel.
# Test1 - Success
rider@cloud:~/nvidia/driver$ export IGNORE_XEN_PRESENCE=1
rider@cloud:~/nvidia/driver$ export SYSSRC=/lib/modules/2.6.22.9/source
rider@cloud:~/nvidia/driver$ export SYSOUT=/lib/modules/2.6.22.9/build
rider@cloud:~/nvidia/driver$ sudo IGNORE_XEN_PRESENCE=1 CC="gcc -DNV_VMAP_4_PRESENT -DNV_SIGNAL_STRUCT_RLIM" ./NVIDIA-Linux-x86_64-180.44-pkg2.run --kernel-source-path=/usr/src/linux/
rider@cloud:~$ sudo modprobe -l | grep nv

/lib/modules/2.6.22.9/kernel/drivers/video/nvidia.ko

rider@cloud:~/nvidia/driver$ sudo modprobe nvidia
rider@cloud:~/nvidia/driver$ dmesg | grep NVIDIA

NV Driver: 180.22
NVRM: loading NVIDIA UNIX x86_64 Kernel Module  180.22  Tue Jan  6 09:15:58 PST 2009

NV Driver: 180.44
NVRM: loading NVIDIA UNIX x86_64 Kernel Module  180.44  Tue Mar 24 05:46:32 PST 2009
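
Because two driver versions are in play here (180.22 and 180.44), it can help to confirm which one the kernel actually loaded. The sketch below pulls the version number out of the NVRM line shown above; the function name nvrm_version is hypothetical, and the pattern assumes the log line format printed by these two driver releases.

```shell
#!/bin/sh
# nvrm_version
# Reads kernel log text on stdin and prints the driver version found in the
# most recent "NVRM: loading NVIDIA UNIX ... Kernel Module" line.
nvrm_version() {
    sed -n 's/.*NVRM: loading NVIDIA UNIX [^ ]* Kernel Module  *\([0-9][0-9.]*\).*/\1/p' |
        tail -n 1
}
```

Typical use would be: dmesg | nvrm_version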

# Test2 - Failure
rider@cloud:~/nvidia/driver$ export IGNORE_XEN_PRESENCE=1
rider@cloud:~/nvidia/driver$ export SYSSRC=/lib/modules/2.6.22.9/source
rider@cloud:~/nvidia/driver$ export SYSOUT=/lib/modules/2.6.22.9/build
rider@cloud:~/nvidia/driver$ ./NVIDIA-Linux-x86_64-180.22-pkg2.run --extract-only
rider@cloud:~/nvidia/driver$ cd ./NVIDIA-Linux-x86_64-180.22-pkg2/usr/src/nv/
rider@cloud:~/nvidia/driver/NVIDIA-Linux-x86_64-180.22-pkg2/usr/src/nv$ CC="gcc -DNV_VMAP_4_PRESENT -DNV_SIGNAL_STRUCT_RLIM" make SYSSRC=/lib/modules/2.6.22.9/source SYSOUT=/lib/modules/2.6.22.9/build module
rider@cloud:~/nvidia/driver/NVIDIA-Linux-x86_64-180.22-pkg2/usr/src/nv$ sudo mkdir -p /lib/modules/2.6.22.9/extra
rider@cloud:~/nvidia/driver/NVIDIA-Linux-x86_64-180.22-pkg2/usr/src/nv$ sudo cp nvidia.ko /lib/modules/2.6.22.9/extra/
rider@cloud:~/nvidia/driver/NVIDIA-Linux-x86_64-180.22-pkg2/usr/src/nv$ sudo depmod -a
rider@cloud:~/nvidia/driver/NVIDIA-Linux-x86_64-180.22-pkg2/usr/src/nv$ sudo modprobe nvidia

errMsg: nvidia: Unknown symbol __phys_addr

PS: Files modified for Test1 and Test2

#Kernel Source (Test1)
/usr/src/linux/include/asm/smp.h
/usr/src/linux/include/xen/interface/memory.h

#NVIDIA Source (Test2)
NVIDIA-Linux-x86_64-180.22-pkg2/usr/src/nv/nv.c
NVIDIA-Linux-x86_64-180.22-pkg2/usr/src/nv/nv-vm.c
NVIDIA-Linux-x86_64-180.22-pkg2/usr/src/nv/conftest.sh
NVIDIA-Linux-x86_64-180.22-pkg2/usr/src/nv/Makefile.kbuild
NVIDIA-Linux-x86_64-180.22-pkg2/usr/src/nv/nv-linux.h
NVIDIA-Linux-x86_64-180.22-pkg2/usr/src/nv/os-interface.c
NVIDIA-Linux-x86_64-180.22-pkg2/usr/src/nv/nv-linux.h_old
NVIDIA-Linux-x86_64-180.22-pkg2/usr/src/nv/conftest.sh_old

1.4 NVIDIA GPU Status Check

rock@cloud:~$ sudo nvidia-xconfig -query-gpu-info

# GPU Status check
Number of GPUs: 1

GPU #0:
  Name      : GeForce 9800 GT
  PCI BusID : PCI:1:0:0

  Number of Display Devices: 1

  Display Device 0 (CRT-0):
     EDID Name             : ViewSonic VA721
     Minimum HorizSync     : 30.000 kHz
     Maximum HorizSync     : 82.000 kHz
     Minimum VertRefresh   : 50 Hz
     Maximum VertRefresh   : 85 Hz
     Maximum PixelClock    : 140.000 MHz
     Maximum Width         : 1280 pixels
     Maximum Height        : 1024 pixels
     Preferred Width       : 1280 pixels
     Preferred Height      : 1024 pixels
     Preferred VertRefresh : 60 Hz
     Physical Width        : 340 mm
     Physical Height       : 270 mm

rock@cloud:~$ sudo nvidia-smi

Gpus found in probe:
Found Gpuid 0x1000
Attaching all probed Gpus...OK
Getting unit information...OK
Getting all static information..

Part 2 Xen PCI Express configuration HowTo

2.1 Xen_Kernel_config

CONFIG_XEN_PCIDEV_FRONTEND=y
CONFIG_XEN_PCIDEV_BACKEND=y
# CONFIG_XEN_PCIDEV_BACKEND_PASS is not set
CONFIG_XEN_PCIDEV_BACKEND_VPCI=y
# CONFIG_XEN_PCIDEV_BACKEND_SLOT is not set

2.2 DEV_IDs confirmation

rider@cloud:~$ sudo lspci -vvn

01:00.0 0300: 10de:0605 (rev a2) ---> 10de:0605 = vendor ID:device ID
	Subsystem: 1043:82a0
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin A routed to IRQ 16
	Region 0: Memory at fd000000 (32-bit, non-prefetchable) [disabled] [size=16M]
	Region 1: Memory at d0000000 (64-bit, prefetchable) [disabled] [size=256M]
	Region 3: Memory at fa000000 (64-bit, non-prefetchable) [disabled] [size=32M]
	Region 5: I/O ports at dc80 [disabled] [size=128]
	Expansion ROM at fea00000 [disabled] [size=128K]
	Capabilities: [60] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [68] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
		Address: 0000000000000000  Data: 0000
	Capabilities: [78] Express (v1) Endpoint, MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <4us
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported-
			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x16, ASPM L0s L1, Latency L0 <512ns, L1 <1us
			ClockPM- Suprise- LLActRep- BwNot-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
	Kernel driver in use: nvidia
	Kernel modules: nvidia, nvidiafb
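
The vendor and device IDs are needed again for the quirks file in 2.3.4, so reading them out with a script avoids copy errors. The following sketch filters numeric lspci output for one slot; the function name pci_ids_for is hypothetical, and it assumes the column layout shown above (slot, class, vendor:device).

```shell
#!/bin/sh
# pci_ids_for SLOT
# Reads "lspci -n" style output on stdin and prints the vendor:device pair
# of the given slot (e.g. "01:00.0").
pci_ids_for() {
    slot=$1
    # Field 1 is the slot, field 2 the class ("0300:"), field 3 "vendor:device".
    awk -v slot="$slot" '$1 == slot { print $3; exit }'
}
```

Typical use would be: lspci -n | pci_ids_for 01:00.0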

2.3 PCI Backend Configuration

2.3.1 Binding at Boot

# Important: You'll need to unload the NVIDIA driver before setting up the pci-backend.
rider@cloud:~$ sudo modprobe -r nvidia
rider@cloud:~$ sudo vim /boot/grub/menu.lst

module  /boot/vmlinuz-2.6.22.9 root=UUID=d3fa560e-7071-46d8-a168-036f40960c7b ro console=tty0 pciback.hide=(0000:01:00.0)
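
After rebooting with the new menu.lst entry, it is worth checking that the option really reached the kernel. The sketch below inspects the kernel command line for a pciback.hide entry covering a given device; the function name pciback_hidden is hypothetical, and the optional second argument exists only so the check can be tried against a sample file.

```shell
#!/bin/sh
# pciback_hidden BDF [CMDLINE_FILE]
# Succeeds when the kernel command line carries a pciback.hide= option whose
# value contains "(BDF)". CMDLINE_FILE defaults to /proc/cmdline.
pciback_hidden() {
    bdf=$1
    cmdline=${2:-/proc/cmdline}
    # Put each option on its own line, keep the pciback.hide option,
    # then look for the literal "(BDF)" inside it.
    tr ' ' '\n' < "$cmdline" | grep '^pciback\.hide=' | grep -qF -- "($bdf)"
}
```

Typical use would be: pciback_hidden 0000:01:00.0 && echo "device is hidden from Dom0"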

2.3.2 Late Binding

rider@cloud:~$ sudo modprobe -r nvidia
rider@cloud:~$ sudo vim /etc/modprobe.d/blacklist

# nvidia driver autoload-disabled
blacklist nvidia

rider@cloud:~$ sudo su -
# Hide the device from Dom0 so pciback can take control (the unbind path only exists while the nvidia driver is still loaded).
root@cloud:~$ echo -n "0000:01:00.0" > /sys/bus/pci/drivers/nvidia/unbind

# Pass the device address to pciback: register a new slot, then bind the device.
root@cloud:~$ echo -n "0000:01:00.0" > /sys/bus/pci/drivers/pciback/new_slot
root@cloud:~$ echo -n "0000:01:00.0" > /sys/bus/pci/drivers/pciback/bind

# You can use an initialization script to bind the PCIe device to pciback at startup.

root@cloud:~$ cat /sys/bus/pci/drivers/pciback/slots

0000:01:00.0

# Caution: Make sure that the device is not controlled by any driver: there should be no driver symlink for nvidia.

PATH: /sys/bus/pci/devices/0000:01:00.0/
driver -> ../../../../bus/pci/drivers/nvidia ---> This symlink shouldn't exist.
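
The late-binding steps above can be collected into one function, for example to call from an initialization script. This is a minimal sketch, not the authors' script: the function name bind_to_pciback is hypothetical, and the optional second argument (a sysfs root other than /sys) is an assumption added so the logic can be dry-run against a scratch directory.

```shell
#!/bin/sh
# bind_to_pciback BDF [SYSFS_ROOT]
# Releases BDF from its current driver (if any) and hands it to pciback.
# SYSFS_ROOT defaults to /sys.
bind_to_pciback() {
    bdf=$1
    sys=${2:-/sys}
    dev="$sys/bus/pci/devices/$bdf"
    pciback="$sys/bus/pci/drivers/pciback"

    [ -d "$pciback" ] || { echo "pciback driver not present" >&2; return 1; }

    # The "driver" symlink exists only while some driver owns the device.
    if [ -e "$dev/driver/unbind" ]; then
        printf '%s' "$bdf" > "$dev/driver/unbind"
    fi

    # Register the slot with pciback, then bind the device to it.
    printf '%s' "$bdf" > "$pciback/new_slot"
    printf '%s' "$bdf" > "$pciback/bind"
}
```

Run as root, the call would be: bind_to_pciback 0000:01:00.0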

2.3.3 Permissive Flag

rider@cloud:~$ sudo vim /etc/xen/xend-pci-permissive.sxp

(unconstrained_dev_ids
     # Entries use <vendor>:<device>:<subvendor>:<subdevice>, e.g. ('0123:4567:89AB:CDEF')
     ('10de:0605:1043:82a0')
)

2.3.4 User-space Quirks

rider@cloud:~$ sudo vim /etc/xen/xend-pci-quirks.sxp

(pci_ids
   # Entries are formatted as follows:
   #     <vendor>:<device>[:<subvendor>:<subdevice>]

   ('10de:0605'   # NVIDIA 9800GT
   )
)

2.3.5 PCI Frontend Configuration

rider@cloud:~$ sudo vim /etc/xen/vm01.cfg

# Create a new virtual machine named "vm01"; an example pci configuration follows.
# Here the PCI Express device (the GPU) at 01:00.0 is passed to the guest.
pci = ['01:00.0']

01:00.0 --> PCI Express
00:01.0 --> PCI bridge: Intel Corporation 82Q35 Express PCI Express Root Port
00:1d.0 --> USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller
00:1d.1 --> USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller
00:1d.2 --> USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller
00:1d.7 --> USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller

rider@cloud:/etc/xen$ sudo xm create vm01.cfg
rider@cloud:/etc/xen$ dmesg | grep pciback

pciback 0000:01:00.0: seizing device
pciback: vpci: 0000:01:00.0: assign to virtual slot 0

rider@cloud:~$ sudo xm console vm01
vm01:~# dmesg | grep pci

pcifront pci-0: Installing PCI frontend
pcifront pci-0: Creating PCI Frontend Bus 0000:00

Part 3 Running CUDA on Xen HowTo

3.1 Create a virtual machine for CUDA

rider@cloud:~$ sudo vim /etc/xen-tools/xen-tools.conf

dir = /home
install-method = debootstrap
size   = 6Gb      # Disk image size.
memory = 512Mb    # Memory size
swap   = 128Mb    # Swap size
fs     = ext3     # use the EXT3 filesystem for the disk image.
dist   = hardy    # Default distribution to install. ---> For CUDA Support (Ubuntu 8.04)
image  = sparse   # Specify sparse vs. full disk images.
gateway   = 140.XXX.XXX.XXX
netmask   = 255.255.255.0
broadcast = 140.XXX.XXX.XXX
kernel      = /boot/vmlinuz-`uname -r`
initrd      = /boot/initrd.img-`uname -r`
mirror = http://gb.archive.ubuntu.com/ubuntu/
ext3_options   = noatime,nodiratime,errors=remount-ro
ext2_options   = noatime,nodiratime,errors=remount-ro
xfs_options    = defaults
reiser_options = defaults

rider@cloud:~$ sudo xen-create-image --hostname cuda --ip 140.XXX.XXX.XXX

3.2 Running CUDA Example on VirtualMachine

Step1:
# VirtualMachine startup
rider@cloud:~$ sudo xm create cuda.cfg

Step2:
# Remote login
rider@350Z:~$ ssh 140.xxx.xxx.xxx
# Local login
rider@cloud:~$ sudo xm console cuda

Step3:
# Install the NVIDIA CUDA toolkit and SDK. See Part 1, 1.1 - Basic Environment

Step4:
# Build your own CUDA project or run the CUDA SDK examples as a test. See Part 3, 3.3 - Running CUDA Example on Xen

# Example:Device Bandwidth
rider@cuda:/usr/local/NVIDIA_CUDA_SDK/bin/linux/release$ sudo ./bandwidthTest

(Running on Xen_VirtualMachine)
Device 0: "GeForce 9800 GT"
Quick Mode
Host to Device Bandwidth for Pageable memory
.
Transfer Size (Bytes)	Bandwidth(MB/s)
 33554432		31999998.0

Quick Mode
Device to Host Bandwidth for Pageable memory
.
Transfer Size (Bytes)	Bandwidth(MB/s)
 33554432		320000000.0

Quick Mode
Device to Device Bandwidth
.
Transfer Size (Bytes)	Bandwidth(MB/s)
 33554432		640000000.0

&&&& Test PASSED

# Example:Device Query
rider@cuda:/usr/local/NVIDIA_CUDA_SDK/bin/linux/release$ sudo ./deviceQuery

(Running on Xen_VirtualMachine)
Device 0: "GeForce 9800 GT"
  Major revision number:                         0
  Minor revision number:                         0
  Total amount of global memory:                 6385920 bytes
  Number of multiprocessors:                     11007
  Number of cores:                               88056
  Total amount of constant memory:               6385872 bytes
  Total amount of shared memory per block:       3236702400 bytes
  Total number of registers available per block: 6385904
  Warp size:                                     0
  Maximum number of threads per block:           0
  Maximum sizes of each dimension of a block:    0 x 6385808 x 0
  Maximum sizes of each dimension of a grid:     0 x 0 x 2
  Maximum memory pitch:                          3234490924 bytes
  Texture alignment:                             3236702608 bytes
  Clock rate:                                    0.00 GHz
  Concurrent copy and execution:                 Yes

3.3 Running CUDA Example on Xen

Important:
In this case, we have to use "gcc-4.1" and "g++-4.1" instead of "gcc-4.3" to avoid the following stdio error.

/usr/include/bits/stdio2.h(...): error: identifier "__builtin_va_arg_pack" is undefined
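
The page does not say how the compiler was switched. One common approach, shown here only as an assumption, is a small directory holding symlinks named gcc and g++ that point at the 4.1 binaries, which nvcc can then be told to use through its --compiler-bindir option. The function name make_gcc41_shim and the shim directory are hypothetical; the gcc-4.1 and g++-4.1 packages are assumed to be installed under /usr/bin.

```shell
#!/bin/sh
# make_gcc41_shim SHIM_DIR [BIN_DIR]
# Creates SHIM_DIR with "gcc" and "g++" symlinks pointing at the 4.1 compilers
# in BIN_DIR (default /usr/bin). Existing links are replaced.
make_gcc41_shim() {
    shim=$1
    bindir=${2:-/usr/bin}
    mkdir -p "$shim" || return 1
    ln -sf "$bindir/gcc-4.1" "$shim/gcc"
    ln -sf "$bindir/g++-4.1" "$shim/g++"
}
```

After make_gcc41_shim "$HOME/gcc41-shim", a single file could be compiled with: nvcc --compiler-bindir "$HOME/gcc41-shim" example.cu (example.cu is a placeholder). How the SDK Makefile should pick this up depends on its common.mk, which is not shown here.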

Example HowTo :
rider@cloud:~/opt/NVIDIA_CUDA_SDK$ sudo make
rider@cloud:~/opt/NVIDIA_CUDA_SDK$ cd ./bin/linux/release/
rider@cloud:~/opt/NVIDIA_CUDA_SDK/bin/linux/release$ ./bandwidthTest
rider@cloud:~/opt/NVIDIA_CUDA_SDK/bin/linux/release$ ./deviceQuery

### Demo Example ###

(Running on Xen + Lustre Kernel)
Running on......
      device 0:GeForce 9800 GT
Quick Mode
Host to Device Bandwidth for Pageable memory
.
Transfer Size (Bytes)	Bandwidth(MB/s)
 33554432		1574.6

Quick Mode
Device to Host Bandwidth for Pageable memory
.
Transfer Size (Bytes)	Bandwidth(MB/s)
 33554432		1187.9

Quick Mode
Device to Device Bandwidth
.
Transfer Size (Bytes)	Bandwidth(MB/s)
 33554432		41442.7

&&&& Test PASSED

Press ENTER to exit...

(Running on Xen + Lustre Kernel)
There is 1 device supporting CUDA

Device 0: "GeForce 9800 GT"
  Major revision number:                         1
  Minor revision number:                         1
  Total amount of global memory:                 1073414144 bytes
  Number of multiprocessors:                     14
  Number of cores:                               112
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       16384 bytes
  Total number of registers available per block: 8192
  Warp size:                                     32
  Maximum number of threads per block:           512
  Maximum sizes of each dimension of a block:    512 x 512 x 64
  Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
  Maximum memory pitch:                          262144 bytes
  Texture alignment:                             256 bytes
  Clock rate:                                    1.51 GHz
  Concurrent copy and execution:                 Yes
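
The two deviceQuery listings in Part 3 disagree: the virtual machine run in 3.2 reports a warp size of 0, while the run above reports 32. A warp size of 0 is not a valid device property, which suggests the guest did not read the device correctly. The sketch below extracts the warp size from deviceQuery output so that such a run can be flagged automatically; the function name warp_size is hypothetical, and it assumes the output layout shown in this part.

```shell
#!/bin/sh
# warp_size
# Reads deviceQuery output on stdin and prints the reported warp size.
warp_size() {
    awk -F: '/Warp size/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}
```

A check could then look like: [ "$(./deviceQuery | warp_size)" = "32" ] || echo "deviceQuery returned implausible values"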

Part 4 Running 3D-Applications on Xen by using Virtualizing OpenGL

4.1 Build VNC for Dom0 remote-desktop

# If you run into locale errors, set the locale environment variables as shown below.
# This section is for those who want to access the Dom0 desktop remotely.
# tightvnc or turbovnc can be used instead of gdm.

rider@cloud:~$ sudo vim /etc/profile

# Locale
export LANGUAGE="en_US.UTF-8"
export LC_ALL="en_US.UTF-8"
export LANG="en_US.UTF-8"

rider@cloud:~$ source /etc/profile
rider@cloud:~$ sudo dpkg-reconfigure locales
rider@cloud:~$ sudo apt-get install autoconf automake build-essential gcc htop iptraf make libtool gdm sun-java6-jdk xdebconfigurator xfonts-100dpi xfonts-75dpi xfs xfonts-base xutils-dev tightvncserver ubuntu-desktop
rider@cloud:~$ sudo vim /etc/gdm/gdm.conf

DisallowTCP=false

# VNC Server
[server-VNC]
name=VNC server
command=/usr/bin/Xvnc -geometry 1024x768 -depth 24
flexible=true

rider@cloud:~$ sudo perl -pi.bak -e 's/^0=Standard/0=VNC/g' /etc/gdm/gdm.conf

rider@cloud:~$ sudo /etc/init.d/gdm restart
rider@cloud:~$ netstat -tunlp

Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:5900            0.0.0.0:*               LISTEN      -                 ---> Xvnc port
tcp        0      0 0.0.0.0:5901            0.0.0.0:*               LISTEN      3877/Xtightvnc  
tcp        0      0 0.0.0.0:6000            0.0.0.0:*               LISTEN      -               
tcp        0      0 0.0.0.0:6001            0.0.0.0:*               LISTEN      3877/Xtightvnc  
tcp        0      0 127.0.0.1:631           0.0.0.0:*               LISTEN      -               
tcp6       0      0 :::22                   :::*                    LISTEN      -               
tcp6       0      0 ::1:631                 :::*                    LISTEN      -               
udp        0      0 0.0.0.0:32769           0.0.0.0:*                           -               
udp        0      0 0.0.0.0:5353            0.0.0.0:*                           -

4.1.1 Kernel Module

# Make sure that the "nvidia" module is loaded; you can also try loading nvidiafb.
rider@cloud:~$ lsmod

Module                  Size  Used by
bridge                 55080  0 
llc                    10544  1 bridge
iptable_filter          6784  0 
ip_tables              20824  1 iptable_filter
x_tables               18952  1 ip_tables
loop                   19076  0 
nvidia               8111512  36 
i2c_core               24960  1 nvidia
e1000                 120000  0 
e1000e                131500  0 
fuse                   43312  3
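
Reading the lsmod table by eye is fine once; in a startup script the same check can be automated. The sketch below tests whether a module appears in the first column of lsmod output. The function name module_loaded is hypothetical.

```shell
#!/bin/sh
# module_loaded NAME
# Reads lsmod output on stdin; succeeds if NAME is listed as a loaded module.
module_loaded() {
    # Column 1 of lsmod output is the module name; match it exactly so that
    # "nvidia" in the "Used by" column of another module does not count.
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}
```

Typical use would be: lsmod | module_loaded nvidia || sudo modprobe nvidia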

4.2 Build VNC for DomU remote-desktop

rider@nv:~$ sudo vim /etc/profile

# Locale
export LANGUAGE="en_US.UTF-8"
export LC_ALL="en_US.UTF-8"
export LANG="en_US.UTF-8"

rider@nv:~$ source /etc/profile
rider@nv:~$ sudo dpkg-reconfigure locales
rider@nv:~$ sudo apt-get install autoconf automake build-essential gcc htop iptraf make libtool gdm sun-java6-jdk xdebconfigurator xfonts-100dpi xfonts-75dpi xfs xfonts-base xutils-dev tightvncserver ubuntu-desktop
rider@nv:~$ sudo vim /etc/gdm/gdm.conf

DisallowTCP=false

# VNC Server
[server-VNC]
name=VNC server
command=/usr/bin/Xvnc -geometry 1024x768 -depth 24
flexible=true

rider@nv:~$ sudo perl -pi.bak -e 's/^0=Standard/0=VNC/g' /etc/gdm/gdm.conf

rider@nv:~$ sudo /etc/init.d/gdm restart

4.3 Build VMGL HowTo

4.3.1 VMGL installation on Dom0 HowTo

rider@cloud:~/vmgl$ sudo apt-get install build-essential mesa-common-dev libglu1-mesa-dev mesa-utils libxmu-headers libxmu6 libxmu-dev zlib1g-dev libjpeg62 libjpeg62-dev xutils-dev libxaw-headers libxaw7 libxaw7-dev libxext6 libxext-dev rxvt lwm xauth xvfb xfonts-100dpi xfonts-75dpi culmus xfonts-scalable xfonts-base
rider@cloud:~/vmgl$ wget http://www.cs.toronto.edu/~andreslc/software/vmgl-0.1.tar.bz2
rider@cloud:~/vmgl$ tar jxvf vmgl-0.1.tar.bz2
rider@cloud:~/vmgl$ cd ./vmgl.hg/tightvnc/
rider@cloud:~/vmgl/vmgl.hg/tightvnc$ patch -p0 < ../../tightvnc-1.2.9-amd64support.patch
rider@cloud:~/vmgl/vmgl.hg/tightvnc$ cd ..
# If necessary: provide gmake as an alias for make.
rider@cloud:~/vmgl/vmgl.hg$ sudo ln -sf /usr/bin/make /usr/bin/gmake
rider@cloud:~/vmgl/vmgl.hg$ make -j 4
rider@cloud:~/vmgl/vmgl.hg$ sudo make install-host
rider@cloud:~$ xauth

# Set the authority for remote guest.

Using authority file /home/rider/.Xauthority
xauth> add guest/unix:10  MIT-MAGIC-COOKIE-1  ec0ffd387888b9749d55f88031505888
xauth> add guest/unix:1  MIT-MAGIC-COOKIE-1  6824789b4ce0ac5743aeb57fd3ef8f5b
xauth> exit

4.3.2 VMGL installation on DomU or Guest OS HowTo

rider@guest:~/vmgl$ sudo apt-get install build-essential mesa-common-dev libglu1-mesa-dev mesa-utils libxmu-headers libxmu6 libxmu-dev zlib1g-dev libjpeg62 libjpeg62-dev xutils-dev libxaw-headers libxaw7 libxaw7-dev libxext6 libxext-dev rxvt lwm xauth xvfb xfonts-100dpi xfonts-75dpi culmus xfonts-scalable xfonts-base

# Packages chosen for lightweight basic windowing support:

xfonts-base xfonts-100dpi xfonts-75dpi -> fonts support
rxvt -> x-terminal-emulator for VNC
lwm  -> x-window-manager for VNC

rider@guest:~/vmgl$ wget http://www.cs.toronto.edu/~andreslc/software/vmgl-0.1.tar.bz2
rider@guest:~/vmgl$ tar jxvf vmgl-0.1.tar.bz2
rider@guest:~/vmgl$ cd ./vmgl.hg/tightvnc/
rider@guest:~/vmgl/vmgl.hg/tightvnc$ patch -p0 < ../../tightvnc-1.2.9-amd64support.patch
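rider@guest:~/vmgl/vmgl.hg/tightvnc$ cd ..
# If necessary: provide gmake as an alias for make.
rider@guest:~/vmgl/vmgl.hg$ sudo ln -sf /usr/bin/make /usr/bin/gmake
# If necessary: create the Xorg extensions directory.
rider@guest:~/vmgl/vmgl.hg$ sudo mkdir -p /usr/lib/xorg/modules/extensions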
rider@guest:~/vmgl/vmgl.hg$ make
rider@guest:~/vmgl/vmgl.hg$ sudo make install-guest

# Fix the rgb path problem.
rider@guest:~/vmgl/vmgl.hg$ sudo mkdir -p /usr/X11R6/lib/X11
rider@guest:~/vmgl/vmgl.hg$ sudo ln -sf /etc/X11/rgb.txt /usr/X11R6/lib/X11/rgb

4.4 VMGL Usage Notes

4.4.1 VMGL on Dom0 (host)

# Start the host VMGL stub-daemon.
rider@cloud:~$ export DISPLAY=:0
rider@cloud:~$ stub-daemon
rider@cloud:~$ netstat -tunlp

Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:8002            0.0.0.0:*               LISTEN      -               
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      -               
tcp        0      0 127.0.0.1:631           0.0.0.0:*               LISTEN      -               
tcp        0      0 0.0.0.0:7000            0.0.0.0:*               LISTEN      29082/stub-daemon ---> VMGL stub-daemon
udp        0      0 0.0.0.0:32769           0.0.0.0:*                           -               
udp        0      0 0.0.0.0:5353            0.0.0.0:*                           -

Using X forwarding
rider@cloud:~$ ssh -X guest
rider@guest:~$ glxinfo

name of display: localhost:10.0
display: localhost:10  screen: 0
direct rendering: Yes
server glx vendor string: VMGL
server glx version string: 1.2 VMGL
server glx extensions:
client glx vendor string: VMGL
client glx version string: 1.2 VMGL
client glx extensions:
GLX version: 1.3
GLX extensions:
OpenGL extensions:
    GL_ARB_depth_texture, GL_ARB_fragment_program, GL_ARB_multisample,
    GL_ARB_multitexture, GL_ARB_occlusion_query, GL_ARB_point_parameters,
    GL_ARB_point_sprite, GL_ARB_shadow, GL_ARB_texture_border_clamp,
    GL_ARB_texture_compression, GL_ARB_texture_cube_map,
    GL_ARB_texture_env_add, GL_ARB_texture_env_combine,
    GL_EXT_texture_env_combine, GL_ARB_texture_env_dot3,
    GL_EXT_texture_env_dot3, GL_ARB_texture_mirrored_repeat,
    GL_ARB_texture_non_power_of_two, GL_ARB_transpose_matrix,
    GL_ARB_vertex_buffer_object, GL_ARB_vertex_program, GL_ARB_window_pos,
    GL_EXT_blend_color, GL_EXT_blend_minmax, GL_EXT_blend_func_separate,
    GL_EXT_blend_subtract, GL_EXT_texture_env_add, GL_EXT_fog_coord,
    GL_EXT_multi_draw_arrays, GL_EXT_secondary_color, GL_EXT_shadow_funcs,
    GL_EXT_stencil_wrap, GL_EXT_texture_cube_map, GL_EXT_texture_edge_clamp,
    GL_EXT_texture_filter_anisotropic, GL_EXT_texture_lod_bias,
    GL_EXT_texture_object, GL_EXT_texture3D, GL_EXT_bgra,
    GL_IBM_rasterpos_clip, GL_NV_fog_distance, GL_NV_fragment_program,
    GL_NV_register_combiners, GL_NV_register_combiners2,
    GL_NV_texgen_reflection, GL_NV_texture_rectangle, GL_NV_vertex_program,
    GL_NV_vertex_program1_1, GL_NV_vertex_program2, GL_SGIS_generate_mipmap,
    GL_CR_state_parameter, GL_CR_cursor_position, GL_CR_bounding_box,
    GL_CR_print_string, GL_CR_tilesort_info, GL_CR_synchronization,
    GL_CR_head_spu_name, GL_CR_performance_info, GL_CR_window_size,
    GL_CR_tile_info, GL_CR_saveframe, GL_CR_readback_barrier_size,
    GL_CR_server_id_sharing, GL_CR_server_matrix
    GLX_ARB_multisample
OpenGL vendor string: H. Andres Lagar-Cavilla
OpenGL renderer string: VMGL
OpenGL version string: 1.5 VMGL 1.9
0x52 24 dc  0 32  0 r  y  .  8  8  8  0  4 24  8 16 16 16 16  2 1 None
0x53 24 dc  0 32  0 r  y  .  8  8  8  8  4 24  8 16 16 16 16  2 1 None
0x54 24 dc  0 32  0 r  y  .  8  8  8  0  4 24  8 16 16 16 16  4 1 None
0x55 24 dc  0 32  0 r  y  .  8  8  8  8  4 24  8 16 16 16 16  4 1 None
0x56 24 dc  0 32  0 r  .  .  8  8  8  0  4 24  8 16 16 16 16  2 1 None
0x57 24 dc  0 32  0 r  .  .  8  8  8  8  4 24  8 16 16 16 16  2 1 None
0x58 24 dc  0 32  0 r  .  .  8  8  8  0  4 24  8 16 16 16 16  4 1 None
0x59 24 dc  0 32  0 r  .  .  8  8  8  8  4 24  8 16 16 16 16  4 1 None
0x23 32 tc  0 32  0 r  y  .  8  8  8  0  4 24  8 16 16 16 16  0 0 None
0x5a 32 tc  0 32  0 r  y  .  8  8  8  8  4 24  8 16 16 16 16  0 0 None
0x5b 32 tc  0 32  0 r  .  .  8  8  8  0  4 24  8 16 16 16 16  0 0 None
0x5c 32 tc  0 32  0 r  .  .  8  8  8  8  4 24  8 16 16 16 16  0 0 None
0x5d 32 tc  0 32  0 r  y  .  8  8  8  0  4 24  0 16 16 16 16  0 0 None
0x5e 32 tc  0 32  0 r  y  .  8  8  8  8  4 24  0 16 16 16 16  0 0 None
0x5f 32 tc  0 32  0 r  .  .  8  8  8  0  4 24  0 16 16 16 16  0 0 None
0x60 32 tc  0 32  0 r  .  .  8  8  8  8  4 24  0 16 16 16 16  0 0 None
0x61 32 tc  0 32  0 r  y  .  8  8  8  0  4  0  0 16 16 16 16  0 0 None
0x62 32 tc  0 32  0 r  y  .  8  8  8  8  4  0  0 16 16 16 16  0 0 None
0x63 32 tc  0 32  0 r  .  .  8  8  8  0  4  0  0 16 16 16 16  0 0 None
0x64 32 tc  0 32  0 r  .  .  8  8  8  8  4  0  0 16 16 16 16  0 0 None
0x65 32 tc  0 32  0 r  y  .  8  8  8  0  4 24  0 16 16 16 16  2 1 None
0x66 32 tc  0 32  0 r  y  .  8  8  8  8  4 24  0 16 16 16 16  2 1 None
0x67 32 tc  0 32  0 r  y  .  8  8  8  0  4 24  0 16 16 16 16  4 1 None
0x68 32 tc  0 32  0 r  y  .  8  8  8  8  4 24  0 16 16 16 16  4 1 None
0x69 32 tc  0 32  0 r  .  .  8  8  8  0  4 24  0 16 16 16 16  2 1 None
0x6a 32 tc  0 32  0 r  .  .  8  8  8  8  4 24  0 16 16 16 16  2 1 None
0x6b 32 tc  0 32  0 r  .  .  8  8  8  0  4 24  0 16 16 16 16  4 1 None
0x6c 32 tc  0 32  0 r  .  .  8  8  8  8  4 24  0 16 16 16 16  4 1 None
0x6d 32 tc  0 32  0 r  y  .  8  8  8  0  4 24  8 16 16 16 16  2 1 None
0x6e 32 tc  0 32  0 r  y  .  8  8  8  8  4 24  8 16 16 16 16  2 1 None
0x6f 32 tc  0 32  0 r  y  .  8  8  8  0  4 24  8 16 16 16 16  4 1 None
0x70 32 tc  0 32  0 r  y  .  8  8  8  8  4 24  8 16 16 16 16  4 1 None
0x71 32 tc  0 32  0 r  .  .  8  8  8  0  4 24  8 16 16 16 16  2 1 None
0x72 32 tc  0 32  0 r  .  .  8  8  8  8  4 24  8 16 16 16 16  2 1 None
0x73 32 tc  0 32  0 r  .  .  8  8  8  0  4 24  8 16 16 16 16  4 1 None
0x74 32 tc  0 32  0 r  .  .  8  8  8  8  4 24  8 16 16 16 16  4 1 None

rider@guest:~$ glxgears

47819 frames in 5.0 seconds = 9563.678 FPS
46064 frames in 5.0 seconds = 9212.566 FPS
44584 frames in 5.0 seconds = 8916.581 FPS
44256 frames in 5.0 seconds = 8850.974 FPS
44688 frames in 5.0 seconds = 8937.528 FPS

4.4.2 VMGL on DomU or Guest OS

# Set up the VMGL environment on the guest.
rider@guest:~$ sudo ln -sf /usr/share/fonts/X11/ /usr/X11R6/lib/X11/fonts (if necessary - fix font path)

# FontPath:

/usr/X11R6/lib/X11/fonts
or
/usr/share/fonts/X11

rider@guest:~$ less /usr/X11R6/lib/X11/rgb.txt (if necessary - rgb path confirmation)
rider@guest:~$ sudo vim /etc/profile

GLSTUB=Cloud_IP(host_IP):7000
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib/vmgl
LD_PRELOAD=/usr/local/lib/vmgl/libGL.so
export GLSTUB LD_LIBRARY_PATH LD_PRELOAD

rider@guest:~$ source /etc/profile
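
# Optional sanity check for the /etc/profile entries above.

The sketch below is one possible way to check the values before writing them to /etc/profile. Both function names (append_path, valid_glstub) are hypothetical helpers and are not part of VMGL. The first avoids a leading colon when LD_LIBRARY_PATH starts out empty; the second only checks that GLSTUB has a host:port shape, it does not test that the host is reachable. The address in the usage note is an example value, not taken from this setup.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not shipped with VMGL.

# Append directory $2 to the colon-separated list $1 without leaving an
# empty entry when $1 is empty (glibc's loader has traditionally treated
# an empty entry as the current directory).
append_path() {
    if [ -z "$1" ]; then
        printf '%s\n' "$2"
    else
        printf '%s\n' "$1:$2"
    fi
}

# Succeed only if $1 looks like host:port with a purely numeric port.
valid_glstub() {
    host=${1%:*}     # text before the last colon
    port=${1##*:}    # text after the last colon
    [ -n "$host" ] && [ "$host" != "$1" ] || return 1
    case "$port" in
        ''|*[!0-9]*) return 1 ;;
    esac
    return 0
}
```

Possible usage, assuming 192.168.1.10 stands in for the real host address: valid_glstub "192.168.1.10:7000" && echo ok, and LD_LIBRARY_PATH=$(append_path "${LD_LIBRARY_PATH:-}" /usr/local/lib/vmgl).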

Using X forwarding
rider@guest:~$ sudo vim /etc/ssh/sshd_config

X11Forwarding yes

rider@guest:~$ sudo vim /etc/ssh/ssh_config

    ForwardX11 yes
    ForwardX11Trusted yes

rider@guest:~$ sudo /etc/init.d/ssh restart
rider@guest:~$ vncserver -geometry 1024x768 -depth 24 :1
rider@guest:~$ vncserver -geometry 1024x768 -depth 24 :2
rider@guest:~$ netstat -tunlp

(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:5901            0.0.0.0:*               LISTEN      10652/Xtightvnc ---> guest:1
tcp        0      0 0.0.0.0:5902            0.0.0.0:*               LISTEN      10630/Xtightvnc ---> guest:2
tcp        0      0 0.0.0.0:6001            0.0.0.0:*               LISTEN      10652/Xtightvnc
tcp        0      0 0.0.0.0:6002            0.0.0.0:*               LISTEN      10630/Xtightvnc
tcp        0      0 127.0.0.1:6010          0.0.0.0:*               LISTEN      -
tcp6       0      0 :::22                   :::*                    LISTEN      -
tcp6       0      0 ::1:6010                :::*                    LISTEN      -
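
# Reading the port numbers in the listing above.

The listing follows the usual VNC convention: display :N accepts viewer connections on TCP port 5900+N and X clients on 6000+N, so 5901/6001 belong to guest:1 and 5902/6002 to guest:2. The 127.0.0.1:6010 listener is most likely sshd's X11 forwarding (its display offset defaults to 10). A minimal sketch of that arithmetic, with hypothetical function names:

```shell
#!/bin/sh
# Hypothetical helpers: map a VNC display number to its TCP ports.

# Port that vncviewer connects to (RFB protocol).
vnc_rfb_port() {
    echo $((5900 + $1))
}

# Port that X clients use to reach the same display.
vnc_x11_port() {
    echo $((6000 + $1))
}
```

For example, vnc_rfb_port 2 gives the port to open in a firewall for guest:2.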

4.4.3 Using VNCViewer

# Each client user can run 3D applications with "Direct rendering", each in a separate VNC session.

rider@PC:~$ vncviewer guest:1

Run your apps from the command line in rxvt

rider@NB:~$ vncviewer guest:2

Run your apps from the command line in rxvt
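
# Optional check inside each VNC session.

One way to confirm the "Direct rendering" claim from inside a session is to look at the "direct rendering" line that glxinfo prints. The function below is a hypothetical helper that reads glxinfo-style text on standard input; it is a sketch and assumes glxinfo is installed on the guest.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the text on stdin reports direct rendering.
is_direct_rendering() {
    # glxinfo prints a line of the form "direct rendering: Yes" or "... No".
    grep -qi '^direct rendering: *yes'
}
```

Possible usage in rxvt: glxinfo | is_direct_rendering && echo "direct rendering is active".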

4.5 Customize your VM desktop

4.5.1 Enlightenment Installation

# Download the source from the website: http://www.enlightenment.org/p.php?p=download&l=en
rider@cloud:~$ sudo mkdir /opt/enlightenment
rider@cloud:~/enlightenment$ tar zxvf e16-0.16.8.15.tar.gz
rider@cloud:~/enlightenment/e16-0.16.8.15$ sudo apt-get install libimlib2 libimlib2-dev
rider@cloud:~/enlightenment/e16-0.16.8.15$ ./configure --prefix=/opt/enlightenment --with-x --enable-modules
rider@cloud:~/enlightenment/e16-0.16.8.15$ make -j 4
rider@cloud:~/enlightenment/e16-0.16.8.15$ sudo make install

Part 5 Comparing with Xen passthrough and VMGL

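	Xen Passthrough	VMGL
Method	Direct (supported by Xen)	Indirect (through the VMGL library)
Resource provisioning	One graphics card serves one VM (the card is locked to that VM; suited to hosts with multiple graphics cards)	A single graphics card can be shared by multiple VMs (performance decreases as the number of VMs grows)
Transport protocol	None required	Network transport (port 7000)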

Reference & Support:
1. NVIDIA CUDA: http://www.nvidia.com/object/cuda_home.html
2. openSUSE NVIDIA + Xen: http://en.opensuse.org/Use_Nvidia_driver_with_Xen
3. NVIDIA GPUs DEV_IDs: http://www.laptopvideo2go.com/forum/index.php?showtopic=7664
4. pci_ids db: http://www.pcidatabase.com/
5. Xen: assigning PCI devices to a domain: http://www.bestgrid.org/index.php/Xen:_assigning_PCI_devices_to_a_domain
6. Xen PCI Passthrough: http://www.wlug.org.nz/XenPciPassthrough
7. Xen Users' Manual v3.0: http://www.cl.cam.ac.uk/research/srg/netos/xen/readmes/user/
8. VT-d in Xen: http://wiki.xensource.com/xenwiki/VTdHowTo
9. VMGL support: https://code.fluendo.com/elisa/trac/wiki
10. Latest NVIDIA driver: ftp://download.nvidia.com/XFree86/Linux-x86_64/

Last modified on Feb 2, 2010, 4:22:53 PM

Attachments (5)

2.6.22.9.config (40.2 KB) - added by rider 16 years ago. 2.6.22.9_Xen_VMGL
tightvnc-1.2.9-amd64support.patch (1.5 KB) - added by rider 16 years ago. tightvnc-1.2.9_amd64_patch
config-2.6.28 (90.4 KB) - added by rider 16 years ago. 2.6.28_Vanilla_DR
xorg.conf (1.7 KB) - added by rider 16 years ago. XEN_NVIDIA_Dom0
Architecture.jpg (76.4 KB) - added by rider 16 years ago. Xen_GPU_Arch
