= Some Notes from MPICH2 Installation Guide =

== Official [http://www.mcs.anl.gov/research/projects/mpich2/documentation/files/mpich2-doc-user.pdf User's Guide] ==

 * MPICH2 test examples - cpi
   * in the official [http://www.mcs.anl.gov/research/projects/mpich2/documentation/files/mpich2-doc-user.pdf User's Guide]
{{{
mpd &
cd /home/you/mpich2-installed/examples
mpiexec -n 3 cpi
mpdallexit
}}}
   * in a Debian environment
{{{
jazz@bio-cluster-12:~$ mpd &
[1] 3551
jazz@bio-cluster-12:~$ mpiexec -n 3 /usr/share/mpich2/examples/cpi
Process 1 of 3 is on bio-cluster-12
Process 2 of 3 is on bio-cluster-12
Process 0 of 3 is on bio-cluster-12
pi is approximately 3.1415926544231318, Error is 0.0000000008333387
wall clock time = 0.001213
jazz@bio-cluster-12:~$ mpdallexit
}}}
 * Compiling and Linking
   * 4.3 Special Issues for C++
     * Some users may get error messages such as
{{{
SEEK_SET is #defined but must not be for the C++ binding of MPI
}}}
     * The problem is that both stdio.h and the MPI C++ interface use SEEK_SET, SEEK_CUR, and SEEK_END. This is really a bug in the MPI-2 standard. You can try adding the following definition to the compile command line:
{{{
-DMPICH_IGNORE_CXX_SEEK
}}}
 * 5.1 Standard mpiexec
   * Currently, MPICH2 does not fully support calling the dynamic process routines from MPI-2 (e.g., '''MPI_Comm_spawn''' or '''MPI_Comm_accept''') from processes that are not started with mpiexec.
   * Note: this might be the reason why we encounter the R-MPI '''mpi.spawn.Rslaves()''' problem.
 * 5.3.1 Basic mpiexec arguments for MPD
   * You can use mpiexec to run non-MPI programs as well. This is sometimes useful in making sure all the machines are up and ready for use.
   * Useful examples include:
{{{
mpiexec -n 10 hostname
mpiexec -n 10 printenv
}}}
 * 5.3.2 Other Command-Line Arguments to mpiexec for MPD
   * This section describes the ''machinefile'' format for specifying '''hosts''', '''number of processes''' and '''interface host name''' (ifhn)
   * i.e., ${hostname}:${number of processes} ifhn=${name of network interface}
{{{
# comment line
hosta
hostb:2
hostc ifhn=hostc-gige
hostd:4 ifhn=hostd-gige
}}}
 * 5.7.1 MPD in the PBS environment
   * One way to convert the node file to the MPD format is as follows:
{{{
sort $PBS_NODEFILE | uniq -c | awk '{ printf("%s:%s\n", $2, $1); }' > mpd.nodes
}}}
 * 6.1 MPD
   * '''mpdlistjobs''' lists the jobs that the mpd’s are running. Jobs are identified by the name of the mpd where they were submitted and a number.
   * '''mpdsigjob''' delivers a signal to the named job. Signals are specified by name or number.
{{{
tty1:
jazz@bio-cluster-12:~$ mpiexec -n 2 /home/jazz/demo1
This is machine 0 of 2 name = bio-cluster-12
This is machine 1 of 2 name = bio-cluster-08
=================================================================
tty2:
jazz@bio-cluster-12:~$ mpdlistjobs
jobid = 4@bio-cluster-12_48909
jobalias =
username = jazz
host = bio-cluster-12
pid = 3619
sid = 3618
rank = 0
pgm = /home/jazz/demo1

jobid = 4@bio-cluster-12_48909
jobalias =
username = jazz
host = bio-cluster-08
pid = 6187
sid = 6186
rank = 1
pgm = /home/jazz/demo1
}}}
 * 7.1 gdb via mpiexec
   * If gdb is not installed on your system, you may see the following error message:
{{{
jazz@bio-cluster-12:~$ mpiexec -gdb -n 2 /home/jazz/demo1
0: Traceback (most recent call last):
0:   File "/usr/bin/mpdgdbdrv.py", line 78, in ?
0:     write(gdb_sin_fileno,'handle SIGUSR1 nostop noprint\n')
0: OSError: [Errno 32] Broken pipe
1: Traceback (most recent call last):
1:   File "/usr/bin/mpdgdbdrv.py", line 75, in ?
1:     write(gdb_sin_fileno,'set confirm off\n')
1: OSError: [Errno 32] Broken pipe
}}}
   * After installing gdb, you can use it to debug your MPICH2 program.
{{{
jazz@bio-cluster-12:~$ sudo apt-get install gdb
jazz@bio-cluster-12:~$ mpiexec -gdb -n 2 /home/jazz/demo1
1: Traceback (most recent call last):
1:   File "/usr/bin/mpdgdbdrv.py", line 75, in ?
1:     write(gdb_sin_fileno,'set confirm off\n')
1: OSError: [Errno 32] Broken pipe
0: (gdb) l
0: 4     * History:
0: 5     * 2008-04-09 BETA
0: 6     * 2008-06-25 Added host name display
0: 7     */
0: 8
0: 9    #include
0: 10   #include
0: 11   #include "mpi.h"
0: 12   int main (int argc, char **argv)
0: 13   {
0: (gdb)
}}}
   * You can attach to a running job with the ''-gdba'' option, where <jobid> comes from '''mpdlistjobs'''.
{{{
mpiexec -gdba <jobid>
}}}

== Official [http://www.mcs.anl.gov/research/projects/mpich2/documentation/files/mpich2-doc-install.pdf Installer's Guide] ==

 * 4 Choosing the Communication Device
   * '''ch3:sock''' This is the default communication method. It uses sockets for all communications between processes.
   * '''ch3:ssm''' This method uses sockets between nodes and shared memory within a node.
   * '''ch3:shm''' This method only uses shared memory and only works within a single SMP. It does not support the MPI dynamic process routines such as MPI_Comm_spawn.
   * '''ch3:nemesis''' This method is our new, high performance method. It supports sockets, shared memory, and Myrinet-GM at present.
   * Most installations should use either the '''ch3:ssm''' or '''ch3:sock''' methods. If you need __'''multi-threaded MPI'''__, you must __'''use ch3:sock'''__ in this release. If you have __'''a cluster of SMPs and do not need a multi-threaded MPI'''__, then select __'''ch3:ssm'''__. If you are interested in trying out our highest-performance device, and need neither threads nor the dynamic process routines from MPI, then you can use ch3:nemesis.
 * 5.1.5 Running MPD on multi-homed systems
{{{
n1% mpd --ifhn=192.168.1.1 &
}}}
 * 5.1.6 Running MPD as Root
   * '''MPD can run as root to support multiple users simultaneously.'''
   * To use root’s ring, users must set an option named '''MPD_USE_ROOT_MPD'''.
{{{
bio-cluster-12:~# cat /etc/mpd.conf
MPD_USE_ROOT_MPD=1
secretword=this_is_password
}}}
 * 5.1.7 Running MPD on SMP’s
{{{
mpd --ncpus=2
}}}
 * 5.1.8 Security Issues in MPD
   * '''mpdboot''' uses '''ssh''' by default, although the less secure rsh can be used if the user chooses.
   * When a single new mpd joins a ring of existing mpd’s, the user must have the '''same secretword''' set in the ''.mpd.conf'' file on each machine. The ''.mpd.conf'' file must be readable only by the user starting the mpd; otherwise the mpd will refuse to read it (i.e. chmod 600 .mpd.conf).
   * How does '''mpiexec''' connect to the local '''mpd'''? This is done through a '''Unix socket''' in '''/tmp''' rather than through an '''INET socket'''.

== [http://www.clustertech.com/cpe-eval/doc/mpich2instguide.pdf MPICH2 Installation and Configuration Guide] ==
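
== Example sketch: reading and generating a machinefile ==

 * The ''machinefile'' format noted under User's Guide 5.3.2 and the PBS node file conversion under 5.7.1 can be illustrated with a small Python sketch. This is not code from MPICH2. The names `HostEntry`, `parse_machinefile` and `pbs_nodes_to_mpd` are hypothetical, and the sketch assumes that a host listed without `:N` counts as one process and that `#` starts a comment.

```python
from collections import Counter
from typing import List, NamedTuple, Optional


class HostEntry(NamedTuple):
    """One machinefile line (hypothetical helper type, not part of MPICH2)."""
    host: str
    nprocs: int = 1              # assumption: no ":N" suffix means one process
    ifhn: Optional[str] = None   # interface host name, if "ifhn=" was given


def parse_machinefile(text: str) -> List[HostEntry]:
    """Parse lines of the form 'host[:nprocs] [ifhn=name]'."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        host, _, count = fields[0].partition(":")
        ifhn = None
        for field in fields[1:]:
            if field.startswith("ifhn="):
                ifhn = field[len("ifhn="):]
        entries.append(HostEntry(host, int(count) if count else 1, ifhn))
    return entries


def pbs_nodes_to_mpd(nodefile_text: str) -> str:
    """Collapse repeated host names into 'host:count' lines, sorted by host.

    Mirrors the intent of the sort | uniq -c | awk pipeline on a PBS node file.
    """
    names = (name.strip() for name in nodefile_text.splitlines())
    counts = Counter(name for name in names if name)
    return "".join(f"{host}:{n}\n" for host, n in sorted(counts.items()))


if __name__ == "__main__":
    sample = (
        "# comment line\n"
        "hosta\n"
        "hostb:2\n"
        "hostc ifhn=hostc-gige\n"
        "hostd:4 ifhn=hostd-gige\n"
    )
    for entry in parse_machinefile(sample):
        print(entry)
    print(pbs_nodes_to_mpd("n2\nn1\nn2\n"), end="")
```

 * `parse_machinefile` returns one entry per host line, and `pbs_nodes_to_mpd` takes the contents of a PBS node file (one host name per allocated slot) and returns text that could be written to ''mpd.nodes''.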