Some Notes from MPICH2 Installation Guide
Official User's Guide
- MPICH2 test examples - cpi
- in the Official User's Guide
mpd &
cd /home/you/mpich2-installed/examples
mpiexec -n 3 cpi
mpdallexit
- in a Debian environment
jazz@bio-cluster-12:~$ mpd &
[1] 3551
jazz@bio-cluster-12:~$ mpiexec -n 3 /usr/share/mpich2/examples/cpi
Process 1 of 3 is on bio-cluster-12
Process 2 of 3 is on bio-cluster-12
Process 0 of 3 is on bio-cluster-12
pi is approximately 3.1415926544231318, Error is 0.0000000008333387
wall clock time = 0.001213
jazz@bio-cluster-12:~$ mpdallexit
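- For reference, a minimal sketch of what a cpi-style program does (not the bundled source; the file name pi_sketch.c is hypothetical): each rank integrates part of 4/(1+x^2) over [0,1] and rank 0 collects the partial sums.
/* pi_sketch.c -- build: mpicc -o pi_sketch pi_sketch.c
 * run:   mpiexec -n 3 ./pi_sketch */
#include <stdio.h>
#include "mpi.h"

int main(int argc, char **argv)
{
    int rank, size, i, n = 10000;
    double h, x, sum = 0.0, mypi, pi;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* midpoint rule on 4/(1+x^2); each rank takes every size-th interval */
    h = 1.0 / (double) n;
    for (i = rank; i < n; i += size) {
        x = h * ((double) i + 0.5);
        sum += 4.0 / (1.0 + x * x);
    }
    mypi = h * sum;

    /* rank 0 sums the partial results and prints the estimate */
    MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("pi is approximately %.16f\n", pi);

    MPI_Finalize();
    return 0;
}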
- Compiling and Linking
- 4.3 Special Issues for C++
- Some users may get error messages such as
SEEK_SET is #defined but must not be for the C++ binding of MPI
- The problem is that both stdio.h and the MPI C++ interface use SEEK_SET, SEEK_CUR, and SEEK_END. This is really a bug in the MPI-2 standard. You can try adding the following definition to the command line:
-DMPICH_IGNORE_CXX_SEEK
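- For example, when building a C++ MPI program with the mpicxx wrapper (the file name myapp.cpp is just a placeholder):
mpicxx -DMPICH_IGNORE_CXX_SEEK -o myapp myapp.cpp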
- 5.1 Standard mpiexec
- Currently, MPICH2 does not fully support calling the dynamic process routines from MPI-2 (e.g., MPI_Comm_spawn or MPI_Comm_accept) from processes that are not started with mpiexec.
- Note: this might be why we encounter the Rmpi mpi.spawn.Rslaves() problem.
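- For context, a minimal sketch of one of the dynamic process routines in question (a hedged illustration; the ./worker program and file name are hypothetical):
/* spawn_sketch.c -- illustrates MPI_Comm_spawn, one of the MPI-2
 * dynamic process routines mentioned above. */
#include <stdio.h>
#include "mpi.h"

int main(int argc, char **argv)
{
    MPI_Comm intercomm;
    int errcodes[2];

    MPI_Init(&argc, &argv);

    /* rank 0 asks the process manager to start 2 copies of ./worker;
       this is the kind of call that requires the parent itself to be
       started with mpiexec */
    MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 2, MPI_INFO_NULL,
                   0, MPI_COMM_WORLD, &intercomm, errcodes);

    MPI_Finalize();
    return 0;
}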
- 5.3.1 Basic mpiexec arguments for MPD
- You can use mpiexec to run non-MPI programs as well. This is sometimes useful in making sure all the machines are up and ready for use. Useful examples include
mpiexec -n 10 hostname
mpiexec -n 10 printenv
- 5.3.2 Other Command-Line Arguments to mpiexec for MPD
- This section describes the machinefile format for specifying hosts, the number of processes to run on each host, and the interface host name (ifhn),
i.e., ${hostname}:${number of processes} ifhn=${name of network interface}
# comment line
hosta
hostb:2
hostc ifhn=hostc-gige
hostd:4 ifhn=hostd-gige
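- Such a machinefile is passed to mpiexec with the -machinefile option, e.g.:
mpiexec -machinefile mf -n 3 cpi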
- 5.7.1 MPD in the PBS environment
- One way to convert the node file to the MPD format is as follows:
sort $PBS_NODEFILE | uniq -c | awk '{ printf("%s:%s\n", $2, $1); }' > mpd.nodes
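- For example, if a (hypothetical) $PBS_NODEFILE lists node1 twice and node2 once, mpd.nodes will contain:
node1:2
node2:1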
- 6.1 MPD
- mpdlistjobs lists the jobs that the mpd’s are running. Jobs are identified by the name of the mpd where they were submitted and a number.
- mpdsigjob delivers a signal to the named job. Signals are specified by name or number.
tty1:
jazz@bio-cluster-12:~$ mpiexec -n 2 /home/jazz/demo1
This is machine 0 of 2 name = bio-cluster-12
This is machine 1 of 2 name = bio-cluster-08
=================================================================
tty2:
jazz@bio-cluster-12:~$ mpdlistjobs
jobid    = 4@bio-cluster-12_48909
jobalias =
username = jazz
host     = bio-cluster-12
pid      = 3619
sid      = 3618
rank     = 0
pgm      = /home/jazz/demo1
jobid    = 4@bio-cluster-12_48909
jobalias =
username = jazz
host     = bio-cluster-08
pid      = 6187
sid      = 6186
rank     = 1
pgm      = /home/jazz/demo1
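- For reference, a minimal sketch of a demo1-style program that produces output like the above (a reconstruction, not the original source):
/* demo_sketch.c -- prints rank, size, and host name, similar to the
 * demo1 output shown above. */
#include <stdio.h>
#include "mpi.h"

int main(int argc, char **argv)
{
    int rank, size, len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(name, &len);

    printf("This is machine %d of %d name = %s\n", rank, size, name);

    MPI_Finalize();
    return 0;
}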
- 7.1 gdb via mpiexec
- If gdb is not installed on your system, you might get the following error message:
jazz@bio-cluster-12:~$ mpiexec -gdb -n 2 /home/jazz/demo1
0: Traceback (most recent call last):
0:   File "/usr/bin/mpdgdbdrv.py", line 78, in ?
0:     write(gdb_sin_fileno,'handle SIGUSR1 nostop noprint\n')
0: OSError: [Errno 32] Broken pipe
1: Traceback (most recent call last):
1:   File "/usr/bin/mpdgdbdrv.py", line 75, in ?
1:     write(gdb_sin_fileno,'set confirm off\n')
1: OSError: [Errno 32] Broken pipe
- After installing gdb, you can use it to debug your MPICH2 program.
jazz@bio-cluster-12:~$ sudo apt-get install gdb
jazz@bio-cluster-12:~$ mpiexec -gdb -n 2 /home/jazz/demo1
1: Traceback (most recent call last):
1:   File "/usr/bin/mpdgdbdrv.py", line 75, in ?
1:     write(gdb_sin_fileno,'set confirm off\n')
1: OSError: [Errno 32] Broken pipe
0: (gdb) l
0: 4      * History:
0: 5      *   2008-04-09 BETA
0: 6      *   2008-06-25 added hostname display feature
0: 7      */
0: 8
0: 9      #include <stdio.h>
0: 10     #include <unistd.h>
0: 11     #include "mpi.h"
0: 12     int main (int argc, char **argv)
0: 13     {
0: (gdb)
- You can attach to a running job with the -gdba option, where <jobid> comes from mpdlistjobs:
mpiexec -gdba <jobid>
Official Installer's Guide
- 4 Choosing the Communication Device
- ch3:sock This is the default communication method. It uses sockets for all communications between processes.
- ch3:ssm This method uses sockets between nodes and shared memory within a node.
- ch3:shm This method only uses shared memory and only works within a single SMP. It does not support the MPI dynamic process routines such as MPI_Comm_spawn.
- ch3:nemesis This method is our new, high performance method. It supports sockets, shared memory, and Myrinet-GM at present.
- Most installations should use either the ch3:ssm or ch3:sock methods. If you need multi-threaded MPI, you must use ch3:sock in this release. If you have a cluster of SMPs and do not need a multi-threaded MPI, then select ch3:ssm. If you are interested in trying out our highest-performance device, and need neither threads nor the dynamic process routines from MPI, then you can use ch3:nemesis.
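- The communication device is chosen when configuring the MPICH2 source, e.g. (a sketch; see the Installer's Guide for the full option list):
./configure --with-device=ch3:ssm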
- 5.1.5 Running MPD on multi-homed systems
n1% mpd --ifhn=192.168.1.1 &
- 5.1.6 Running MPD as Root
- MPD can run as root to support multiple users simultaneously.
- To use root's ring, users must set an option named MPD_USE_ROOT_MPD in the configuration file:
bio-cluster-12:~# cat /etc/mpd.conf
MPD_USE_ROOT_MPD=1
secretword=this_is_password
- 5.1.7 Running MPD on SMP’s
mpd --ncpus=2
- 5.1.8 Security Issues in MPD
- mpdboot uses ssh by default, although the less secure rsh can be used if the user chooses.
- When a single new mpd joins a ring of existing mpds, the user must have the same secretword set in the .mpd.conf file on each machine. The .mpd.conf file must be readable only by the user starting the mpd; otherwise the mpd will refuse to read it (i.e., chmod 600 .mpd.conf).
- How does mpiexec connect to the local mpd? Through a Unix domain socket in /tmp rather than through an INET socket.
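- For example, with an mpd running for user jazz, that console socket typically shows up as something like /tmp/mpd2.console_jazz (the exact name may vary between MPICH2 versions).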