Some Notes from MPICH2 Installation Guide

Official User's Guide

  • MPICH2 test examples - cpi
    • in Official User's Guide
      mpd &
      cd /home/you/mpich2-installed/examples
      mpiexec -n 3 cpi
      mpdallexit
      
    • in Debian environment
      jazz@bio-cluster-12:~$ mpd &
      [1] 3551
      jazz@bio-cluster-12:~$ mpiexec -n 3 /usr/share/mpich2/examples/cpi
      Process 1 of 3 is on bio-cluster-12
      Process 2 of 3 is on bio-cluster-12
      Process 0 of 3 is on bio-cluster-12
      pi is approximately 3.1415926544231318, Error is 0.0000000008333387
      wall clock time = 0.001213
      jazz@bio-cluster-12:~$ mpdallexit
      
  • Compiling and Linking
    • 4.3 Special Issues for C++
      • Some users may get error messages such as
        SEEK_SET is #defined but must not be for the C++ binding of MPI
        
      • The problem is that both stdio.h and the MPI C++ interface use SEEK_SET, SEEK_CUR, and SEEK_END. This is really a bug in the MPI-2 standard. You can try adding the following definition to the compile command line:
        -DMPICH_IGNORE_CXX_SEEK
        
  • 5.1 Standard mpiexec
    • Currently, MPICH2 does not fully support calling the dynamic process routines from MPI-2 (e.g., MPI_Comm_spawn or MPI_Comm_accept) from processes that are not started with mpiexec.
    • Note: this might be why we encounter the R-MPI mpi.spawn.Rslaves() problem.
  • 5.3.1 Basic mpiexec arguments for MPD
    • You can use mpiexec to run non-MPI programs as well. This is sometimes useful in making sure all the machines are up and ready for use. Useful examples include
      mpiexec -n 10 hostname
      mpiexec -n 10 printenv
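
As a rough sketch of the "are all machines up" check, the hostnames printed by mpiexec can be compared against a list of expected hosts. The helper name missing_hosts and the expected-hosts file are hypothetical, not part of MPICH2; the sample input stands in for real mpiexec output so the snippet runs without an mpd ring.

```shell
# Print every expected host that does not appear in the hostnames read on stdin.
# $1: file listing the expected hosts, one per line (hypothetical input file).
missing_hosts() {
    seen=$(mktemp)
    sort -u > "$seen"
    # -v invert match, -x whole-line match, -F fixed strings, -f patterns from file
    grep -vxF -f "$seen" "$1" || true
    rm -f "$seen"
}

expected=$(mktemp)
printf '%s\n' bio-cluster-08 bio-cluster-12 > "$expected"
# Stand-in for: mpiexec -n 10 hostname
printf '%s\n' bio-cluster-12 bio-cluster-12 | missing_hosts "$expected"
rm -f "$expected"
```

With the sample input above this prints bio-cluster-08, the host that never answered. On a live ring the input would come from `mpiexec -n 10 hostname | missing_hosts hosts.txt`, where hosts.txt is an assumed file name.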
      
  • 5.3.2 Other Command-Line Arguments to mpiexec for MPD
    • This section describes the machinefile format for specifying hosts, the number of processes per host, and the interface hostname (ifhn).
    • i.e., ${hostname}:${number of processes} ifhn=${interface hostname}
      # comment line
      hosta
      hostb:2
      hostc   ifhn=hostc-gige
      hostd:4 ifhn=hostd-gige
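
To make the format concrete, here is a small sketch that splits each machinefile entry into host, process count, and ifhn. The function name parse_machinefile is hypothetical. It assumes that an entry without ":n" counts as one process, and it prints "-" when no ifhn is given.

```shell
# Print "host nprocs ifhn" for each entry of an MPD machinefile read on stdin.
# Assumptions: no ":n" means 1 process; a missing ifhn is shown as "-".
parse_machinefile() {
    awk '
        NF == 0 || $1 ~ /^#/ { next }    # skip blank lines and comment lines
        {
            n = split($1, hp, ":")       # hp[1] = host, hp[2] = process count
            nprocs = (n > 1) ? hp[2] : 1
            ifhn = "-"
            for (i = 2; i <= NF; i++)
                if ($i ~ /^ifhn=/) ifhn = substr($i, 6)
            printf("%s %s %s\n", hp[1], nprocs, ifhn)
        }'
}

# The sample machinefile from the notes above
printf '%s\n' '# comment line' 'hosta' 'hostb:2' \
    'hostc   ifhn=hostc-gige' 'hostd:4 ifhn=hostd-gige' | parse_machinefile
```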
      
  • 5.7.1 MPD in the PBS environment
    • One way to convert the node file to the MPD format is as follows:
      sort "$PBS_NODEFILE" | uniq -c | awk '{ printf("%s:%s\n", $2, $1); }' > mpd.nodes
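
The conversion can be tried without a PBS job by feeding the same pipeline a sample node list. This is a sketch only: the function name pbs_to_mpd and the node names are made up, and it assumes the PBS node file lists one hostname per allocated slot.

```shell
# Convert a PBS-style node list on stdin (one hostname per allocated slot)
# into MPD "host:count" lines.
pbs_to_mpd() {
    sort | uniq -c | awk '{ printf("%s:%s\n", $2, $1) }'
}

# Stand-in for the contents of "$PBS_NODEFILE"
printf '%s\n' node02 node01 node02 node01 node01 | pbs_to_mpd
```

For the sample list this prints node01:3 and node02:2, one entry per line. Inside a real job the call would be `pbs_to_mpd < "$PBS_NODEFILE" > mpd.nodes`.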
      
  • 6.1 MPD
    • mpdlistjobs lists the jobs that the mpd’s are running. Jobs are identified by the name of the mpd where they were submitted and a number.
    • mpdsigjob delivers a signal to the named job. Signals are specified by name or number.
      tty1: 
      
      jazz@bio-cluster-12:~$ mpiexec -n 2 /home/jazz/demo1
      This is machine 0 of 2  name = bio-cluster-12
      This is machine 1 of 2  name = bio-cluster-08
      
      =================================================================
      tty2: 
      
      jazz@bio-cluster-12:~$ mpdlistjobs
      jobid    = 4@bio-cluster-12_48909
      jobalias =
      username = jazz
      host     = bio-cluster-12
      pid      = 3619
      sid      = 3618
      rank     = 0
      pgm      = /home/jazz/demo1
      
      jobid    = 4@bio-cluster-12_48909
      jobalias =
      username = jazz
      host     = bio-cluster-08
      pid      = 6187
      sid      = 6186
      rank     = 1
      pgm      = /home/jazz/demo1
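
Since mpdlistjobs prints one record per rank, the same jobid appears several times. A minimal sketch for pulling out the distinct jobids, for example to pass one to mpdsigjob, might look like this. The function name list_jobids is hypothetical, and the sample input copies the field layout shown above.

```shell
# Print each distinct jobid found in mpdlistjobs-style output read on stdin.
list_jobids() {
    # Split each line on "=" plus any following spaces; keep first sighting only.
    awk -F'= *' '/^ *jobid/ { if (!seen[$2]++) print $2 }'
}

# Stand-in for: mpdlistjobs
printf '%s\n' \
    'jobid    = 4@bio-cluster-12_48909' 'rank     = 0' '' \
    'jobid    = 4@bio-cluster-12_48909' 'rank     = 1' | list_jobids
```

For the sample records this prints 4@bio-cluster-12_48909 once. On a live ring the input would be `mpdlistjobs | list_jobids`.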
      
  • 7.1 gdb via mpiexec
    • If gdb is not installed on your system, you may see the following error message:
      jazz@bio-cluster-12:~$ mpiexec -gdb -n 2 /home/jazz/demo1
      0: Traceback (most recent call last):
      0:   File "/usr/bin/mpdgdbdrv.py", line 78, in ?
      0:     write(gdb_sin_fileno,'handle SIGUSR1 nostop noprint\n')
      0: OSError: [Errno 32] Broken pipe
      1: Traceback (most recent call last):
      1:   File "/usr/bin/mpdgdbdrv.py", line 75, in ?
      1:     write(gdb_sin_fileno,'set confirm off\n')
      1: OSError: [Errno 32] Broken pipe
      
    • After installing gdb, you can use it to debug your MPICH2 program.
      jazz@bio-cluster-12:~$ sudo apt-get install gdb
      jazz@bio-cluster-12:~$ mpiexec -gdb -n 2 /home/jazz/demo1
      1: Traceback (most recent call last):
      1:   File "/usr/bin/mpdgdbdrv.py", line 75, in ?
      1:     write(gdb_sin_fileno,'set confirm off\n')
      1: OSError: [Errno 32] Broken pipe
      0:  (gdb) l
      0:  4    * History:
      0:  5    *   2008-04-09 BETA
      0:  6    *   2008-06-25 Added hostname display feature
      0:  7   */
      0:  8
      0:  9   #include <stdio.h>
      0:  10  #include <unistd.h>
      0:  11  #include "mpi.h"
      0:  12  int main (int argc, char **argv)
      0:  13  {
      0:  (gdb)
      
    • You can attach to a running job with the -gdba option, where <jobid> comes from mpdlistjobs.
      mpiexec -gdba <jobid>
      

Official Installer's Guide

  • 4 Choosing the Communication Device
    • ch3:sock This is the default communication method. It uses sockets for all communications between processes.
    • ch3:ssm This method uses sockets between nodes and shared memory within a node.
    • ch3:shm This method only uses shared memory and only works within a single SMP. It does not support the MPI dynamic process routines such as MPI_Comm_spawn.
    • ch3:nemesis This method is our new, high performance method. It supports sockets, shared memory, and Myrinet-GM at present.
    • Most installations should use either the ch3:ssm or ch3:sock methods. If you need multi-threaded MPI, you must use ch3:sock in this release. If you have a cluster of SMPs and do not need a multi-threaded MPI, then select ch3:ssm. If you are interested in trying out our highest-performance device, and need neither threads nor the dynamic process routines from MPI, then you can use ch3:nemesis.
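
The selection rule in the last bullet can be written down as a small sketch. The function name choose_device and its yes/no arguments are hypothetical. ch3:nemesis is left out on purpose, since the guide treats it as an opt-in choice for users who need neither threads nor the dynamic process routines.

```shell
# choose_device NEED_THREADS SMP_CLUSTER   (each argument is "yes" or "no")
# Prints the communication device suggested by the rule of thumb above.
choose_device() {
    if [ "$1" = yes ]; then
        echo ch3:sock    # multi-threaded MPI requires ch3:sock in this release
    elif [ "$2" = yes ]; then
        echo ch3:ssm     # cluster of SMPs without multi-threaded MPI
    else
        echo ch3:sock    # otherwise stay with the default method
    fi
}

choose_device no yes
```

The call above prints ch3:ssm. The result would typically be handed to configure, e.g. `./configure --with-device="$(choose_device no yes)"`; that option name is not in these notes, so check it against the Installer's Guide for your release.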
  • 5.1.5 Running MPD on multi-homed systems
    n1% mpd --ifhn=192.168.1.1 &
    
  • 5.1.6 Running MPD as Root
    • MPD can run as root to support multiple users simultaneously.
    • To use root’s ring, users must set an option named MPD_USE_ROOT_MPD.
      bio-cluster-12:~# cat /etc/mpd.conf
      MPD_USE_ROOT_MPD=1
      secretword=this_is_password
      
  • 5.1.7 Running MPD on SMP’s
    mpd --ncpus=2
    
  • 5.1.8 Security Issues in MPD
    • mpdboot uses ssh by default, although the less secure rsh can be used if the user chooses.
    • When a single new mpd joins a ring of existing mpd’s, the user must have the same secretword set in the .mpd.conf file on each machine. The .mpd.conf file must be readable only by the user starting the mpd; otherwise the mpd will refuse to read it (i.e., chmod 600 .mpd.conf).
    • How does mpiexec connect to the local mpd? It uses a Unix socket in /tmp rather than an INET socket.
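
The permission requirement on .mpd.conf can be met at creation time rather than fixed afterwards. The following is a sketch under assumptions: the function name write_mpd_conf is hypothetical, the secret is the placeholder used earlier in these notes, and the demo writes to a scratch directory instead of the real ~/.mpd.conf.

```shell
# write_mpd_conf FILE SECRET
# Create FILE holding the secretword, readable and writable only by its owner.
write_mpd_conf() {
    # The restrictive umask applies only inside the subshell.
    ( umask 077 && printf 'secretword=%s\n' "$2" > "$1" )
    chmod 600 "$1"    # also tightens a file that already existed
}

conf=$(mktemp -d)/.mpd.conf    # scratch location for the demo only
write_mpd_conf "$conf" this_is_password
ls -l "$conf" | cut -c1-10     # permission string of the new file
```

The last line should show -rw-------, which is the mode that chmod 600 produces.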

MPICH2 Installation and Configuration Guide

  • If the key pair is passphrase-protected (a passphrase was supplied during key pair generation), the user should start an SSH agent and give it the passphrase:
    eval `ssh-agent`
    ssh-add
    

Other Reference

Last modified on Jul 2, 2008, 11:20:39 PM