How to run R-MPI on multiple machines with normal user permissions
- Reference Configuration Guide:
Configure mpd for a normal user
- First, log in as a normal user. Here we log in with user id 'jazz'. Then generate an SSH key pair and copy the public key to each compute node.
    login as: jazz
    jazz@bio-cluster-12's password:
    jazz@bio-cluster-12:~$ ssh-keygen -t rsa
    Generating public/private rsa key pair.
    Enter file in which to save the key (/home/jazz/.ssh/id_rsa):
    Enter passphrase (empty for no passphrase):
    Enter same passphrase again:
    Your identification has been saved in /home/jazz/.ssh/id_rsa.
    Your public key has been saved in /home/jazz/.ssh/id_rsa.pub.
    The key fingerprint is:
    XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX jazz@bio-cluster-12
    jazz@bio-cluster-12:~$ for i in 11 10 09 08 07 06; do scp .ssh/id_rsa.pub bio-cluster-$i:.ssh/authorized_keys; done
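The scp loop above replaces any authorized_keys file that already exists on a node. If that is a concern, the key can be appended instead. The following is a minimal sketch, not part of the original procedure: append_key is a hypothetical helper name, and the demonstration only touches scratch files in a temporary directory.

```sh
# Hypothetical helper: append a public key to an authorized_keys file
# only if that exact line is not already present.
append_key() {
    pubkey_file=$1
    auth_file=$2
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    chmod 600 "$auth_file"
    # -x: match the whole line, -F: fixed string, -q: no output
    if ! grep -qxF -- "$(cat "$pubkey_file")" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
}

# Local demonstration on scratch files (no network access involved).
demo_dir=$(mktemp -d)
echo "ssh-rsa AAAAexample jazz@bio-cluster-12" > "$demo_dir/id_rsa.pub"
echo "ssh-rsa AAAAother someone@elsewhere" > "$demo_dir/authorized_keys"
append_key "$demo_dir/id_rsa.pub" "$demo_dir/authorized_keys"
append_key "$demo_dir/id_rsa.pub" "$demo_dir/authorized_keys"   # second call changes nothing
cat "$demo_dir/authorized_keys"
```

On systems where OpenSSH ships the ssh-copy-id script, running it once per node is another way to get the same append behaviour.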
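- Set up .mpd.conf in $HOME on each compute node. mpd requires this file to be readable only by its owner, hence the chmod 600. Note that ${user} (lowercase) is normally unset in bash, so the secret word written here consists only of the shell PID ($$), which matches the 21-byte file size shown by scp. Use ${USER} if the user name should be part of the secret word.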
    jazz@bio-cluster-12:~$ echo "MPD_SECRETWORD=${user}$$" > ~/.mpd.conf
    jazz@bio-cluster-12:~$ chmod 600 .mpd.conf
    jazz@bio-cluster-12:~$ for i in 11 10 09 08 07 06
    > do
    >   scp .mpd.conf bio-cluster-$i:.
    > done
    .mpd.conf     100%   21     0.0KB/s   00:00
    .mpd.conf     100%   21     0.0KB/s   00:00
    .mpd.conf     100%   21     0.0KB/s   00:00
    .mpd.conf     100%   21     0.0KB/s   00:00
    .mpd.conf     100%   21     0.0KB/s   00:00
    .mpd.conf     100%   21     0.0KB/s   00:00
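The two steps above (write the secret word, then restrict the permissions) can be wrapped so that the file is never world-readable, not even briefly. This is a sketch under assumptions: write_mpd_conf is a hypothetical helper name, and the demonstration writes to a temporary directory rather than to $HOME.

```sh
# Hypothetical helper: write an mpd configuration file with a secret word
# and owner-only permissions.
write_mpd_conf() {
    conf_file=$1
    secret=$2
    (
        umask 077                      # a newly created file gets rw-------
        printf 'MPD_SECRETWORD=%s\n' "$secret" > "$conf_file"
    )
    chmod 600 "$conf_file"             # also tightens a file that already existed
}

demo_dir=$(mktemp -d)
# ${USER:-nobody} falls back to a placeholder when USER is unset.
write_mpd_conf "$demo_dir/.mpd.conf" "${USER:-nobody}$$"
ls -l "$demo_dir/.mpd.conf"
```

For real use the first argument would be "$HOME/.mpd.conf", and the file would still have to be copied to every node as shown above.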
- Create mpd.hosts on the local machine, listing the six other compute nodes
    jazz@bio-cluster-12:~$ cat > mpd.hosts << EOF
    > bio-cluster-11
    > bio-cluster-10
    > bio-cluster-09
    > bio-cluster-08
    > bio-cluster-07
    > bio-cluster-06
    > EOF
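Since the same list of node numbers is typed again in every scp loop, it may be convenient to keep it in one variable and generate the host names from it. A small sketch, where make_hosts and NODE_IDS are hypothetical names introduced only for illustration:

```sh
# Hypothetical helper: print one host name per node id.
make_hosts() {
    prefix=$1
    shift
    for id in "$@"; do
        printf '%s%s\n' "$prefix" "$id"
    done
}

NODE_IDS="11 10 09 08 07 06"
# NODE_IDS is deliberately left unquoted so that it splits into separate ids.
make_hosts bio-cluster- $NODE_IDS
```

Redirecting the output (make_hosts bio-cluster- $NODE_IDS > mpd.hosts) would produce the same file as the here-document above.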
- Run mpdboot to start 7 mpd daemons (the local machine plus the 6 hosts in mpd.hosts), then use mpdtrace to check the mpd process on each compute node. mpdringtest measures how long a message takes to travel around the mpd ring, and mpdallexit terminates all mpd processes.
    jazz@bio-cluster-12:~$ mpdboot -n 7
    jazz@bio-cluster-12:~$ mpdtrace -l
    bio-cluster-12_54092 (10.220.202.219)
    bio-cluster-08_38361 (10.220.202.223)
    bio-cluster-09_52923 (10.220.202.222)
    bio-cluster-11_33377 (10.220.202.220)
    bio-cluster-10_33103 (10.220.202.221)
    bio-cluster-06_59631 (10.220.202.225)
    bio-cluster-07_59533 (10.220.202.224)
    jazz@bio-cluster-12:~$ mpdringtest 100
    time for 100 loops = 0.0729811191559 seconds
    jazz@bio-cluster-12:~$ mpdallexit
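The value 7 passed to mpdboot -n is the number of entries in mpd.hosts plus one for the local machine. The sketch below derives that number from the file instead of hard-coding it. count_nodes is a hypothetical helper, and skipping blank lines and lines starting with '#' is an assumption about how the hosts file is written.

```sh
# Hypothetical helper: number of mpd daemons = remote hosts + the local machine.
count_nodes() {
    hosts_file=$1
    # Count lines that are neither blank nor comments starting with '#'.
    remote=$(grep -c -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$hosts_file")
    echo $((remote + 1))
}

# Demonstration with a scratch copy of the six-host list used above.
demo_hosts=$(mktemp)
printf 'bio-cluster-%s\n' 11 10 09 08 07 06 > "$demo_hosts"
count_nodes "$demo_hosts"    # prints 7
```

It could then be used as mpdboot -n "$(count_nodes mpd.hosts)".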
Test 1: single mpd and R-MPI on localhost
- Run mpd on localhost
    jazz@bio-cluster-12:~$ mpd &
    [1] 1505
    jazz@bio-cluster-12:~$ mpdtrace -l
    bio-cluster-12_37007 (10.220.202.219)
- Alternatively, you can use mpdboot to start mpd on localhost
    jazz@bio-cluster-12:~$ mpdboot
    [1]+ Done mpd
    jazz@bio-cluster-12:~$ mpdtrace -l
    bio-cluster-12_44810 (10.220.202.219)
- Run R-MPI with a single mpd on localhost
    jazz@bio-cluster-12:~$ R

    R version 2.4.0 Patched (2006-11-25 r39997)
    Copyright (C) 2006 The R Foundation for Statistical Computing
    ISBN 3-900051-07-0

    > library(Rmpi)
    > mpi.spawn.Rslaves()
            1 slaves are spawned successfully. 0 failed.
    master (rank 0, comm 1) of size 2 is running on: bio-cluster-12
    slave1 (rank 1, comm 1) of size 2 is running on: bio-cluster-12
    > mpi.close.Rslaves()
    mpi.close.Rslaves()
    [1] 1
    > mpi.quit(save="no")
    mpi.quit(save="no")
    jazz@bio-cluster-12:~$
Test 2: Compile an MPICH2 sample program and run it on multiple compute nodes
- Sample program from Wade.
    jazz@bio-cluster-12:~$ cat > demo1.c << EOF
    > /* Program:
    >  *   Each node prints its own id, the total number of nodes taking
    >  *   part in the computation, and its own host name.
    >  * History:
    >  *   2008-04-09 BETA
    >  *   2008-06-25 Added host name output
    >  */
    >
    > #include <stdio.h>
    > #include "mpi.h"
    > int main (int argc, char **argv)
    > {
    >   int myid, numprocs, len;
    >   char name[MPI_MAX_PROCESSOR_NAME];
    >   MPI_Init(&argc, &argv);
    >
    >   /* Get the total number of nodes */
    >   MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    >
    >   /* Get this node's id / rank */
    >   MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    >
    >   /* Get this node's host name */
    >   MPI_Get_processor_name(name, &len);
    >   printf("This is machine %d of %d name = %s\n", myid, numprocs, name);
    >   MPI_Finalize();
    >   return 0;
    > }
    > EOF
    jazz@bio-cluster-12:~$ mpicc -I /usr/include/mpich2/ -lmpich demo1.c -o demo1
    jazz@bio-cluster-12:~$ for i in 11 10 09 08 07 06
    > do
    >   scp demo1 bio-cluster-$i:.
    > done
    demo1     100%  557KB 557.3KB/s   00:00
    demo1     100%  557KB 557.3KB/s   00:00
    demo1     100%  557KB 557.3KB/s   00:00
    demo1     100%  557KB 557.3KB/s   00:00
    demo1     100%  557KB 557.3KB/s   00:00
    demo1     100%  557KB 557.3KB/s   00:00
    jazz@bio-cluster-12:~$ mpdboot -n 7
    jazz@bio-cluster-12:~$ mpdtrace -l
    bio-cluster-12_41632 (10.220.202.219)
    bio-cluster-08_33197 (10.220.202.223)
    bio-cluster-09_40371 (10.220.202.222)
    bio-cluster-10_54199 (10.220.202.221)
    bio-cluster-06_54334 (10.220.202.225)
    bio-cluster-07_42302 (10.220.202.224)
    bio-cluster-11_36534 (10.220.202.220)
    jazz@bio-cluster-12:~$ mpiexec -n 7 /home/jazz/demo1
    This is machine 0 of 7 name = bio-cluster-12
    This is machine 1 of 7 name = bio-cluster-08
    This is machine 2 of 7 name = bio-cluster-09
    This is machine 3 of 7 name = bio-cluster-10
    This is machine 5 of 7 name = bio-cluster-07
    This is machine 6 of 7 name = bio-cluster-11
    This is machine 4 of 7 name = bio-cluster-06
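The output lines arrive in arbitrary order (here rank 4 is printed last), so it is easy to overlook a missing rank. The sketch below checks that every rank from 0 to N-1 reported exactly once. check_ranks is a hypothetical helper that assumes the exact "This is machine R of N" line format printed by demo1.

```sh
# Hypothetical helper: read demo1-style output on stdin and verify that
# ranks 0..N-1 each appear exactly once. Returns non-zero otherwise.
check_ranks() {
    expected=$1
    awk -v n="$expected" '
        BEGIN { bad = 0 }
        /^This is machine [0-9]+ of [0-9]+/ { seen[$4]++ }
        END {
            for (r = 0; r < n; r++)
                if (seen[r] != 1) {
                    print "rank " r " seen " (seen[r] + 0) " times"
                    bad = 1
                }
            exit bad
        }'
}

# Demonstration with three out-of-order lines instead of a live mpiexec run.
printf 'This is machine %s of 3 name = demo-host\n' 0 2 1 |
    check_ranks 3 && echo "all ranks reported"
```

With a running mpd ring the same check would be mpiexec -n 7 /home/jazz/demo1 | check_ranks 7.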
Test 3: run multiple mpd and R-MPI on multiple compute nodes
- Note: mpdboot uses rsh as its default communication channel. On Debian 4.0, /usr/bin/rsh is a symbolic link (via /etc/alternatives) to /usr/bin/ssh, so the connections actually go over SSH.
    jazz@bio-cluster-12:~$ which rsh
    /usr/bin/rsh
    jazz@bio-cluster-12:~$ ls -al /usr/bin/rsh
    lrwxrwxrwx 1 root root 21 2008-04-09 20:52 /usr/bin/rsh -> /etc/alternatives/rsh
    jazz@bio-cluster-12:~$ ls -al /etc/alternatives/rsh
    lrwxrwxrwx 1 root root 12 2008-05-29 19:04 /etc/alternatives/rsh -> /usr/bin/ssh
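The two ls calls above follow the symlink chain by hand. On systems with GNU coreutils the whole chain can be resolved in one step. This is a sketch, with resolve_cmd as a hypothetical helper name. It assumes a readlink that supports -f.

```sh
# Hypothetical helper: print the file a command name finally resolves to.
resolve_cmd() {
    cmd_path=$(command -v "$1") || return 1
    # readlink -f follows every symlink in the chain (GNU coreutils behaviour).
    readlink -f "$cmd_path"
}

# Report where rsh points, if it is installed at all.
resolve_cmd rsh || echo "rsh not found in PATH"
```

On the Debian 4.0 machine shown above this would be expected to end at /usr/bin/ssh, given the links listed by ls.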
Last modified on Jul 2, 2008, 11:50:04 AM