Version 7 (modified by chris, 17 years ago)
Install mpich2
1. Testbed Introduction
- VMware Workstation 5.0
- Three nodes, client-01 through client-03, each with a 5 GB hard disk and 128 MB of memory.
- All three nodes run Ubuntu 7.10 Server.
- Hostnames and IP addresses:
- client-01 192.168.180.131
- client-02 192.168.180.132
- client-03 192.168.180.133
2. Getting Started: Installing mpich2
- Step 1 : Modify /etc/hosts
- Our testbed has 3 machines, so /etc/hosts must be edited on each node.
Here is the example for client-01; do the same on the other two nodes.
  root@client-01:~# cat > /etc/hosts << "EOF"
  > 127.0.0.1 localhost
  > 192.168.180.131 client-01
  > 192.168.180.132 client-02
  > 192.168.180.133 client-03
  > EOF
- Check the contents of /etc/hosts. They should look like this:
  root@client-01:~# cat /etc/hosts
  127.0.0.1 localhost
  192.168.180.131 client-01
  192.168.180.132 client-02
  192.168.180.133 client-03
- Note that if /etc/hosts contains a line mapping 127.0.0.1 to client-01, that line must be deleted; otherwise the hostname resolves to the loopback address and the other nodes cannot reach this node.
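With more than a few nodes, typing the same /etc/hosts entries on every machine invites mistakes. The following is a minimal sketch, assuming the nodes are numbered consecutively and their addresses increase by one, as in this testbed. The helper names gen_hosts and has_loopback_alias are hypothetical and not part of mpich2.

```shell
#!/bin/sh
# Sketch: print /etc/hosts entries for consecutively numbered nodes.
# gen_hosts is a hypothetical helper, not part of mpich2.
# Usage: gen_hosts PREFIX NETWORK FIRST_OCTET COUNT
gen_hosts() {
    prefix=$1; network=$2; first=$3; count=$4
    echo "127.0.0.1 localhost"
    i=1
    while [ "$i" -le "$count" ]; do
        # node 1 gets FIRST_OCTET, node 2 gets FIRST_OCTET+1, and so on
        printf '%s.%d %s-%02d\n' "$network" "$((first + i - 1))" "$prefix" "$i"
        i=$((i + 1))
    done
}

# Succeed if the hosts text on stdin maps a loopback address (127.x.x.x)
# to NAME, which is the situation Step 1 says must be removed.
has_loopback_alias() {
    awk -v name="$1" '
        $1 ~ /^127\./ { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit found ? 0 : 1 }'
}

# Print the testbed entries to stdout; redirect to /etc/hosts only after review.
gen_hosts client 192.168.180 131 3
```

The script only prints the entries. On a real node, something like `has_loopback_alias client-01 < /etc/hosts` could be used to detect the problematic loopback line before starting mpd.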
- Step 2 : Download and Install mpich2
- We use mpich2-1.0.7rc1 as the example.
  root@client-01:~# wget http://www.mcs.anl.gov/research/projects/mpich2/downloads/tarballs/mpich2-1.0.7rc1.tar.gz
  root@client-01:~# tar -zxvf mpich2-1.0.7rc1.tar.gz
  root@client-01:~# cd mpich2-1.0.7rc1/
- You may choose any install location, but it must be the same on all nodes.
We use /opt/mpich2 here as the example.
  root@client-01:~/mpich2-1.0.7rc1# ./configure --prefix=/opt/mpich2
- If configure reports errors, some packages or libraries, such as a C/C++ compiler, may be missing.
Try this:
  root@client-01:~/mpich2-1.0.7rc1# apt-get install build-essential
- Compile and install.
  root@client-01:~/mpich2-1.0.7rc1# make
  root@client-01:~/mpich2-1.0.7rc1# make install
- Step 3 : Check if the Installation is Successful
- Try these commands to check that mpich2 has been installed correctly.
  root@client-01:~/mpich2-1.0.7rc1# which mpd
  /usr/local/bin/mpd
  root@client-01:~/mpich2-1.0.7rc1# which mpicc
  /usr/local/bin/mpicc
  root@client-01:~/mpich2-1.0.7rc1# which mpiexec
  /usr/local/bin/mpiexec
  root@client-01:~/mpich2-1.0.7rc1# which mpirun
  /usr/local/bin/mpirun
The paths shown appear to come from an install under the default /usr/local prefix; for an install under /opt/mpich2, expect /opt/mpich2/bin/mpd and so on.
- If which xxxx prints nothing, you probably need to set your PATH environment variable.
  root@client-01:~/mpich2-1.0.7rc1# export PATH="$PATH:/opt/mpich2/bin"
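The four which checks above can be folded into one loop. This is a minimal sketch; missing_tools is a hypothetical helper name, and the PATH hint assumes the /opt/mpich2 install location used in this page.

```shell
#!/bin/sh
# Sketch: report which of the given commands are missing from PATH.
# missing_tools is a hypothetical helper name.
missing_tools() {
    for tool in "$@"; do
        # command -v fails when the tool cannot be found on PATH
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools mpd mpicc mpiexec mpirun)
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
    echo "If mpich2 is under /opt/mpich2, try: export PATH=\"\$PATH:/opt/mpich2/bin\""
else
    echo "All mpich2 commands were found."
fi
```

Running the same script on every node is a quick way to confirm that the install location really is identical everywhere.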
- Step 4 : Add New Configuration Files
- Create an mpd.hosts file on each node. You can put it anywhere on the system.
  root@client-01:~/mpich2-1.0.7rc1# touch mpd.hosts
  root@client-01:~/mpich2-1.0.7rc1# cat > mpd.hosts << "EOF"
  > client-01
  > client-02
  > client-03
  > EOF
- Check that the contents of mpd.hosts are correct.
  root@client-01:~/mpich2-1.0.7rc1# cat mpd.hosts
  client-01
  client-02
  client-03
- Create /etc/mpd.conf on every node. It contains just one line.
Note that all nodes must use the same password.
We use "this_is_password" as the example here.
  root@client-01:~/mpich2-1.0.7rc1# touch /etc/mpd.conf
  root@client-01:~/mpich2-1.0.7rc1# cat > /etc/mpd.conf << "EOF"
  > secretword=this_is_password
  > EOF
- Change the file permissions to 600 for security.
  root@client-01:~/mpich2-1.0.7rc1# chmod 600 /etc/mpd.conf
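In the steps above, /etc/mpd.conf exists briefly with default permissions before chmod runs. A minimal sketch of a variant that sets a restrictive umask first is shown below. The helper write_mpd_conf is hypothetical, and the demo writes to a scratch file rather than /etc/mpd.conf.

```shell
#!/bin/sh
# Sketch: create an mpd.conf-style file that other users can never read.
# write_mpd_conf is a hypothetical helper; FILE and SECRET come from the caller.
write_mpd_conf() {
    file=$1; secret=$2
    (
        umask 077    # files created in this subshell start out as mode 600
        printf 'secretword=%s\n' "$secret" > "$file"
    )
    chmod 600 "$file"    # also tightens a file that already existed
}

# Demo against a scratch file; on a real node FILE would be /etc/mpd.conf.
demo=$(mktemp)
write_mpd_conf "$demo" this_is_password
ls -l "$demo"
cat "$demo"
rm -f "$demo"
```

The umask is changed inside a subshell so that it does not affect anything else the calling shell does afterwards.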
- Step 5 : Modify SSH Connections
- First, check whether ssh is installed on your machines.
If it is not, run apt-get install ssh to install it.
ssh must be installed on all nodes.
- Second, set up ssh connections between all nodes that do not ask for a password.
Issue the following command on every node.
  root@client-01:~/mpich2-1.0.7rc1# ssh-keygen -t dsa
  Generating public/private dsa key pair.
  Enter file in which to save the key (/root/.ssh/id_dsa):
  Created directory '/root/.ssh'.
  Enter passphrase (empty for no passphrase):
  Enter same passphrase again:
  Your identification has been saved in /root/.ssh/id_dsa.
  Your public key has been saved in /root/.ssh/id_dsa.pub.
  The key fingerprint is:
  d7:ad:ce:0f:e2:d8:16:3d:92:10:b8:b3:7f:7b:68:d9 root@client-01
  root@client-01:~/mpich2-1.0.7rc1#
- Just press Enter at each prompt.
- Copy id_dsa.pub to authorized_keys, overwriting any existing file.
  root@client-01:~/mpich2-1.0.7rc1# cd /root/.ssh/
  root@client-01:~/.ssh# cp ./id_dsa.pub ./authorized_keys
- Copy all files under /root/.ssh/ to the same directory on all nodes, overwriting what is there.
- Establish ssh connections to all the other nodes from node 1 (client-01).
The first time you connect to each node, you must answer 'yes' to accept its host key.
After that, you will be logged in as root on the other nodes automatically, without typing a password.
  root@client-01:~# ssh root@client-02
  The authenticity of host 'client-02 (192.168.180.132)' can't be established.
  RSA key fingerprint is 27:17:fa:34:63:c4:a7:c1:ec:25:84:76:6d:03:2d:96.
  Are you sure you want to continue connecting (yes/no)? yes
- Repeat the operations above for all the other nodes. (With 10+ machines, this is a lot of work.)
- If any error occurs, remove /root/.ssh/known_hosts and ssh again.
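Repeating the copy and the first login by hand for every node is the tedious part of this step. The following is a minimal sketch that reads mpd.hosts and prints the commands for each remote node without running them, so the list can be reviewed first. The helper plan_key_copy is hypothetical, and the use of scp -r to copy /root/.ssh is an assumption about how the copy described above is carried out.

```shell
#!/bin/sh
# Sketch: print (not run) the per-node commands for sharing root's SSH keys.
# plan_key_copy is a hypothetical helper. Arguments: HOSTS_FILE LOCAL_HOSTNAME
plan_key_copy() {
    hosts_file=$1; self=$2
    while read -r host; do
        # skip blank lines and the node this script runs on
        if [ -z "$host" ] || [ "$host" = "$self" ]; then
            continue
        fi
        # assumed copy method: push the whole /root/.ssh directory
        echo "scp -r /root/.ssh root@$host:/root/"
        # first login: accept the host key, then confirm no password is asked
        echo "ssh root@$host true"
    done < "$hosts_file"
}

# Demo with the testbed's three nodes, run from client-01.
hosts=$(mktemp)
printf '%s\n' client-01 client-02 client-03 > "$hosts"
plan_key_copy "$hosts" client-01
rm -f "$hosts"
```

The scp command will still ask for the root password once per node, since passwordless login only works after the keys are in place.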
- Step 6 : Start MPICH2
- The easiest way to set up the parallel computing environment is to issue:
  root@client-01:~# mpdboot -n 3 -f /root/mpich2-1.0.7rc1/mpd.hosts
- Check the mpich2 nodes.
  root@client-01:~# mpdtrace
  client-01
  client-03
  client-02
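The node count passed to mpdboot and the check of the mpdtrace output can both be derived from mpd.hosts. This is a minimal sketch with hypothetical helper names count_hosts and ring_matches. The demo only prints the mpdboot command and compares against the mpdtrace output shown above, so it runs without mpich2 installed.

```shell
#!/bin/sh
# Sketch: derive the mpdboot node count from mpd.hosts and compare the ring
# reported by mpdtrace with the expected host list.
# count_hosts and ring_matches are hypothetical helper names.

# Count the non-blank lines in a hosts file.
count_hosts() {
    grep -c . "$1"
}

# Succeed if two newline-separated host lists hold the same names.
# Order is ignored, because mpdtrace does not list nodes in file order.
ring_matches() {
    [ "$(printf '%s\n' "$1" | sort)" = "$(printf '%s\n' "$2" | sort)" ]
}

# Demo with the testbed's values; echo shows the command instead of running it.
hosts=$(mktemp)
printf '%s\n' client-01 client-02 client-03 > "$hosts"
echo "mpdboot -n $(count_hosts "$hosts") -f $hosts"
traced="client-01
client-03
client-02"
if ring_matches "$(cat "$hosts")" "$traced"; then
    echo "All nodes joined the ring."
else
    echo "Some nodes are missing from the ring."
fi
rm -f "$hosts"
```

On a running cluster, the literal traced value would be replaced by the captured output of mpdtrace.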
3. Conclusion
- This is the simplest possible parallel computing environment, meant only for teaching and testing.
- Do not expect jobs submitted to it to run faster! The nodes are just virtual machines.
- Setting up an MPICH2 parallel computing environment becomes very difficult as the number of nodes grows.
So it is important to make good use of DRBL and Clonezilla to help build this kind of environment.
I will try to use DRBL to build a parallel machine with 10+ nodes later.