== Install mpich2 ==
=== 1. Testbed Introduction ===
 * VMware Workstation 5.0
 * Create three nodes, client-01 ~ client-03, each with a 5GB HD and 128MB of memory.
 * The OS on all three nodes is Ubuntu-7.10-server.
 * Hostnames and IP addresses:
   * client-01 192.168.180.131
   * client-02 192.168.180.132
   * client-03 192.168.180.133
=== 2. Getting Started: Installing mpich2 ===
 * '''Step 1''' : Modify /etc/hosts
   * Our testbed has three machines, so /etc/hosts must be edited on each node.[[BR]] The example below is for client-01; do the same on the other two nodes.
     {{{
     root@client-01:~# cat > /etc/hosts << "EOF"
     > 127.0.0.1 localhost
     > 192.168.180.131 client-01
     > 192.168.180.132 client-02
     > 192.168.180.133 client-03
     > EOF
     }}}
   * Check the contents of /etc/hosts; they should look like this.
     {{{
     root@client-01:~# cat /etc/hosts
     127.0.0.1 localhost
     192.168.180.131 client-01
     192.168.180.132 client-02
     192.168.180.133 client-03
     }}}
   * Notice that if there is a line __127.0.0.1 client-01__ in /etc/hosts, it must be deleted; otherwise the node's own hostname resolves to the loopback address and the other nodes cannot reach it.
----
 * '''Step 2''' : Download and Install mpich2
   * We use mpich2-1.0.7rc1 as the example.
     {{{
     root@client-01:~# wget http://www.mcs.anl.gov/research/projects/mpich2/downloads/tarballs/mpich2-1.0.7rc1.tar.gz
     root@client-01:~# tar -zxvf mpich2-1.0.7rc1.tar.gz
     root@client-01:~# cd mpich2-1.0.7rc1/
     }}}
   * You can choose any install location, but all the nodes must use the same one![[BR]] We use ''/opt/mpich2'' here as the example.
     {{{
     root@client-01:~/mpich2-1.0.7rc1# ./configure CFLAGS="-fPIC" CXXFLAGS="-fPIC" FFLAGS="-fPIC" --prefix=/opt/mpich2
     }}}
   * If configure reports errors, you may be missing packages or libraries such as a C/C++ compiler.[[BR]] Try this:
     {{{
     root@client-01:~/mpich2-1.0.7rc1# apt-get install build-essential
     }}}
   * Compile and install.
     {{{
     root@client-01:~/mpich2-1.0.7rc1# make
     root@client-01:~/mpich2-1.0.7rc1# make install
     }}}
----
 * '''Step 3''' : Check if the Installation is Successful
   * Try these commands to see whether mpich2 has been installed correctly.[[BR]] The paths shown below correspond to the default ''/usr/local'' prefix; with ''/opt/mpich2'' as the prefix, expect paths under ''/opt/mpich2/bin'' instead.
     {{{
     root@client-01:~/mpich2-1.0.7rc1# which mpd
     /usr/local/bin/mpd
     root@client-01:~/mpich2-1.0.7rc1# which mpicc
     /usr/local/bin/mpicc
     root@client-01:~/mpich2-1.0.7rc1# which mpiexec
     /usr/local/bin/mpiexec
     root@client-01:~/mpich2-1.0.7rc1# which mpirun
     /usr/local/bin/mpirun
     }}}
   * If ''which xxxx'' prints nothing, you probably need to set your PATH environment variable. The same applies to a normal user. For convenience, add the export line to the user's ~/.bashrc or ~/.profile.
     {{{
     root@client-01:~/mpich2-1.0.7rc1# export PATH="$PATH:/opt/mpich2/bin"
     }}}
     or
     {{{
     user@client-01:~/mpich2-1.0.7rc1$ export PATH="$PATH:/opt/mpich2/bin"
     }}}
----
 * '''Step 4''' : Add New Configuration Files
   * Add an ''mpd.hosts'' file on all the nodes; you can put it anywhere in your system.
     {{{
     root@client-01:~/mpich2-1.0.7rc1# touch mpd.hosts
     root@client-01:~/mpich2-1.0.7rc1# cat > mpd.hosts << "EOF"
     > client-01
     > client-02
     > client-03
     > EOF
     }}}
   * Check that the content of mpd.hosts is correct.
     {{{
     root@client-01:~/mpich2-1.0.7rc1# cat mpd.hosts
     client-01
     client-02
     client-03
     }}}
   * To run mpd as '''root''':[[BR]] Add ''/etc/mpd.conf'' on every node; only one line is needed in this file.[[BR]] Notice that all the nodes must have the same password.[[BR]] We use "this_is_password" as the example here.[[BR]]
     {{{
     root@client-01:~/mpich2-1.0.7rc1# touch /etc/mpd.conf
     root@client-01:~/mpich2-1.0.7rc1# cat > /etc/mpd.conf << "EOF"
     > secretword=this_is_password
     > EOF
     }}}
   * To run mpd as a '''normal user''':[[BR]] Add ''~/.mpd.conf'' in the same user's home directory on every node; only one line is needed in this file.[[BR]] Notice that all the nodes must have the same password.[[BR]] We use "this_is_user_password" as the example here.[[BR]]
     {{{
     user@client-01:~/mpich2-1.0.7rc1$ touch ~/.mpd.conf
     user@client-01:~/mpich2-1.0.7rc1$ cat > ~/.mpd.conf << "EOF"
     > secretword=this_is_user_password
     > EOF
     }}}
   * Change the file permissions to 600 for security; do the same for ''~/.mpd.conf'' when running as a normal user.
     {{{
     root@client-01:~/mpich2-1.0.7rc1# chmod 600 /etc/mpd.conf
     user@client-01:~/mpich2-1.0.7rc1$ chmod 600 ~/.mpd.conf
     }}}
----
 * '''Step 5''' : Modify SSH Connections
   * First, check whether ssh has been installed on your machines.[[BR]] If not, use ''apt-get install ssh'' to install it.[[BR]] ssh must be installed on all nodes.[[BR]]
   * Second, set up ssh connections between all nodes that do not require entering a password.[[BR]] Issue the following commands on '''ALL''' the nodes (client-01 is shown here).
     {{{
     root@client-01:~/mpich2-1.0.7rc1# ssh-keygen -t dsa
     Generating public/private dsa key pair.
     Enter file in which to save the key (/root/.ssh/id_dsa):
     Created directory '/root/.ssh'.
     Enter passphrase (empty for no passphrase):
     Enter same passphrase again:
     Your identification has been saved in /root/.ssh/id_dsa.
     Your public key has been saved in /root/.ssh/id_dsa.pub.
     The key fingerprint is:
     d7:ad:ce:0f:e2:d8:16:3d:92:10:b8:b3:7f:7b:68:d9 root@client-01
     root@client-01:~/mpich2-1.0.7rc1#
     }}}
   * Just press Enter at each prompt to accept the defaults and leave the passphrase empty.
   * Copy id_dsa.pub over ./authorized_keys.
     {{{
     root@client-01:~/mpich2-1.0.7rc1# cd /root/.ssh/
     root@client-01:~/.ssh# cp ./id_dsa.pub ./authorized_keys
     }}}
   * Copy all files under /root/.ssh/ to the same directory on all nodes, overwriting what is there.
   * Establish ssh connections to all the other nodes from node 1 (client-01).[[BR]] The first time you connect to each node, you need to answer 'yes'.[[BR]] After that, you will be logged in as root on the other nodes automatically, without typing a password.[[BR]]
     {{{
     root@client-01:~# ssh root@client-02
     The authenticity of host 'client-02 (192.168.180.132)' can't be established.
     RSA key fingerprint is 27:17:fa:34:63:c4:a7:c1:ec:25:84:76:6d:03:2d:96.
     Are you sure you want to continue connecting (yes/no)? yes
     }}}
   * '''Repeat the operations above on all the other nodes.''' (If you have 10+ machines, this will be a lot of work.)
   * If any error occurs, just remove /root/.ssh/known_hosts and connect again.
----
 * '''Step 6''' : Start MPICH2
   * The easiest way to start the parallel computing environment is to issue:
     {{{
     root@client-01:~# mpdboot -n 3 -f /root/mpich2-1.0.7rc1/mpd.hosts
     }}}
   * Check the mpich2 nodes.
     {{{
     root@client-01:~# mpdtrace
     client-01
     client-03
     client-02
     }}}
   * You can measure how long a message takes to travel around the ring from the first node to the last, for example:
     {{{
     root@client-01:~# mpdringtest
     time for 1 loops = 0.00142002105713 seconds
     root@client-01:~# mpdringtest 3
     time for 3 loops = 0.00336718559265 seconds
     root@client-01:~# mpdringtest 100
     time for 100 loops = 0.108873844147 seconds
     }}}
   * To shut down mpd on all nodes, issue this command:
     {{{
     root@client-01:~# mpdallexit
     }}}
=== 3. Conclusion ===
 * This is the simplest possible parallel computing environment, meant only for teaching and testing.
 * Do not expect jobs submitted to it to run faster! The nodes are just virtual machines.
 * Setting up an MPICH2 parallel computing environment becomes very tedious as the number of nodes grows.[[BR]] So it is important to make good use of DRBL and Clonezilla to help build this environment.[[BR]] I will try to use DRBL to build a parallel machine with 10+ nodes later.
----
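Until DRBL is in place, some of the repetition in Steps 1 and 4 can be scripted. The following is only a minimal sketch, not part of the original procedure: the `NODES` list, the function names `make_hosts`, `make_mpd_hosts` and `print_sync_commands`, and the scratch directory are all illustrative assumptions. It generates the three files in a temporary directory and only prints the copy commands instead of running them, so it is safe to try on any machine.

```shell
#!/bin/sh
# Sketch: generate /etc/hosts, mpd.hosts and mpd.conf for the testbed.
# NODES, the function names and the scratch directory are assumptions
# made for illustration; adjust them to your own nodes.

NODES="client-01:192.168.180.131 client-02:192.168.180.132 client-03:192.168.180.133"

# Print the /etc/hosts content used in Step 1.
make_hosts() {
    printf '127.0.0.1 localhost\n'
    for entry in $NODES; do
        name=${entry%%:*}   # text before the colon
        ip=${entry#*:}      # text after the colon
        printf '%s %s\n' "$ip" "$name"
    done
}

# Print the mpd.hosts content used in Step 4 (one hostname per line).
make_mpd_hosts() {
    for entry in $NODES; do
        printf '%s\n' "${entry%%:*}"
    done
}

# Print (do not run) the scp commands for every node except the given one.
print_sync_commands() {
    self=$1
    for entry in $NODES; do
        name=${entry%%:*}
        [ "$name" = "$self" ] && continue
        printf 'scp %s/mpd.conf root@%s:/etc/mpd.conf\n' "$OUT" "$name"
    done
}

# Write everything into a scratch directory instead of touching /etc.
OUT=$(mktemp -d)
make_hosts > "$OUT/hosts"
make_mpd_hosts > "$OUT/mpd.hosts"
# umask 077 creates mpd.conf with mode 600, matching the chmod in Step 4.
( umask 077; printf 'secretword=%s\n' "this_is_password" > "$OUT/mpd.conf" )

echo "generated files in $OUT"
print_sync_commands client-01
```

After checking the generated files, you could copy them into place by hand or run the printed scp commands once the password-less ssh from Step 5 is working.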