Version 12 (modified by chwhs, 16 years ago)

How to Install Globus Toolkit Based on DRBL

  • 1. Introduction

    • This installation guide is based on the GT 4.2 Quickstart. We install GT 4.2 and DRBL on Debian.

  • 2. Install and set up on the DRBL server and clients

    • First, download gt4.2.0-all-source-installer.tar.gz and build it. After installation, follow the instructions in the GT 4.2 Quickstart to set up and test your machine. Below we list only the differences and modifications needed to run GT 4.2 on DRBL.

    • part_a) Globus Toolkit:

      • The default shared directory on each machine is $GLOBUS_LOCATION. This directory must be readable and writable by all machines.

      • "$GLOBUS_LOCATION/var" is the only subdirectory that should be moved to a non-NFS file system, because globus-scheduler-event-generator may not work correctly if locking globus-fork.log fails.

        • (1) on the DRBL_server
           cd $GLOBUS_LOCATION
           mv var /etc/
           ln -s /etc/var var
           chmod 622 $GLOBUS_LOCATION/var/globus-fork.log   # permissions should read "-rw--w--w-"
           cp -rf /etc/var /tftpboot/nodes/[client_ip]/etc/
          
        • (2) on all DRBL_clients
           chmod 622 $GLOBUS_LOCATION/var/globus-fork.log   # permissions should read "-rw--w--w-"
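
A quick way to confirm the result on each machine is to check the mode of the log file and see which file system it lives on. The sketch below is only an illustration: check_fork_log is a hypothetical helper name, and it assumes GNU stat (as shipped with Debian).

```shell
#!/bin/sh
# Print the octal permission bits of a file (GNU stat assumed).
file_mode() {
    stat -c '%a' "$1"
}

# Succeed only if the given log file exists and has mode 622 (-rw--w--w-).
# Also prints the file system type so an NFS mount is easy to spot.
check_fork_log() {
    log="$1"
    [ -f "$log" ] || { echo "missing: $log" >&2; return 1; }
    mode=$(file_mode "$log")
    if [ "$mode" != "622" ]; then
        echo "unexpected mode $mode on $log (expected 622)" >&2
        return 1
    fi
    echo "$log: mode $mode, fs type $(stat -f -c '%T' "$log")"
}
```

For example, check_fork_log "$GLOBUS_LOCATION/var/globus-fork.log" should report mode 622 and a file system type other than nfs once var has been moved.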
          
      • Each machine has its own /etc/grid-security directory, which stores the signed host (container) keys and certificates and the grid-mapfile used for user authorization.

        • (1) on the DRBL_server
          cp -rf /etc/grid-security /tftpboot/nodes/[client_ip]/etc/
          
        • (2) on each machine
          Request and sign a host certificate for the machine itself.
          
        • /etc/grid-security
           root@drbl-srv:/etc/grid-security# ls -l
          drwxr-xr-x  4 root root 4096 2008-09-10 21:45 certificates
          -rw-r--r--  1 root root 4625 2008-09-10 20:00 containercert.pem
          -r--------  1 root root  891 2008-08-15 22:59 containerkey.pem
          lrwxrwxrwx  1 root root   61 2008-09-10 21:04 globus-host-ssl.conf -> /etc/grid-security/certificates/globus-host-ssl.conf.71a89a47
          lrwxrwxrwx  1 root root   61 2008-09-10 21:04 globus-user-ssl.conf -> /etc/grid-security/certificates/globus-user-ssl.conf.71a89a47
          -rw-r--r--  1 root root  277 2008-08-25 21:10 grid-mapfile
          lrwxrwxrwx  1 root root   59 2008-09-10 21:04 grid-security.conf -> /etc/grid-security/certificates/grid-security.conf.71a89a47
          -rw-r--r--  1 root root 4625 2008-09-10 20:00 hostcert.pem
          -rw-r--r--  1 root root 1367 2008-08-15 01:22 hostcert_request.pem
          -r--------  1 root root  891 2008-08-15 01:22 hostkey.pem
          
        • /etc/grid-security/certificates
           root@drbl-srv:/etc/grid-security/certificates# ls -l
          -rw-r--r--  1 root root 1285 2008-09-08 10:16 71a89a47.0
          -rw-r--r--  1 root root 1344 2008-04-09 10:24 71a89a47.signing_policy
          -rw-r--r--  1 root root 2625 2008-09-10 14:37 globus-host-ssl.conf.71a89a47
          -rw-r--r--  1 root root 2625 2008-09-10 14:37 globus-user-ssl.conf.71a89a47
          -rw-r--r--  1 root root 1306 2008-04-09 11:27 grid-security.conf.71a89a47
          
      • Each machine should also have the following files in its own "/etc". Copy these files from the DRBL_server to all clients.
         /etc/sudoers
         /etc/services
         /etc/xinetd.d/myproxy
         /etc/xinetd.d/gridftp
         /etc/xinetd.d/globus-gatekeeper
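
Copying these files by hand for every client is error-prone, so a small loop can help. The following is a minimal sketch, assuming (as in the commands above) that each client's /etc is stored on the server under /tftpboot/nodes/[client_ip]/etc; push_to_clients is a hypothetical helper name, not part of DRBL.

```shell
#!/bin/sh
# Copy each listed file from the server root into every client tree.
# $1: server root (normally /)
# $2: nodes root (normally /tftpboot/nodes)
# remaining args: paths relative to the root, e.g. etc/sudoers
push_to_clients() {
    src_root="$1"; nodes_root="$2"; shift 2
    for node_dir in "$nodes_root"/*/; do
        # Skip the unexpanded pattern when there are no client directories.
        [ -d "$node_dir" ] || continue
        for rel in "$@"; do
            mkdir -p "$node_dir$(dirname "$rel")" || return 1
            cp -pf "$src_root/$rel" "$node_dir$rel" || return 1
        done
    done
}
```

On the DRBL_server this could be called as push_to_clients / /tftpboot/nodes etc/sudoers etc/services etc/xinetd.d/myproxy etc/xinetd.d/gridftp etc/xinetd.d/globus-gatekeeper.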
        
    • part_b) DRBL:

      • After you install the DRBL server, run "drblpush -i" and follow the example below to set up the client environments. Each machine needs a host certificate for Globus, so its hostname must stay fixed. To avoid editing configuration files (e.g. /etc/hosts) after every boot, and to keep Globus from failing to start, let the DHCP service on the DRBL server offer the same IP address to each client every time it boots.

        root@drbl-srv:/opt/drbl# /opt/drbl/sbin/drblpush -i -l 0
        
        ******************************************************
        Now we can collect the MAC address of clients!
        If you want to let the DHCP service in DRBL server offer same IP
        address to client every time when client boot, and you never did this
        procedure, you should do it now!
        If you already have those MAC addresses of clients, you can put them
        into different group files (These files number is the same number of
        networks cards for DRBL service). In this case, you can skip this
        step.
        This step helps you to record the MAC addresses of clients, then
        divide them into different groups. It will save your time and reduce
        the typos.
        The MAC addresses will be recorded turn by turn according to the boot
        of clients,
        and they will be put into different files according to the network
        card in server, file name will be like macadr-eth1.txt,
        macadr-eth2.txt... You can find them in directory /etc/drbl.
        Please boot the clients by order, make sure they boot from etherboot or PXE!
        Do you want to collect them ?
        [y/N] y
        ******************************************************
        OK! Let's do it!
        request_eth_port:eth1
        Stopping dhcp3-server ...
        Stopping DHCP server: dhcpd3.
        Stopping tftpd-hpa ...
        Stopping HPA's tftpd: in.tftpd.
        *****************************************************.
        Start detecting MAC address....
        Enter 1 or press Enter to view the collecting status.
        Enter 2 or q to finish collecting and quit.
        1
        =======================================
        00:0C:29:FA:20:5A
        00:0C:29:EB:9E:2F
        00:0C:29:1B:19:1E
        Total: 3
        =======================================
        Enter 1 or press Enter to view the collecting status.
        Enter 2 or q to finish collecting and quit.
        2
        *****************************************************.
        The collected MAC addresses from [eth1] are saved in file(s)
        separately: macadr-eth1.txt.
        These files are saved in directory /etc/drbl.
        ******************************************************
        OK! Let's continue...
        ******************************************************
        Do you want to let the DHCP service in DRBL server offer same IP
        address to the client every time when client boots (If you want this
        function, you have to collect the MAC addresses of clients, and save
        them in file(s) (as in the previous procedure)). This is for the
        clients connected to DRBL server's ethernet network interface eth1 ?
        [y/N] y
        ******************************************************
        OK! Please tell me the file name which contains the MAC address of
        clients line by line for eth1.
        [macadr-eth1.txt]
        ******************************************************
        What is the initial number do you want to use in the last set of
        digits in the IP (i.e. the initial value of d in the IP address
        a.b.c.d) for DRBL clients connected to this ethernet port eth1.
        [1] 10
        ******************************************************
        The file name you set is "macadr-eth1.txt".
        The clients number in this file is 3.
        We will set the IP address for the clients connected to DRBL server's
        ethernet network interface eth1 By the MAC address file you set, the
        IP addresses for the clients connected to DRBL server's ethernet
        network interface eth1 as: 192.168.0.10 - 192.168.0.12
        Accept ? [Y/n] y
        ******************************************************
        OK! Let's continue...
        ******************************************************
        The Layout for your DRBL environment:
        ******************************************************
                 NIC    NIC IP                    Clients
        +-----------------------------+
        |         DRBL SERVER         |
        |                             |
        |    +-- [eth0] 192.168.183.129 +- to WAN
        |                             |
        |    +-- [eth1] 192.168.0.254 +- to clients group 1 [ 3 clients, their IP
        |                             |            from 192.168.0.10 - 192.168.0.12]
        +-----------------------------+
        ******************************************************
        Total clients: 3
        ******************************************************
        Press Enter to continue...
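
Because the host certificates and /etc/hosts depend on these fixed addresses, it can be useful to list which MAC address is expected to receive which IP. The sketch below is an illustration only: mac_to_ip is a hypothetical helper, and it assumes that addresses are handed out sequentially in the order the MAC addresses appear in the file, which matches the range shown in the transcript but is not confirmed by it.

```shell
#!/bin/sh
# Print "MAC IP" pairs: the Nth MAC in the file gets prefix.(start + N - 1).
# $1: MAC list file (one address per line)
# $2: network prefix a.b.c
# $3: initial value of d in a.b.c.d
mac_to_ip() {
    awk -v prefix="$2" -v start="$3" '
        # Skip blank lines; n counts the MAC addresses seen so far.
        NF { printf "%s %s.%d\n", $1, prefix, start + n; n++ }
    ' "$1"
}
```

With the values from the transcript, mac_to_ip /etc/drbl/macadr-eth1.txt 192.168.0 10 would pair 00:0C:29:FA:20:5A with 192.168.0.10 and 00:0C:29:1B:19:1E with 192.168.0.12.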
        
      • You need to modify two files: "/opt/drbl/sbin/drbl-nfs-exports" and "/opt/drbl/sbin/drbl-gen-client-files".

        • (1) /opt/drbl/sbin/drbl-nfs-exports
           In the "for subnet in $subnet_list" loop, add one line.
          
              /usr/local/globus-4.2.0 $subnet.*($EXPORTS_NFS_RW_NRS_OPT) 
          
           In the "for ip in `get-client-ip-list`" loop, also add one line.
          
              /usr/local/globus-4.2.0 $ip($EXPORTS_NFS_RW_NRS_OPT)
          
           We assume that $GLOBUS_LOCATION is /usr/local/globus-4.2.0.
          
        • (2) /opt/drbl/sbin/drbl-gen-client-files
           You only need to add one line.
          
              $nfsserver:/usr/local/globus-4.2.0   /usr/local/globus-4.2.0   nfs    $FSTAB_NFS_RW_OPT
          
        • After modifying the two files, run "/opt/drbl/sbin/drblpush -c /etc/drbl/drblpush.conf" to replace the current /etc/exports.
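
After drblpush finishes, it is worth checking that the Globus directory actually appears in the regenerated exports file. A minimal sketch, assuming $GLOBUS_LOCATION is /usr/local/globus-4.2.0 as above; has_export is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed if the exports file has at least one entry for the given directory.
# $1: exports file (normally /etc/exports)
# $2: exported directory to look for (exact match on the first field)
has_export() {
    awk -v dir="$2" '$1 == dir { found = 1 } END { exit !found }' "$1"
}
```

For example: has_export /etc/exports /usr/local/globus-4.2.0 || echo "Globus directory is not exported".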
  • 3. Starting Globus container

    • After the DRBL server and clients are installed and set up, start the web services container to test whether Globus works. Because Globus Toolkit is installed on DRBL, its installation directory is shared by all machines, so every container sees the same job manager configuration under $GLOBUS_LOCATION/etc. For example, if PBS is installed on node1 and SGE on node2, and both the WS GRAM PBS jobmanager and the WS GRAM SGE jobmanager are built into the Globus web services container, follow method 1 or method 2 so that each node starts only its own jobmanager.

    • Note: Start the Globus container of the upstream server in the Index Service hierarchy first (see http://www.globus.org/toolkit/docs/4.2/4.2.0/admin/quickstart/index.html#q-vo).

      • method 1
         node1:
         (1) move $GLOBUS_LOCATION/etc/globus_wsrf_gram_SGE to another directory (e.g. $GLOBUS_LOCATION/)
         (2) start the Globus container
         (3) move $GLOBUS_LOCATION/globus_wsrf_gram_SGE back to $GLOBUS_LOCATION/etc/

         node2:
         (1) move $GLOBUS_LOCATION/etc/globus_wsrf_gram_PBS to another directory (e.g. $GLOBUS_LOCATION/)
         (2) start the Globus container
         (3) move $GLOBUS_LOCATION/globus_wsrf_gram_PBS back to $GLOBUS_LOCATION/etc/
        
      • method 2
         server:
         (1) move $GLOBUS_LOCATION/etc/globus_wsrf_gram_PBS to a non-NFS directory on node1 (e.g. /tftpboot/nodes/[node1]/etc/)
         (2) move $GLOBUS_LOCATION/etc/globus_wsrf_gram_SGE to a non-NFS directory on node2 (e.g. /tftpboot/nodes/[node2]/etc/)
         (3) cd $GLOBUS_LOCATION/etc
             ln -s /etc/globus_wsrf_gram_PBS globus_wsrf_gram_PBS
             ln -s /etc/globus_wsrf_gram_SGE globus_wsrf_gram_SGE
         (4) start the Globus container on node1 and node2, for example with:
             /opt/drbl/bin/drbl-doit /etc/init.d/globus-ws-java-container start
             Note: Check the installation directory of DRBL and the command for starting Globus,
                   because you may have installed them in other locations.
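
The three steps of method 1 can be wrapped in one function so that the hidden directory is always moved back. This is a sketch under stated assumptions: start_without is a hypothetical helper, and it assumes the start command returns only after the container has read its configuration. Because $GLOBUS_LOCATION is shared over NFS, the nodes should be started one at a time with this approach.

```shell
#!/bin/sh
# Start the container while one job manager configuration is hidden.
# $1: Globus installation directory
# $2: directory name under etc/ to hide, e.g. globus_wsrf_gram_SGE
# remaining args: the command that starts the container
start_without() {
    globus="$1"; hide="$2"; shift 2
    [ -d "$globus/etc/$hide" ] || { echo "not found: $globus/etc/$hide" >&2; return 1; }
    # Step (1): move the unwanted configuration out of etc/.
    mv "$globus/etc/$hide" "$globus/$hide" || return 1
    # Step (2): run the start command and remember its exit status.
    "$@"
    status=$?
    # Step (3): move the directory back, even if the start command failed.
    mv "$globus/$hide" "$globus/etc/$hide" || return 1
    return "$status"
}
```

On node1 this might be called as start_without "$GLOBUS_LOCATION" globus_wsrf_gram_SGE /etc/init.d/globus-ws-java-container start, and on node2 with globus_wsrf_gram_PBS instead.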
        
    • With either method, the WS GRAM PBS jobmanager starts on node1 but not on node2, and the WS GRAM SGE jobmanager starts on node2 but not on node1.
       root@node1:~# ps aux | grep pbs
       root 11941 0.0 0.0 4584 1100 ? S Oct06 0:00 /usr/local/globus-4.2.0/libexec/globus-scheduler-event-generator -s pbs -t 1197005617
      
       root@node2:~# ps aux | grep sge
       root 11943 0.0 0.0 4588 1000 ? S Oct06 0:01 /usr/local/globus-4.2.0/libexec/globus-scheduler-event-generator -s sge -t 1223308213
      
