Using the globus device on GUSTO



Three features of the GUSTO environment make the use of the globus device particularly easy:

* The public-key-based implementation of Globus security services is used; this allows you to log on just once and then access resources at a variety of GUSTO sites. (However, you must first set up your Globus security environment; see below.)

* Globus process creation servers, called Globus Resource Allocation Managers (GRAMs) or simply resource managers, already exist on the various GUSTO resources.

* You have access to a Lightweight Directory Access Protocol (LDAP) database called the Metacomputing Directory Service (MDS) where information describing Globus installations is stored. We shall see how this is used below.

Perform the following experiment to verify that your machine has the Globus client software installed and configured to use GUSTO. Use the following ldapsearch command to query MDS for the names of all GUSTO resource managers:

% <globusinstalldir>/bin/ldapsearch "mn=*" | grep ^mn

which should result in something similar to this





Setting up security for GUSTO



Because GUSTO uses public key security, you must spend a few minutes setting up your security environment before you can start using the globus device on GUSTO. This entails generating a private key, obtaining a signed certificate from the GUSTO certificate authority (CA), and setting up some local configuration information. This process is somewhat complex but has to be performed only once. See http://www.globus.org/security/tutorial.html for a more detailed discussion of these topics.

    1. Set your GLOBUS_DIR environment variable to point to the top level Globus directory:

    % setenv GLOBUS_DIR <globusinstalldir>


    2. Augment your path to include the bin directories for both Globus and the SSL libraries, which must also be installed to use the globus device on GUSTO (see http://www.globus.org for details). These are


    $GLOBUS_DIR/bin 
    <sslinstalldir>/bin 
    

    3. Make a directory in which to store your certificate and key:


    % mkdir ~/cert
    % cd ~/cert
    

    4. Run the SSL certreq program to generate your certificate request and private key:

    % certreq

    Something like this will appear upon your screen. It may take a few seconds to finish generating the private key.


    Using configuration from 
    /soft/pub/packages/SSLeay-0.8.1/etc/ssleay.globus.cnf 
    Generating a 1024 bit RSA private key 
    ................................+++++ 
    ......................+++++ 
    writing new private key to 'newkey.pem' 
    Enter PEM pass phrase: 
    
    At this point, enter a pass phrase. A pass phrase is basically a password, except that it can be longer (64 characters) and can include spaces. The program immediately asks you to re-enter the pass phrase for verification. The subsequent screen output will be:


    Verifying password - Enter PEM pass phrase: 
    ----- 
    You are about to be asked to enter information that will be 
    incorporated 
    into your certificate request. 
    What you are about to enter is what is called a Distinguished Name 
    or a DN. 
    There are quite a few fields but you can leave some blank 
    For some fields there will be a default value, 
    If you enter '.', the field will be left blank. 
    ----- 
    Country Name (2 letter code) [US]: 
    
    You can press return to answer any question with the default value (within the brackets [] ). Otherwise, enter the appropriate information. The next few questions are as follows:


    Main Organization [Globus]: 
    Home Globus Site [Argonne National Laboratory]: 
    Organizational Unit Name (eg, section) [MCS]: 
    name (eg, Globus id without the @globus.org) []: 
    
    For name, enter your old ``globusid'', which tended to be your username or last name, (e.g., smartin, wsmith, foster).

    The next questions pertain to optional information not currently being used by the authentication program. You may simply hit return to all further questions. The screen output will be:


    Please enter the following 'extra' attributes 
    to be sent with your certificate request 
    A challenge password []:  
    An optional company name []: 
    Private key is in newkey.pem 
    Request is in newreq.pem 
    E-Mail the newreq.pem file to Globus CA:  
    ca@globus.org 
    
    At this point, you are done with the certreq script. It returns you to the UNIX prompt, having created two new files in your directory, newkey.pem and newreq.pem.


    5. Mail the newreq.pem file to the GUSTO CA:

    % mail ca@globus.org < newreq.pem

    You should receive a response from the CA with your signed certificate.


    6. Store the certificate received from the GUSTO CA (including the -----BEGIN CERTIFICATE----- and -----END CERTIFICATE----- delimiters) in the file ~/cert/newcert.pem.


    7. Set protection modes on the files in ~/cert:

    % chmod 444 newcert.pem 
    % chmod 400 newkey.pem 
    
    These protections are essential if you are to ensure the security and integrity of your private key and certificate. They make the certificate world readable but unalterable, while the private key file is readable only by you, the user, and is also unalterable.


    8. Use the setenv command to set the environment variables used to locate your certificate file, private key, and trusted certificates:


    % cd 
    % setenv X509_CERT_DIR $GLOBUS_DIR/share 
    % setenv X509_USER_CERT cert/newcert.pem 
    % setenv X509_USER_KEY cert/newkey.pem 
    % setenv X509_USER_PROXY cert/newproxy.pem 
    
    The first of these commands indicates the directory containing the trusted certificates; the second, the name of the file containing your certificate; the third, the name of the file containing your private key; and the fourth, the name of the file in which your temporary proxy certificate and key should be stored.


    9. Enable access to the resources that you wish to use. Before you can use any GUSTO resource, you must (a) have an account on that resource, and (b) have an entry in an access control list, called a globusmap file, associated with the resource. To obtain this entry, email the Globus administrator of each resource in question.
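
For users of a Bourne-style shell, steps 7 and 8 can be sketched as follows. This is an illustrative sketch, not part of the Globus distribution: the CERT_DIR variable, the fallback value for GLOBUS_DIR, and the helper functions file_mode and check_credentials are all hypothetical names introduced here. The sketch uses absolute paths under $HOME/cert rather than the relative paths shown in step 8, which is also an assumption.

```shell
#!/bin/sh
# Sketch only: Bourne-shell counterpart of the csh commands in steps 7 and 8.
# CERT_DIR and the GLOBUS_DIR fallback are assumptions made for illustration.
CERT_DIR="${CERT_DIR:-$HOME/cert}"
GLOBUS_DIR="${GLOBUS_DIR:-/usr/local/globus}"   # stand-in for <globusinstalldir>

export GLOBUS_DIR
export X509_CERT_DIR="$GLOBUS_DIR/share"
export X509_USER_CERT="$CERT_DIR/newcert.pem"
export X509_USER_KEY="$CERT_DIR/newkey.pem"
export X509_USER_PROXY="$CERT_DIR/newproxy.pem"

# file_mode FILE: print the ten-character mode string reported by ls -l.
file_mode() {
    ls -ld "$1" | cut -c1-10
}

# check_credentials: succeed only if both files exist and carry the
# protection modes set in step 7 (444 on the certificate, 400 on the key).
check_credentials() {
    [ -f "$X509_USER_CERT" ] && [ -f "$X509_USER_KEY" ] || {
        echo "certificate or key file is missing" >&2
        return 1
    }
    [ "$(file_mode "$X509_USER_CERT")" = "-r--r--r--" ] || {
        echo "expected mode 444 on $X509_USER_CERT" >&2
        return 1
    }
    [ "$(file_mode "$X509_USER_KEY")" = "-r--------" ] || {
        echo "expected mode 400 on $X509_USER_KEY" >&2
        return 1
    }
    echo "credential files look correctly protected"
}
```

Calling check_credentials after completing step 8 gives a quick confirmation that the certificate and key are where the environment variables say they are and that neither file is writable.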





Authenticating yourself to GUSTO



Now that you have set up your Globus security environment, you are ready to run some MPI programs. As a first step, we authenticate ourselves (``log on'') to GUSTO by running the command cinit. This command creates a temporary credential, which allows you to use GUSTO resources for a certain amount of time. This credential is stored in a file, the name of which must be recorded in the environment variable X509_USER_PROXY. Hence, we might type:

% setenv X509_USER_PROXY /tmp/my_temporary_cert
% cinit -out $X509_USER_PROXY

The temporary credential is valid for a default period of time, typically 12 hours. You can change this period by using the -hours flag on the cinit command, for example:

% setenv X509_USER_PROXY /tmp/my_temporary_cert
% cinit -hours 6 -out $X509_USER_PROXY

You may acquire a new security credential at any time. If you acquire a new credential before a previous one expires, the new credential simply overwrites the old one. Attempting to start an MPI application in the presence of an expired security credential will result in failure.





Using mpirun on GUSTO



Before typing your first mpirun command you must identify the computers on which you wish to run your application. This is done by listing the manager names associated with these computers in a machines file on your Globus client. For example, the following machines file


ico16.mcs.anl.gov-easymcs@globus.org 10
sp023e.sp.uh.edu-loadleveler@globus.org 5

names managers associated with two IBM SPs, one at Argonne and the other at the University of Houston. Assuming the machines file and executable myapp are in your current directory, you can then start your application as follows:

% mpirun -np 15 -globusargs stage myapp

This command loads myapp on 10 nodes on the Argonne SP, transfers a copy of myapp to the Houston SP (-globusargs stage), and loads myapp on 5 nodes there. (The number of nodes to create on each machine is specified by the count appearing at the end of each line in the machines file.) Issues of authentication and the submission of appropriate requests to the Argonne and Houston SP schedulers are handled automatically. All 15 nodes behave as a single MPI application, i.e., a single MPI_COMM_WORLD with nodes ranked 0 through 14. Standard output and standard error are routed back to the originating node. Hence, the behavior is identical to that of an MPI program running on a single 15-node parallel computer.

Determining resource manager names. This example assumes that you know the resource manager names for the computers on which you want to run. To learn the manager name for a particular machine, e.g., pitcairn.mcs.anl.gov, use the ldapsearch command introduced earlier:


yukon% ldapsearch "hn=pitcairn.mcs.anl.gov" | grep ^mn
mn=pitcairn.mcs.anl.gov-fork@globus.org, ou=MCS, o=Argonne National Laboratory, o=Globus, c=US
mn=pitcairn.mcs.anl.gov-fork@globus.org

which returns two different names for the one GRAM running on pitcairn.mcs.anl.gov. The first, longer name is called the distinguished name; we are only interested in the second, shorter name, pitcairn.mcs.anl.gov-fork@globus.org.

To find all the manager names at a particular location, e.g., Argonne National Laboratory, filter the output by domain name:


yukon% ldapsearch "mn=*" mn | grep ^mn | grep "anl.gov"
mn=pitcairn.mcs.anl.gov-fork@globus.org, ou=MCS, o=Argonne National Laboratory, o=Globus, c=US
mn=pitcairn.mcs.anl.gov-fork@globus.org
mn=ico16.mcs.anl.gov-fork@globus.org, ou=MCS, o=Argonne National Laboratory, o=Globus, c=US
mn=ico16.mcs.anl.gov-fork@globus.org
mn=ico16.mcs.anl.gov-easymcs@globus.org, ou=MCS, o=Argonne National Laboratory, o=Globus, c=US
mn=ico16.mcs.anl.gov-easymcs@globus.org
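
Because only the short names are needed in a machines file, the listing above can be reduced with a small filter. The following is a minimal sketch; the function name short_manager_names is hypothetical, and it assumes that a distinguished name always contains a comma on its first line while a short name never does. The demonstration feeds it lines copied from the listing rather than querying MDS.

```shell
#!/bin/sh
# short_manager_names: read ldapsearch output on standard input and print
# only the short manager names, one per line, with the leading "mn=" removed.
# Assumption: lines holding a distinguished name contain a comma.
short_manager_names() {
    grep '^mn=' | grep -v ',' | sed -e 's/^mn=//' -e 's/[[:space:]]*$//'
}

# Demonstration on lines copied from the listing (no MDS query is made).
short_manager_names <<'EOF'
mn=pitcairn.mcs.anl.gov-fork@globus.org, ou=MCS, o=Argonne National Laboratory,
o=Globus, c=US
mn=pitcairn.mcs.anl.gov-fork@globus.org
mn=ico16.mcs.anl.gov-easymcs@globus.org, ou=MCS, o=Argonne National Laboratory,
o=Globus, c=US
mn=ico16.mcs.anl.gov-easymcs@globus.org
EOF
```

In practice the filter would be placed at the end of the ldapsearch pipeline, and a node count appended by hand to each resulting line where more than one node is wanted.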

Specifying node counts. The optional numbers appearing at the end of each line in a machines file (default value = 1) are used to determine the maximum number of nodes to create on each machine. Hence, given the machines file above, the command mpirun -np 8 would start just 8 nodes on the Argonne SP. mpirun ``wraps around'' the machines file, so -np 18 would create 10 nodes on the Argonne SP, 5 nodes at Houston, and a further 3 nodes at Argonne. These three groups of nodes comprise three distinct ``subjobs.'' This distinction is important because communication within a subjob can be very different from communication between subjobs, particularly on MPPs such as the IBM SP. Communication between subjobs is always done using TCP/IP. Communication within a subjob is done using the fastest protocol available, for example IBM's MPL.
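
The wrap-around rule can be sketched as a short script that prints the subjobs a given -np value would produce. This is an illustration of the rule as described above, not the code mpirun actually runs; the function name plan_subjobs and its output format are hypothetical.

```shell
#!/bin/sh
# plan_subjobs NP MACHINES_FILE
# Print one line per subjob, walking the machines file in order and wrapping
# around to the first line until NP nodes have been placed.
# A missing or non-positive count on a line is treated as 1 (the default).
plan_subjobs() {
    awk -v np="$1" '
        BEGIN { n = 0; np += 0 }
        NF > 0 {
            name[n] = $1
            cap[n] = (NF >= 2 && $2 + 0 > 0) ? $2 + 0 : 1
            n++
        }
        END {
            if (n == 0) exit 1          # nothing to schedule on
            i = 0; job = 0
            while (np > 0) {
                take = (np < cap[i]) ? np : cap[i]
                printf "subjob %d: %d nodes on %s\n", job, take, name[i]
                np -= take
                job++
                i = (i + 1) % n         # wrap around the machines file
            }
        }
    ' "$2"
}

# Demonstration with the machines file from the text, read from standard input.
plan_subjobs 18 - <<'EOF'
ico16.mcs.anl.gov-easymcs@globus.org 10
sp023e.sp.uh.edu-loadleveler@globus.org 5
EOF
```

For -np 18 the sketch reports three subjobs of 10, 5, and 3 nodes, matching the description above; for -np 8 it reports a single subjob of 8 nodes on the Argonne SP.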

Staging. In the above example, we used the -globusargs stage option to request that our application be staged to the computers on which we wanted to run. Staging works here because the computers in question are binary compatible. If even one of the platforms listed in the machines file is not binary compatible then you may not use the -globusargs stage option. Instead, you must either stage executables manually, prior to running the program, or use the more flexible staging commands described in Advanced features of the globus device below.

If you wanted to run an application on a cluster of binary compatible workstations (one process on each) that all share the same filesystem, then staging is not required. In this case you write a machines file listing the Globus servers in your cluster, e.g.,


pitcairn.mcs.anl.gov-fork@globus.org 
tuva.mcs.anl.gov-fork@globus.org 
and omit the -globusargs stage option from the mpirun command:

% mpirun -np 2 myapp

Locating the machines file. The mpirun command determines which machines file to use as follows:

    1. If a -machinefile <machinefilename> argument is specified on the mpirun command, it uses that file; otherwise,
    2. it looks for a file named machines in the directory in which you typed mpirun; and finally,
    3. it looks for a machines file in <mpidir>/lib/<arch>/globus where <mpidir> is the directory where you built MPICH and <arch> is the architecture MPICH was built on, e.g., solaris, IRIX64, ... .

If none of these locations yields a machines file, mpirun fails.
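
The search order can be sketched as a shell function. This is a sketch of the rule as stated above, not an excerpt from mpirun; the function name find_machines_file is hypothetical, and it assumes that the file in the third location is also named machines.

```shell
#!/bin/sh
# find_machines_file EXPLICIT MPIDIR ARCH
#   EXPLICIT  value given with -machinefile, or "" if the option was omitted
#   MPIDIR    directory where MPICH was built
#   ARCH      architecture name, e.g. solaris or IRIX64
# Prints the machines file that would be used, or fails if none is found.
find_machines_file() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"                          # 1. explicit argument
    elif [ -f ./machines ]; then
        printf '%s\n' "./machines"                  # 2. current directory
    elif [ -f "$2/lib/$3/globus/machines" ]; then
        printf '%s\n' "$2/lib/$3/globus/machines"   # 3. MPICH build tree
    else
        echo "no machines file found" >&2
        return 1
    fi
}
```

Note that, as in the rule above, an explicit argument is taken as given; the sketch does not check that the named file exists.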





Advanced features of the globus device



As noted above, the -globusargs stage option does not support staging when machines are not binary compatible. In this situation, we must currently use a Resource Specification Language (RSL) request to specify the executable filename for each machine. This technique is very flexible, but rather complex; work is currently underway to simplify the manner in which these issues are addressed.

The easiest way to learn how to write your own RSL request is to study the one generated for you by mpirun. Consider the example where we wanted to run an application on a cluster of workstations. Recall our machines file looked like this:


pitcairn.mcs.anl.gov-fork@globus.org
tuva.mcs.anl.gov-fork@globus.org

To view the RSL request generated in this situation, without actually launching the program, we type the following mpirun command:

% mpirun -globusargs dumprsl -np 2 myapp 123 456

which produces the following output:


+
( &(resourceManagerContact="pitcairn.mcs.anl.gov:8711:pitcairn.mcs.anl.gov-fork@globus.org")
   (count=1)
   (label="subjob 0")
   (arguments=" 123 456")
   (directory=/homes/karonis/MPI/mpich.yukon/mpich/lib/IRIX64/globus)
   (executable=/homes/karonis/MPI/mpich.yukon/mpich/lib/IRIX64/globus/myapp)
)
( &(resourceManagerContact="tuva.mcs.anl.gov:8711:tuva.mcs.anl.gov-fork@globus.org")
   (count=1)
   (label="subjob 1")
   (arguments=" 123 456")
   (directory=/homes/karonis/MPI/mpich.yukon/mpich/lib/IRIX64/globus)
   (executable=/homes/karonis/MPI/mpich.yukon/mpich/lib/IRIX64/globus/myapp)
)

This RSL request specifies such things as the ``contact string'' for each GRAM that we will be using, the number of nodes that we want to create, a unique label for each subjob, the command line arguments, the directory in which the program is to run, and the name of the executable that we will be using.

We can write a custom RSL request by simply modifying the request obtained above. For example, if we want to use a different executable on the two different machines, we can simply change the executable name. Once we have made this change, we use the -globusrsl option to supply this modified request to mpirun:

% mpirun -globusrsl <myrslrequestfile>

where <myrslrequestfile> is the filename of your RSL request. You do not specify any other arguments to mpirun (e.g., -np, executable, command line arguments, etc.), and even the machines file is ignored, as all required information is contained in the RSL request.
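
One way to produce such a modified request is to regenerate it from a template rather than edit it by hand. The following sketch prints a two-subjob request in the same layout as the dumped request, with a different executable for each machine. The function name rsl_subjob, the directory /home/user/run, and the executable names myapp.irix and myapp.sun are hypothetical; only the attribute names and contact strings are taken from the dump shown earlier.

```shell
#!/bin/sh
# rsl_subjob CONTACT LABEL DIRECTORY EXECUTABLE ARGUMENTS
# Print one subjob clause, mirroring the layout produced by
# "mpirun -globusargs dumprsl".
rsl_subjob() {
    printf '( &(resourceManagerContact="%s")\n' "$1"
    printf '   (count=1)\n'
    printf '   (label="%s")\n' "$2"
    printf '   (arguments="%s")\n' "$5"
    printf '   (directory=%s)\n' "$3"
    printf '   (executable=%s)\n' "$4"
    printf ')\n'
}

# Hypothetical run directory and per-machine executables.
RUN_DIR=/home/user/run

printf '+\n'
rsl_subjob "pitcairn.mcs.anl.gov:8711:pitcairn.mcs.anl.gov-fork@globus.org" \
    "subjob 0" "$RUN_DIR" "$RUN_DIR/myapp.irix" " 123 456"
rsl_subjob "tuva.mcs.anl.gov:8711:tuva.mcs.anl.gov-fork@globus.org" \
    "subjob 1" "$RUN_DIR" "$RUN_DIR/myapp.sun" " 123 456"
```

Redirecting the output of this script to a file yields a request that can be passed to mpirun with the -globusrsl option. Comparing the result against a fresh dumprsl listing is a sensible check before the first run.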

RSL is a flexible language capable of doing much more than has been presented here. For example, it can be used to stage executables and to set environment variables on remote computers before starting execution. A full description of the language can be found at http://www.globus.org.


