mpirun - Run MPI programs on LAM nodes.

SYNTAX

       mpirun [-fhvO] [-c <#> | -np <#>] [-D | -wd <dir>] [-ger |
              -nger] [-c2c | -lamd] [-nsigs] [-nw | -w] [-nx]
              [-pty] [-s <node>] [-t | -toff | -ton] [-x
              VAR1[=VALUE1][,VAR2[=VALUE2],...]]  [<nodes>]
              <program> [-- <args>]


       mpirun [-fhvO] [-D | -wd <dir>] [-ger | -nger] [-lamd |
              -c2c] [-nsigs] [-nw | -w] [-nx] [-pty] [-t | -toff
              | -ton] [-x VAR1[=VALUE1][,VAR2[=VALUE2],...]]
              <schema>


OPTIONS

       There are two forms of the mpirun command -- one for  pro­
       grams  (i.e., SPMD-style applications), and one for appli­
       cation schemas (see appschema(5)).  Both forms  of  mpirun
       use  the  following  options  by  default:  -c2c -nger -w.
       These may each be overridden by their counterpart options,
       described below.

       Additionally,  mpirun  will send the name of the directory
       where it was invoked on the local node to each of the  re­
       mote  nodes, and attempt to change to that directory.  See
       the "Current Working Directory" section, below.

       -c <#>    Synonym for -np (see below).

       -c2c      Use "client to client" (c2c) mode for MPI commu­
                 nication  in  the  user  program.  This mode can
                 significantly speed  up  some  applications,  as
                 messages will be passed directly from the source
                 rank to the destination rank;  the  LAM  daemons
                 will  not be used as third-party message passing
                 agents.  However, this disables  monitoring  and
                 debugging capabilities; see MPI(7).  This option
                 is mutually exclusive with -lamd.

       -D        Use the executable program location as the  cur­
                 rent  working  directory  for created processes.
                 The current working  directory  of  the  created
                 processes  will be set before the user's program
                 is invoked.  This option is  mutually  exclusive
                 with -wd.

       -f        Do not configure standard I/O file descriptors -
                 use defaults.

       -h        Print useful information on this command.

       -ger      Enable GER (Guaranteed Envelope Resources).  See
                 MPI(7) for a description of GER.  This option is
                 mutually exclusive with -nger.

       -lamd     Use the LAM "daemon mode" for MPI communication.
                 See -c2c (above) and MPI(7) for a description of
                 the "daemon mode" communication.

       -nger     Disable  GER  (Guaranteed  Envelope  Resources).
                 This option is mutually exclusive with -ger.

       -nsigs    Do not have LAM catch signals.

       -np <#>   Run this many copies of the program on the given
                 nodes.  This option indicates that the specified
                 file is an executable program and not an  appli­
                 cation  schema.   If no nodes are specified, all
                 LAM nodes are  considered  for  scheduling;  LAM
                 will  schedule  the  programs  in  a round-robin
                 fashion, "wrapping around" (and scheduling  mul­
                 tiple copies on a single node) if necessary.

       -nw       Do not wait for all processes to complete before
                 exiting mpirun.  This option is mutually  exclu­
                 sive with -w.

       -nx       Do  not  automatically export LAM_MPI_* environ­
                 ment variables to the remote nodes.

       -O        Multicomputer is homogeneous.  Do not perform data
                 conversion when passing messages.

       -pty      Enable pseudo-tty support.  Among other things,
                 this enables line-buffered output (which is
                 probably what you want).  This feature is not
                 enabled by default only because it is new and has
                 not yet been extensively tested.

       -s <node> Load the program from this node.  This option is
                 not  valid on the command line if an application
                 schema is specified.

       -t, -ton  Enable execution trace generation for  all  pro­
                 cesses.   Trace  generation will proceed with no
                 further action.  These options are mutually  ex­
                 clusive with -toff.

       -toff     Enable  execution  trace generation for all pro­
                 cesses.  Trace generation will begin after  pro­
                 cesses collectively call MPIL_Trace_on(2).  This
                 option is mutually exclusive with -t and -ton.


       -w        Wait  for all applications to exit before mpirun
                 exits.

       -wd <dir> Change to the directory <dir> before the user's
                 program executes.  Note that if the -wd option
                 appears both on the command line and in an
                 application schema, the schema will take
                 precedence over the command line.  This option is
                 mutually exclusive with -D.

       -x        Export  the  specified  environment variables to
                 the remote nodes before executing  the  program.
                 Existing  environment variables can be specified
                 (see the Examples section, below), or new  vari­
                 able  names specified with corresponding values.
                 The parser for the -x option is not very sophis­
                 ticated; it does not even understand quoted val­
                 ues.  Users are advised to set variables in  the
                 environment,  and then use -x to export (not de­
                 fine) them.

       -- <args> Pass these runtime arguments to every  new  pro­
                 cess.   This must always be the last argument to
                 mpirun.  This option is not valid on the command
                 line if an application schema is specified.


DESCRIPTION

       One invocation of mpirun starts an MPI application running
       under LAM.  If the application is simply SPMD, the  appli­
       cation  can  be  specified on the mpirun command line.  If
       the application is MIMD, comprising multiple programs,  an
       application  schema  is  required in a separate file.  See
       appschema(5) for a description of the  application  schema
       syntax,  but  it essentially contains multiple mpirun com­
       mand lines, less the command name itself.  The ability  to
       specify  different options for different instantiations of
       a program is another reason to use an application  schema.

   Application Schema or Executable Program?
       To  distinguish  the  two different forms, mpirun looks on
       the command line for <nodes> or the -c option.  If neither
       is  specified,  then the file named on the command line is
       assumed to be an application schema.   If  either  one  or
       both are specified, then the file is assumed to be an exe­
       cutable program.  If <nodes> and -c  both  are  specified,
       then  copies  of  the program are started on the specified
       nodes according to  an  internal  LAM  scheduling  policy.
       Specifying just one node effectively forces LAM to run all
       copies of the program in one place.  If -c is  given,  but
       not  <nodes>,  then all LAM nodes are used.  If <nodes> is
       given, but not -c, then one copy of the program is run on
       each of the specified nodes.

       By default, LAM searches for executable programs on the
       target node where a particular instantiation will run.  If
       the  file system is not shared, the target nodes are homo­
       geneous, and the program is frequently recompiled, it  can
       be  convenient  to  have  LAM  transfer the program from a
       source node (usually the local node) to each target  node.
       The  -s  option specifies this behavior and identifies the
       single source node.
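
       The rules above for telling an application schema from an
       executable program, and for placing copies on nodes, can be
       sketched as follows.  This is an illustrative model only,
       not LAM's implementation: the function name plan_launch and
       its parameters are hypothetical, and the use of round-robin
       placement when both <nodes> and -c are given is an
       assumption (the text only says "an internal LAM scheduling
       policy").

```python
from itertools import cycle, islice


def plan_launch(all_nodes, nodes=None, copies=None):
    """Return None if the file is treated as an application schema,
    otherwise the list of nodes that would each receive one copy.

    all_nodes -- every node in the running LAM (hypothetical input)
    nodes     -- the <nodes> given on the command line, or None
    copies    -- the value of -c / -np, or None
    """
    if nodes is None and copies is None:
        # Neither <nodes> nor -c: the file is assumed to be a schema.
        return None
    if copies is None:
        # <nodes> only: one copy per listed node.
        return list(nodes)
    # -c given: use the listed nodes, or all LAM nodes if none listed.
    pool = list(nodes) if nodes is not None else list(all_nodes)
    # Round-robin with wrap-around, as described for -np.
    # ASSUMPTION: the same policy is used when <nodes> is also given.
    return list(islice(cycle(pool), copies))
```

       For example, plan_launch(["n0", "n1", "n2"], copies=5)
       places the fourth and fifth copies back on n0 and n1.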

   Locating Files
       LAM looks for an executable program by searching  the  di­
       rectories  in  the user's PATH environment variable as de­
       fined on the source node(s).  This behavior is  consistent
       with  logging  into the source node and executing the pro­
       gram from the shell.  On remote nodes, the "." path is the
       home directory.

       LAM  looks for an application schema in three directories:
       the local directory, the value of the LAMAPPLDIR  environ­
       ment  variable, and LAMHOME/boot, where LAMHOME is the LAM
       installation directory.
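
       The schema lookup can be pictured with the short sketch
       below.  It is a hypothetical model, not LAM code: the
       function names are invented, the installation directory is
       passed in as a plain parameter, and the search order is
       assumed to follow the order in which the directories are
       listed above.

```python
import os
import posixpath


def schema_candidates(name, cwd=".", lamappldir=None, lamhome=None):
    """List the paths where a schema called `name` would be looked for.

    ASSUMPTION: directories are tried in the order the text lists them.
    """
    dirs = [cwd]
    if lamappldir:
        dirs.append(lamappldir)  # value of the LAMAPPLDIR variable
    if lamhome:
        dirs.append(posixpath.join(lamhome, "boot"))  # LAMHOME/boot
    return [posixpath.join(d, name) for d in dirs]


def find_schema(name, **kwargs):
    """Return the first candidate that exists as a file, else None."""
    for path in schema_candidates(name, **kwargs):
        if os.path.isfile(path):
            return path
    return None
```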

   Standard I/O
       LAM directs UNIX standard input to /dev/null on all remote
       nodes.   On  the  local node that invoked mpirun, standard
       input is inherited from mpirun.  By default, mpirun waits
       for all processes to exit (the behavior formerly requested
       with -w), which prevents conflicting access to the
       terminal.

       LAM directs UNIX standard output and error to the LAM dae­
       mon  on  all  remote  nodes.   LAM ships all captured out­
       put/error to the node that invoked mpirun and prints it on
       the  standard output/error of mpirun.  Local processes in­
       herit the standard output/error of mpirun and transfer  to
       it directly.

       Thus  it  is possible to redirect standard I/O for LAM ap­
       plications by using the typical shell  redirection  proce­
       dure on mpirun.

              % mpirun N my_app < my_input > my_output

       The  -f  option  avoids  all the setup required to support
       standard I/O described above.  Remote processes  are  com­
       pletely  directed to /dev/null and local processes inherit
       file descriptors from lamboot(1).

   Pseudo-tty support
       The -pty option enables pseudo-tty support for process
       output.  This allows, among other things, for
       line-buffered output from remote nodes (which is probably
       what you want).  It is not enabled by default because it
       has not been thoroughly tested on a variety of different
       Unixes.  Users are encouraged to use -pty and report any
       problems back to the LAM Team.

   Current Working Directory
       The default behavior of mpirun has changed with respect to
       the directory that processes will be started in.

       The  -wd  option to mpirun allows the user to change to an
       arbitrary directory before their program is  invoked.   It
       can  also  be  used in application schema files to specify
       working directories on specific nodes and/or for  specific
       applications.

       If the -wd option appears both in a schema file and on the
       command line, the schema file directory will override  the
       command line value.

       The -D option will change the current working directory to
       the directory where the executable resides.  It cannot  be
       used  in application schema files.  -wd is mutually exclu­
       sive with -D.

       If neither -wd nor -D is specified, the local node will
       send the name of the directory where mpirun was invoked to
       each of the remote nodes.  The remote nodes will then try
       to change to that directory.  If they fail (e.g., if the
       directory does not exist on that node), they will start
       from the user's home directory.

       All directory changing occurs before the user's program is
       invoked; it does not wait until MPI_INIT is called.
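
       The precedence among these rules can be summarized in a
       small sketch.  It is a model of the text above, not of
       LAM's source: the function working_directory, its
       parameters, and the `exists` callback (standing in for
       "can the remote node change to this directory?") are all
       hypothetical.

```python
import posixpath


def working_directory(invoked_from, home, program=None, cmdline_wd=None,
                      schema_wd=None, use_D=False, exists=lambda d: True):
    """Pick the directory a process would start in.

    invoked_from -- directory where mpirun was run on the local node
    home         -- the user's home directory on the target node
    """
    if cmdline_wd is not None and use_D:
        raise ValueError("-wd and -D are mutually exclusive")
    if schema_wd is not None:
        return schema_wd          # schema value overrides the command line
    if cmdline_wd is not None:
        return cmdline_wd
    if use_D:
        return posixpath.dirname(program)  # directory of the executable
    # Default: try the invocation directory, fall back to home.
    return invoked_from if exists(invoked_from) else home
```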

   Process Environment
       Processes in the MPI application inherit their environment
       from  the  LAM daemon upon the node on which they are run­
       ning.  The environment of a LAM daemon is fixed upon boot­
       ing  of  the LAM with lamboot(1) and is inherited from the
       user's shell.  On the origin node this will be  the  shell
       from which lamboot(1) was invoked and on remote nodes this
       will be the shell started by rsh(1).  When running
       dynamically linked applications which require the
       LD_LIBRARY_PATH environment variable to be set, care must
       be taken to ensure that it is correctly set when booting
       the LAM.

   Exported Environment Variables
       All environment variables  that  are  named  in  the  form
       LAM_MPI_*  will automatically be exported to new processes
       on the local and remote nodes.  This exporting may be  in­
       hibited with the -nx option.

       While the syntax of the -x option allows the definition of
       new variables, note that the parser  for  this  option  is
       currently not very sophisticated - it does not even under­
       stand quoted values.  Users are advised to  set  variables
       in  the  environment and use -x to export them; not to de­
       fine them.
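
       To see why quoted values are a problem, consider the
       deliberately naive model below.  It is NOT LAM's parser;
       parse_x and exported_env are hypothetical names, and the
       splitting rule (commas separate items, "=" introduces a
       value) is an assumption taken from the -x syntax shown in
       SYNTAX.  A quoted value containing a comma is cut in two.

```python
def parse_x(spec):
    """Naively split a -x argument into (name, value-or-None) pairs."""
    items = []
    for item in spec.split(","):
        name, sep, value = item.partition("=")
        items.append((name, value if sep else None))
    return items


def exported_env(env, x_spec=None, nx=False):
    """Model which variables reach new processes.

    LAM_MPI_* variables are exported unless nx is set; names given
    with -x are added, taking their value from env when not defined.
    """
    out = {} if nx else {k: v for k, v in env.items()
                         if k.startswith("LAM_MPI_")}
    for name, value in (parse_x(x_spec) if x_spec else []):
        if value is not None:
            out[name] = value          # defined on the command line
        elif name in env:
            out[name] = env[name]      # exported from the environment
    return out
```

       With this model, parse_x('MSG="a,b"') yields the two
       broken items ('MSG', '"a') and ('b"', None), which is the
       kind of surprise that setting MSG in the environment and
       passing only -x MSG avoids.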

   Trace Generation
       Two switches control trace generation from processes  run­
       ning  under  LAM  and  both must be in the on position for
       traces to actually be generated.  The first switch is con­
       trolled  by  mpirun and the second switch is initially set
       by  mpirun  but   can   be   toggled   at   runtime   with
       MPIL_Trace_on(2)  and  MPIL_Trace_off(2).  The -t (-ton is
       equivalent) and  -toff  options  all  turn  on  the  first
       switch.   Otherwise  the  first switch is off and calls to
       MPIL_Trace_on(2) in the application program  are  ineffec­
       tive.  The -t option also turns on the second switch.  The
       -toff  option  turns   off   the   second   switch.    See
       MPIL_Trace_on(2) and lamtrace(1) for more details.
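
       The two-switch logic can be written out as a small truth
       table.  The sketch below is only a model of the paragraph
       above; the function names are hypothetical, runtime calls
       are represented by their names as strings, and the state of
       the second switch when no trace option is given is an
       assumption (it does not matter, since the first switch is
       off in that case).

```python
def trace_switches(option=None):
    """Return (first, second) switch states set by mpirun.

    option is "-t", "-ton", "-toff", or None for no trace option.
    """
    if option in ("-t", "-ton"):
        return (True, True)
    if option == "-toff":
        return (True, False)
    return (False, False)  # ASSUMPTION for the second switch


def traces_generated(option=None, runtime_calls=()):
    """Are traces being generated after the given runtime calls?"""
    first, second = trace_switches(option)
    for call in runtime_calls:
        if call == "MPIL_Trace_on":
            second = True
        elif call == "MPIL_Trace_off":
            second = False
    # Both switches must be on for traces to be generated.
    return first and second
```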

   MPI Data Conversion
       LAM's  MPI library converts MPI messages from local repre­
       sentation to LAM representation upon sending them and then
       back to local representation upon receiving them.  In the
       case of a LAM consisting of a homogeneous network of
       machines where the local representation differs from the
       LAM representation, this can result in unnecessary
       conversions.  The -O switch can be used to indicate that
       the LAM is homogeneous and turn off data conversion.

   Direct MPI Communication
       For much improved performance but much  decreased  observ­
       ability,  the -c2c option directs LAM's MPI library to use
       the most direct underlying mechanism to  communicate  with
       other processes, rather than use the network message-pass­
       ing of  the  LAM  daemon.   Unreceived  messages  will  be
       buffered  in  the  destination  process instead of the LAM
       daemon.  MPI process and message monitoring  commands  and
       tools  will be much less effective, usually reporting run­
       ning processes and empty message queues.  Signal  delivery
       with doom(1) is unaffected.

   Guaranteed Envelope Resources
       By default, LAM will guarantee a minimum amount of message
       envelope buffering to each MPI process pair and  will  im­
       pede  or  report  an  error  to a process that attempts to
       overflow this system resource.  This robustness and debug­
       ging  feature  is implemented in a machine specific manner
       when direct communication (-c2c) is used.  For normal  LAM
       communication via the LAM daemon, a protocol is used.  The
       -nger option disables GER and the measures taken to
       support it.  See MPI(7) for details.


EXAMPLES

       mpirun N prog1
           Load  and  execute prog1 on all nodes.  Search for the
           executable file on each node.

       mpirun -c 8 prog1
           Run 8 copies of prog1 wherever LAM wants to run  them.

       mpirun n8-10 -v -nw -s n3 prog1 -- -q
           Load  and execute prog1 on nodes 8, 9, and 10.  Search
           for prog1 on node 3 and transfer it to the three  tar­
           get  nodes.   Report as each process is created.  Give
           "-q" as a command line to each new  process.   Do  not
           wait  for  the  processes  to  complete before exiting
           mpirun.

       mpirun -v myapp
           Parse the application schema,  myapp,  and  start  all
           processes  specified in it.  Report as each process is
           created.

       mpirun N N -pty -wd /workstuff/output -x DISPLAY
       run_app.csh
           Run the application "run_app.csh" (presumably a C
           shell script) twice on each node in the system (ideal
           for 2-way SMPs).  Also enable pseudo-tty support,
           change directory to /workstuff/output, and export the
           DISPLAY variable to the new processes (perhaps the
           shell script will invoke an X application such as xv
           to display output).

       mpirun -np 5 -D `pwd`/my_application
           A common usage of mpirun in environments where a
           filesystem is shared between all nodes in the
           multicomputer.  Using the shell-escaped "pwd" command
           specifies the full name of the executable to run.
           This avoids the need to put the directory in the
           path; the remote nodes will have an absolute filename
           to execute (and, because of -D, will change to the
           executable's directory upon invocation).


DIAGNOSTICS

       mpirun: Exec format error
           A non-ASCII character was detected in the  application
           schema.   This  is  usually a command line usage error
           where mpirun is expecting an application schema and an
           executable file was given.

       mpirun: syntax error in application schema, line XXX
           The application schema cannot be parsed because of a
           usage or syntax error on the given line in the file.

       <filename>: No such file or directory
           This error can occur in two cases.  Either the named
           file  cannot  be  located or it has been found but the
           user does not have sufficient permissions  to  execute
           the program or read the application schema.


SEE ALSO

       mpimsg(1),     mpitask(1),     lamexec(1),    lamtrace(1),
       MPIL_Trace_on(2), loadgo(1)