lamboot - Start a LAM multicomputer.
SYNTAX
lamboot [-dhvxH] [<bhost>]
OPTIONS
-d Turn on debugging output. This implies -v.
-h Print the command help menu.
-v Be verbose.
-x Run in fault tolerant mode.
-H Do not display the command header.
DESCRIPTION
The lamboot tool starts the LAM software on each of the
machines specified in the boot schema, <bhost>. The user
may wish to first run the recon(1) tool to verify that LAM
can be started.
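The check-then-boot sequence suggested above can be sketched
for a Bourne-style shell as follows. The function name
boot_checked is hypothetical, and the sketch assumes that
recon(1) exits with a nonzero status when its check fails.

```shell
# Hypothetical helper: boot LAM only if recon reports success.
# Assumption: recon exits nonzero when LAM cannot be started.
boot_checked() {
    # "$@" forwards an optional boot schema argument; with no
    # argument, both tools use their default boot schema lookup.
    if recon "$@"; then
        lamboot "$@"
    else
        echo "recon failed; not running lamboot" >&2
        return 1
    fi
}
```

A call such as "boot_checked mynodes" would then run lamboot
only after recon has succeeded on the same boot schema.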
Starting LAM is a three-step procedure. In the first
step, hboot(1) is invoked on each of the specified
machines. In the second step, each machine allocates a
dynamic port and communicates it back to lamboot, which
collects them. In the third step, lamboot gives each
machine the list of machines/ports in order to form a
fully connected topology. If any machine was not able to
start, or if a timeout period expires before the first
step completes, lamboot invokes wipe(1) to terminate LAM
and reports the error.
The remote shell program that is used to invoke commands
on remote hosts is set when LAM is configured. It is
typically rsh, but can be set to any value by the person
who configured and compiled LAM. This program can be
overridden at lamboot invocation time by setting the
LAMRSH environment variable to a suitable remote shell
program. For example:
setenv LAMRSH "ssh -x"
This will force LAM to use the "ssh" client to invoke
programs on remote nodes, and ensure that "ssh" uses the -x
command line flag (to suppress the ssh 1.x client series
standard information banner that is normally output to the
standard error, which would cause lamboot to fail).
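The setenv line above is csh syntax. For Bourne-compatible
shells (sh, bash, ksh), an equivalent setting would be:

```shell
# Bourne-shell equivalent of: setenv LAMRSH "ssh -x"
LAMRSH="ssh -x"
export LAMRSH
```

The variable must be exported so that it is visible to
lamboot when it is started from the same shell.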
The <bhost> file is a LAM boot schema written in the host
file syntax. See bhost(5). If no boot schema is given on
the command line, it is taken from the LAMBHOST
environment variable. If that variable is not set either,
the default file, bhost.def, is used.
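The lookup order described above can be illustrated with a
small Bourne-shell sketch. The function name resolve_bhost
is hypothetical and is not part of LAM; the sketch only
mirrors the precedence (command line, then LAMBHOST, then
the default file listed under FILES), and the way lamboot
itself searches for the file may differ.

```shell
# Hypothetical helper mirroring the documented precedence:
# command-line argument, then $LAMBHOST, then bhost.def.
resolve_bhost() {
    if [ -n "${1:-}" ]; then
        # A boot schema named on the command line wins.
        echo "$1"
    elif [ -n "${LAMBHOST:-}" ]; then
        # Otherwise fall back to the environment variable.
        echo "$LAMBHOST"
    else
        # Default location as listed under FILES.
        echo "${LAMHOME:-}/boot/bhost.def"
    fi
}
```

For instance, "resolve_bhost mynodes" prints mynodes
regardless of whether LAMBHOST is set.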
In addition, lamboot uses a process schema for the
individual LAM nodes. A process schema (see conf(5)) is a
description of the processes which constitute the
operating system on a node. In general, the system
administrator maintains this file. It is also possible for
the user to customize the LAM software with a private
process schema.
Fault Tolerance
If the -x option is given, LAM runs in fault tolerant
mode. In this mode, nodes exchange ``heart beat''
messages periodically to make sure all nodes are running
and the links connecting them are operational. When a
node's heart beats stop, it is declared ``dead'' and all
LAM nodes (and processes) are notified. This allows users
to write fault tolerant applications that can degrade
gracefully, or fully recover by replacing the defunct node
with another (see lamgrow(1)). Since this mode introduces
a performance penalty, it is not activated by default.
EXAMPLES
lamboot -v
Start LAM on the machines described in the default
boot schema. Report about important steps as they are
done.
lamboot mynodes
Start LAM on the machines described in the boot schema
mynodes. Operate silently.
FILES
$LAMHOME/boot/bhost.def default boot schema file
$LAMHOME/boot/conf.lam default process schema file for LAM nodes
SEE ALSO
recon(1), wipe(1), bhost(5), hboot(1), conf(5),
lam-helpfile(5)