qmanager.doc: Description of the unix based TRANSP queue manager.
==================================================================

Contents:

  0. Introduction.
  1. Common File System.
  2. The Run Queue.
  3. User Interface.
     a) Command Interface
     b) GUI
  4. Configuration Database.
  5. The Queue Server.
  6. The Compute Servers.
  7. The "master daemon".

Appendices:

  A. Automated Code Maintenance
  B. Troubleshooting Hints
  C. Miscellaneous Notes
  D. Possible Future Development

0. Introduction.
=================

The `qmanager' system provides an efficient mechanism for sharing N TRANSP
compute server machines amongst M TRANSP users.  The qmanager is a
fundamental component of a unix based multi-user TRANSP run production
system.  Its use in a single user TRANSP code development system is
optional, but may prove convenient.

The system can be thought of as consisting of:

  * Users: create input data, generate run requests, examine results.

  * Queue Server: gathers and processes user run requests, sending runs in
    an orderly way to compute servers as these become available.

  * Compute Servers: the machines where the TRANSP runs are actually
    carried out.

A more detailed list of qmanager system components follows:

1. Common File System: the queue server, compute servers, and user
   machines need to share a common file system, with consistent naming of
   directories, in order for the communications between machines required
   for orderly queueing of runs to function properly.

2. The Run Queue: A directory, writable by all users, where run queue
   requests are posted.  The TRANSP system software also uses this
   directory to post changes to the status of runs, as these occur.  Runs
   are generally executed in the order queued, subject to availability of
   compute servers.  Users can jump the queue by asserting a high priority
   for their runs, but they must post an explanation of their action.  Run
   ownership is tracked, but beyond this security is minimal, as an honor
   system is presumed to exist amongst the TRANSP users.

3. User Interface and user machines: A set of user interface commands
   allows users to view the status of all jobs in the TRANSP queue, to add
   runs to the queue, to remove runs from the queue, to examine partial
   output from an executing run, and to terminate an executing run.  Users
   can use "staging directories" on their own machines, provided the queue
   server can have read access to the staging directory (to copy out the
   namelist when the time comes to execute the user's run).  Typically,
   each user sets up namelists and input data in a staging directory
   chosen by the user.  The user posts a run request when the input data
   is fully prepared, by means of qmanager's `enqueue' command.  All user
   interface functions can also be accessed via the Tcl/Tk based GUI
   application `xlauncher'.  To find xlauncher, add $CODESYSDIR/wish to
   the user's PATH environment variable.

4. Configuration Database: A set of data files which characterize the
   available TRANSP compute server machines.  The database contains such
   information as a ranking of the compute servers by speed, the number of
   simultaneously executing TRANSP runs allowed on each machine, a list of
   machines that are off-line e.g. due to hardware trouble, and access
   control data by which certain machines can be reserved to certain
   users.  The configuration database is maintained by a "TRANSP system
   administrator" and is not normally directly accessible to users.
5. The Queue Server: One machine (which may or may not also be one of the
   compute servers), with access to the run queue and the configuration
   database, serves the queue by actually dispatching runs and related
   requests to the compute server machines.  The queue server runs out of
   the `master daemon' job which is regularly and frequently scheduled
   (i.e. once every five minutes).

6. The Compute Servers: The machines where TRANSP jobs are actually
   executed.  The compute servers execute scripts which look for run
   requests and launch runs.  The run control scripts provide for the
   generation of files containing data which allow the progress of runs to
   be monitored.

7. The Master Daemon: A control script that must run on every machine that
   is to function as a queue server and/or a compute server and/or a build
   server that carries out code maintenance functions.  Traditionally,
   master.daemon is scheduled once every 5 minutes as a `cron' job.  The
   master.daemon job will automatically restart TRANSP jobs interrupted by
   a system crash or scheduled downtime.  Systems administration can rely
   on this and so reboot TRANSP machines more or less at will without
   concern about losing jobs.

Each of these components will be described in detail.

1. Common File System.
=======================

A common file system needs to exist in order for the queue manager
software to operate correctly in the task of carrying user requests from
user machines to the TRANSP compute server machines.  Although the TRANSP
system's directory structure is suggestive of ways to set up this common
file system, the actual task of producing the common file system
functionality will be up to the systems administration of the site where
TRANSP is installed for production use.  At PPPL, we use NFS for directory
sharing, but there are other options.

For those familiar with the standard set of environment variables used to
identify standard TRANSP system subdirectories, the following table may
help in the task of understanding which directories should be machine
local, and which directories should be shared, in an installed TRANSP
production system.  There are actually three categories to be considered:

  local        -- local to the machine or workstation in question
  shared       -- common directory shared by all machines
  architecture -- for binaries: either local to each machine, or,
                  preferably, shared amongst binary compatible machines.

  environment   translation
  variable      or link               description               category
  -------------------------------------------------------------------------
  TRANSPROOT    root directory        root directory            local
  TMPDIR        $TRANSPROOT/tmp       temporary files           local
  REQUESTDIR    $TRANSPROOT/request   server requests           local
  DBGDIR        $TRANSPROOT/debug     debug work area           local
  WORKDIR       $TRANSPROOT/work      root, work directories    local
  ......        $TRANSPROOT/daemon    daemon scripts dir.       shared
  DATADIR       $TRANSPROOT/data      input data root dir.      shared
  LOGDIR        $TRANSPROOT/log       log files root dir.       shared
  RESULTDIR     $TRANSPROOT/result    results data root dir.    shared
  CONFIGDIR     $TRANSPROOT/config    configuration data        shared
  CODESYSDIR    $TRANSPROOT/codesys   source code root dir.     shared
  QSHARE        $TRANSPROOT/qshare    user requests and run     shared
                                      state files [QSHARE is
                                      writable by all TRANSP
                                      users]
  LOCAL         $TRANSPROOT/local     code binaries             architecture
  SIGTABLDIR    $TRANSPROOT/sigtabl   atomic physics tables     architecture

Generally, the shared directories appear as links to NFS exported
filesystems on all machines except perhaps those machines to which the
shared directories are local.
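For illustration only, here is a minimal sketch of how a compute server
might be set up so that the shared directories appear as links under a
local $TRANSPROOT.  The NFS server name, mount point, and all paths below
are hypothetical; the actual layout is a site decision:

  #  Hypothetical sketch only -- not part of the distribution.
  #  (as root) mount the shared tree exported by the file server:
  #     mount nfshost:/export/pshare /mount/transp0/pshare
  #
  #  (as the TRANSP production account, csh syntax) link the shared and
  #  architecture directories into this machine's local $TRANSPROOT:
  setenv TRANSPROOT /usr/local/transp
  cd $TRANSPROOT
  mkdir tmp request debug work                     # "local" directories
  foreach d (daemon data log result config codesys qshare)
     ln -s /mount/transp0/pshare/$d $d             # "shared" directories
  end
  ln -s /mount/transp0/pshare/local_alpha  local   # "architecture" dirs
  ln -s /mount/transp0/pshare/sigtabl_alpha sigtabl

Whether symbolic links, automounter maps, or direct NFS mounts are used is
up to the site; the only requirement is that the resulting directory names
are consistent on all participating machines.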
The TRANSP queue server and compute servers must have all of the above
defined, with read/write access.  At PPPL we have invented a username
`pshare', under which all TRANSP production servers operate.  User
accounts need read access to all directories in the "shared" and
"architecture" categories.  In addition, write access to $QSHARE and to
$DATADIR subdirectories will be needed, and each user needs a writable
$TMPDIR, though it should probably be defined as something other than a
subdirectory of $TRANSPROOT.

The following paragraphs give a brief description of the various TRANSP
system directories.

The TRANSP source code, including shell scripts for the queue manager, is
installed under a root directory identified by the environment variable
$CODESYSDIR.  Users wishing to examine the TRANSP source code will need
read access.  The local configuration of the TRANSP production system (the
configuration database) will be defined by a directory indicated by the
environment variable $CONFIGDIR.  With these environment variables, the
user will be able to see:

  $CODESYSDIR/source...     (the TRANSP source code)
  $CODESYSDIR/qcsh          (queue manager shell scripts)
  $CONFIGDIR/TOKAMAK.DAT    (file containing list of known tokamaks)

and other files and directories.  (The $CODESYSDIR notation will be used
in this document; $CODESYSDIR indicates "the value of the environment
variable CODESYSDIR").

The queue server machine will use the "configuration database"
($CONFIGDIR) which defines the compute servers and their TRANSPROOTs.  The
configuration database will also specify the machines responsible for once
per night runs of make files to keep the various sets of code binaries
up-to-date under the various instances of $LOCAL (at least one per
architecture).

The TRANSP code writes its output to subdirectories of a root directory
for the TRANSP output directory tree, referred to as $RESULTDIR.  Users
have read access to $RESULTDIR so that they can see the results of
completed TRANSP runs -- not just their own runs but all users' runs.  The
subdirectories of $RESULTDIR are named with strings consisting of a known
tokamak id, a dot, and a two digit shot year code, for example:

  $RESULTDIR/D3D.97    (analysis of D3D 1997 shots)
  $RESULTDIR/TFTR.88   (analysis of TFTR 1988 shots).

If the production system is heavily used, it is likely that automated
archival and retrieval of TRANSP results will be needed (note that this
implies a need to keep track of all existing runids; see the description
of the ENQUEUE_EXIST procedure below, under the user interface section).

If the Ufiles system is used for providing input data to TRANSP, the input
data for TRANSP runs should be visible to all users under a common, shared
root directory $DATADIR.  (If MDS+ is used instead of Ufiles, data will be
accessed via a call to the MDS+ server, and the $DATADIR tree is no longer
needed.  However, the user must still supply the input data, and that is
the hard part!).  Under $DATADIR there should exist a subdirectory for
each tokamak, e.g.:

  $DATADIR/D3D     (root directory for D3D TRANSP input data)
  $DATADIR/TFTR    (root directory for TFTR TRANSP input data).

Under these roots the organization of the data into further subdirectories
is site-determined.  (Here are some strategies that have been used: a
separate subdirectory for each shot, a separate subdirectory for each
related group of shots, a separate subdirectory for each worker involved
in data preparation).
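As an illustration only, a site using a per-shot organization might end up
with a tree like the following (the shot numbers and grouping names are
hypothetical):

  $DATADIR/D3D/87937/          (Ufiles for D3D shot 87937)
  $DATADIR/D3D/90117/          (Ufiles for D3D shot 90117)
  $DATADIR/TFTR/sawtooth88/    (Ufiles for a related group of 1988 TFTR
                                shots)

The run's namelist then points at the chosen subdirectory via INPUTDIR, as
described next.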
The TRANSP namelist (TRDAT section) will set the INPUTDIR character
variable to the appropriate subdirectory under $DATADIR/<tokamak>.  If
INPUTDIR is defaulted, the tokamak root subdirectory itself is expected to
contain the Ufiles.  $DATADIR needs to be writable by all users, so that
users may put their TRANSP Ufile input data in place.  Subdirectories
might be owned by individual users.  Other organizations of input data
(not using $DATADIR) are also possible.

TRANSP executable binaries are stored in what TRANSP itself knows as
$LOCAL/exe where $LOCAL has a different value depending on the machine
architecture (i.e. one value of $LOCAL for DEC UNIX machines, another for
SUN, and another for HP).  Users will need to be able to find the TRANSP
executable binaries directory in order to be able to run useful programs
such as qmonitor and trdat.  The user should be able to see the correct
binaries from whichever machine he/she is using.  The user's PATH
environment variable should be set accordingly, to find these programs.

The log file output of TRANSP runs is stored in the directory
$LOGDIR/runlog where $LOGDIR points to the shared root directory for log
files, which should be readable by all TRANSP users.

Users will use "staging directories" of their own choosing for preparation
of TRANSP namelists.  These working directories could be chosen to be
subdirectories of $DATADIR, although normally only one user should be
using any given staging directory.  The user's staging directory must be
readable by the queue server machine, as the queue server will need to
copy files out of the staging directory when it is time to submit the
user's run.

The work directories of the TRANSP compute server must be visible and
writable from the queue server machine, as the queue server will need to
copy in the namelist files and write a file instructing the compute server
to carry out the user's run.  The run's temporary files will also be
written in the compute server's work directories, in the course of normal
run execution.

And, finally, the queue server, compute server, and users will need to
share write access to the directory indicated by the value of the
environment variable $QSHARE where the run queue itself is stored, as a
set of files.  $QSHARE will contain files with information on the state of
runs, i.e. queued, running, aborted, or successfully completed, as
described in more detail in the next section.

The following sequence summarizes use of this filesystem as a series of
steps for creation of a TRANSP run under the shared queue manager system:

1. User prepares input data.  Ufiles are placed in a selected subdirectory
   of $DATADIR.  The user places the namelist in a "staging directory".

2. From the staging directory, the user executes a command to enqueue a
   TRANSP run.  This causes a file to be written in $QSHARE which will be
   interpreted by the queue server as a run request.  The file includes
   the path back to the user's staging directory, a date-time stamp, a
   priority specification, and optional run parameters.  A file is also
   written giving the ownership of the run.

3. The queue server processes run requests found in $QSHARE, forwarding
   the requests (and copying files) to compute servers as these become
   available.  The run state changes from "queued" to "submitted".  When
   the compute server succeeds in starting a run, its state changes to
   "running".

4. The compute server carries out the run.
   When the run job terminates, the compute server modifies the state file
   in $QSHARE to indicate normal successful completion, or abnormal
   termination, of the run.  On normal successful completion, the run
   output files are copied to the appropriate $RESULTDIR subdirectory, the
   log file is copied to the $LOGDIR/runlog directory, and the compute
   server's workspace is cleaned up, readying the server for the next run.
   In the case of an abnormal termination, all files are left in place to
   allow expert diagnosis and debugging of the failure of automatic
   processing.  When a run terminates (successfully or with problems),
   Email is sent to the run's owner.

5. Within 24 hours of successful completion, the queue server removes the
   residual run files from $QSHARE.

6. The user can monitor run progress with the `qmonitor' program.

2. The Run Queue.
==================

The shared run queue is implemented as a collection of files in the shared
writable directory $QSHARE.  The general form of filenames in $QSHARE is:

  <runid>_<tok>.<type>

where:

  <runid> is a TRANSP runid, such as 94388A07,
  <tok>   is a valid tokamak id, such as D3D or TFTR, and
  <type>  indicates the filetype; each filetype has a particular meaning
          to the queue manager software.

The supported filetypes are:

  .owner    -- specifies the owner of the run.
               contains: owner name

  .queue    -- indicates a requested run, and contains details thereon.
               contains: staging directory path
                         date-time queued
                         priority
                         additional parameters (for queue server or
                         compute server).

  .submit   -- run has been submitted to a compute server, but is not yet
               executing.
               contains: (same as .active file)

  .active   -- run is currently executing, compute server is specified.
               contains: compute server name
                         date-time started

  .stopped  -- run has been stopped (by user or by abnormal termination),
               file contents specify the compute server.
               contains: compute server name
                         date-time halted
                         "user" or "system"; "user" if user requested halt

  .success  -- run has completed successfully.
               contains: compute server name
                         date-time completed

  .look     -- request for an advance peek at output of an incomplete run.
               contains: name of person requesting the peek.
                         dates and processing information

  .halt     -- request to (prematurely) halt a currently executing run.
               contents: dates and processing information

  .archive  -- request to (prematurely) halt and archive an executing run.
               contents: dates and processing information

  .cancel   -- request to (prematurely) halt and remove an executing run.
               contents: dates and processing information

The queue server can submit a run to a compute server, changing the run's
state from queued to active.  The compute server can post here that a run
has halted abnormally, or completed successfully.  The queue server will
eventually remove files associated with a successfully completed run from
the $QSHARE directory, so that $QSHARE will mostly contain information
about requested or currently active runs.  The files themselves contain
small amounts of data, as needed.  For example, the .queue file contains
the path to the user's staging directory, the date, time and priority
setting associated with the run request, and additional optional
parameters.
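For illustration, a listing of $QSHARE at some instant might look roughly
like the following (the runids, tokamak ids, and states are hypothetical):

  % ls $QSHARE
  12345A07_D3D.owner    12345A07_D3D.queue     (a run waiting in the queue)
  88123B02_TFTR.owner   88123B02_TFTR.active   (a run now executing)
  90117A01_D3D.owner    90117A01_D3D.success   (completed; to be cleaned up
                                                by the queue server within
                                                about a day)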
3. User Interface.
===================

Creating a TRANSP run requires completion of three tasks:

  * creation of a namelist
  * creation of input data (Ufiles or MDS+)
  * submission of a run request.

The user creates the namelist in a work directory which is accessible to
TRANSP's queue server.  The namelist contains information specifying MDS+
data input, or pointing to the directory containing the Ufiles.  The
Ufiles directory must be visible from all TRANSP compute servers.  The
user might choose to create the namelist in the same directory with the
Ufiles.

Creation of namelist and Ufiles input is covered elsewhere in the TRANSP
documentation.  What the user needs to be aware of for purposes of using
the queue manager is that (1) the namelist file needs to be placed in a
directory accessible to the TRANSP queue server machine, and (2) the Ufile
data needs to be in a directory accessible to any TRANSP compute server
machine.  These requirements are covered in the section on the shared
filesystem, above.  Operating within these parameters, the user can use
the `xtranspin' program as a graphical interface to the namelist; the
`trdat' program can be used to check the namelist for errors, and to
re-examine the input data.

The following section describes a command interface which allows users to
access the necessary functionality.  A Tcl/Tk GUI application `xlauncher'
can also be used; it is described in the subsequent section.

Command Oriented User Interface:
--------------------------------

The queue manager's user interface is concerned with the submission of an
actual TRANSP run request, once the TRANSP namelist and input data are
ready.  The interface commands are summarized as follows:

notation:
  <runid>    -- run id string, e.g. 94276A16
  [args...]  -- optional arguments (explained below).

command summary:

  qmonitor -- show the status of the TRANSP run queue and/or list the
       available TRANSP compute servers.  This is an interactive program.
       Arguments are interpreted as "type-ahead" input, so that
       non-interactive forms such as "qmonitor rq" and "qmonitor lq" can
       be used.

  enqueue <runid> [args...] -- enqueue a TRANSP run, specifying the
       tokamak-shot-year string and giving hand entered comments on the
       run.
       -- user's current working directory must contain the run's
          namelist (but see notes on ENQDIR environment variable, below).

  requeue <runid> [args...] -- requeue a TRANSP run (tokamak-shot-year and
       run comments have already been given and are not to be changed).
       -- `requeue <runid> lrs' relinks and restarts an aborted run;
          `requeue <runid> rs' restarts an aborted run (no relink); this
          usually only causes the run's abort state to recur, but the
          "lrs" option can be useful if a code repair has been done.
       -- `requeue' commands must be issued from the same working
          directory from which the original `enqueue' was issued
          (but see notes on ENQDIR environment variable, below).

  dequeue <runid> -- remove a run from the queue (before it starts
       executing).

  tr_look <runid> -- produce a scratch set of run output, allowing a peek
       at run output prior to normal completion of the run.  The scratch
       output dataset is placed in the directory $RESULTDIR/scratch.  The
       tr_look command can be issued by a non-owner of <runid>.

The following commands are available for error recovery, and should be
used with great care, as they are irreversible.  These can only be used by
<runid>'s owner:

  tr_cancel <runid> -- delete a run from the queue (after it has started
       executing).  This is usually done to correct a mistake; the run may
       have crashed.  The run is halted, if necessary, and all
       run-specific files are removed from the queue server and the
       compute server; the compute server is freed for processing the next
       queued run request.

  tr_archive <runid> -- archive a run, after execution has started but
       before normal completion.
       This is usually done to archive a partial run that crashed before
       reaching the end of the simulation.  If issued for a run that is
       still executing, the run is halted first.  Run files are removed
       from the compute server, freeing it to process the next queued run
       request.

  tr_halt <runid> -- force an abnormal termination of a run, but do not
       archive or delete its files.  The run files continue to take up
       space on the compute server, which may prevent that server from
       being available for a subsequent run.  The run can be restarted
       later.

Many commands result in processing that can lead to the generation of
Email.  The user receives Email whenever any of the following events
occur:

  -> a user's run completes successfully.
  -> a user's run terminates abnormally.
  -> a user's tr_look request is completed.
  -> a user's tr_cancel, tr_archive, or tr_halt request is completed.

If a user request "vanishes without a trace", this indicates a problem
(see the appendix on troubleshooting).

The `enqueue' command is interactive -- the user will be expected to
supply a tokamak-shot-year code identifying the run's destination output
directory (e.g. "D3D.97" for D3D runs based on 1997 shots).  Also, the
user will be placed into an editor session in which comments describing
the run may be entered.  Environment variables affect the `enqueue'
interaction, as described below.

The `enqueue' and `requeue' commands accept additional arguments, which
will be processed by the enqueue/requeue script, or passed on to the queue
manager.  Any arguments passed to the queue manager but not processed
there are passed on to the TRANSP job itself.

Arguments processed by enqueue/requeue:
---------------------------------------

  priority <n>   (n a number between 1 and 8).

     This asserts a priority number for the job being enqueued.  Jobs with
     higher priority numbers are processed first.  However, to request a
     priority greater than the default (priority = 5), the user must
     supply comments justifying the high priority request; these comments
     are separate from the comments describing the run itself, and they
     are readable by other users using an option of the `qmonitor'
     program.

Arguments processed by the queue manager:
-----------------------------------------

  <compute server name>
  top <m>        (m a number between 1 and the number of compute servers)

     The user can request the job to be queued to the named compute server
     (only), when that machine becomes available.  Alternatively, the user
     may request that only one of the "m top (i.e. fastest) machines" be
     used to service the run.  These options restrict the choice of
     compute server, but do not otherwise give the run any priority
     advantage in the queueing process.

Arguments processed by the TRANSP run script:
---------------------------------------------

  Although there are several arguments understood by the TRANSP run
  script, these are not normally set by the user.  The only option that is
  sometimes used is:

  lrs   (load and restart)

     which can sometimes be used to restart an aborted run, after a bugfix
     has been installed into the TRANSP source code.  The name of the
     compute server on which the aborted run resides (visible with
     qmonitor) must also be given.

Examples:

  enqueue 12345A07 priority 7 top 2
     -- enqueue the run at raised priority, to run only on one of the two
        fastest compute servers.  The user will be required to give
        comments justifying the heightened run priority.

  requeue 12345A07
     -- requeue the run at normal priority, eligible for any compute
        server.
  requeue 12345A07 lrs
     -- requeue the run for load and restart on the same machine where the
        run was originally started.

Environment Variables.
----------------------

Environment variables need to be set in order to allow the user to access
the queue manager, and to customize its behaviour.

To access the queue manager user interface, the user's PATH needs to be
modified.  The interface consists mostly of executable shell scripts.  It
is made available by placing the TRANSP directory $CODESYSDIR/qcsh in the
user's PATH environment variable.  If the user's PATH includes
$CODESYSDIR/csh, the qcsh directory should come first, so as not to access
a more primitive `enqueue' command implemented for standalone TRANSP
running without a queue manager.

In order for the queue manager to function properly, all users need to be
able to access a shared writable directory, as indicated by the
environment variable QSHARE, i.e.

  QSHARE = <path to the shared writable queue directory>.

As this resource must be shared by all TRANSP users at a given site, it
might be appropriate to set this up at the system level.

In order for the qmonitor program to function properly, it needs to be
able to find the queue manager configuration database.  This should be set
in the environment variable CONFIGDIR, i.e.

  CONFIGDIR = <path to the configuration database directory>.

So, the user sees the same files under $CONFIGDIR as the TRANSP compute
servers and queue server.

The following is useful for preventing a situation where a user or
multiple users create two TRANSP runs with the same runid:  It is
generally advisable to set up a site-specific procedure for determining,
given <runid> and <tokamak.yy>, whether the named <runid> has already been
used.  The idea is to protect the user from inadvertently reusing a
<runid> which another user might already have chosen for a run of his own.
The default procedure simply looks in the appropriate $RESULTDIR
subdirectory for the named run.  This would not be sufficient at a site
where runs are tracked and archived off-line.  In this case, the site
needs to invent a procedure which accepts the arguments <runid>
<tokamak.yy> and returns a status code TRUE (1) if the run already exists,
and FALSE (0) otherwise.  The name of the procedure is set in the user
environment variable ENQUEUE_EXIST.  For example, if the shared executable
script /usr/trsys/shared/dupcheck were set up for this purpose, the
environment variable

  ENQUEUE_EXIST = /usr/trsys/shared/dupcheck

would be set.  Then, if the user performed an `enqueue 12345A99' and
specified results output subdirectory (tokamak-shot-year) "D3D.97", then
the `dupcheck' script would be activated with arguments "12345A99 D3D.97".

If the user always processes TRANSP runs on the same tokamak device, the
environment variable ENQUEUE_TOKYY can be set:

  ENQUEUE_TOKYY = <tokamak>      (e.g. "D3D")

or even

  ENQUEUE_TOKYY = <tokamak.yy>   (e.g. "D3D.97")

which serves as a hint to the enqueue script.  If this environment
variable is not set, the enqueue script will use (a) a prior enqueue-ing
of the same runid, or (b) the current work directory string, by looking
for subdirectory names containing a valid tokamak-id.  The valid
tokamak-id strings are as given in the first column of
$CONFIGDIR/TOKAMAK.DAT.

Users are expected to use a staging directory (or multiple staging
directories) for preparation of namelists.  The `enqueue' and `requeue'
commands normally have to be issued with the user's current working
directory already set to this staging directory.  However, if the user
defines the environment variable

  ENQDIR = <path to the user's staging directory>

Then, as a convenience, `enqueue' and `requeue' will cd to this directory
first.
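To illustrate, a user's csh startup file might contain lines like the
following.  The directory paths shown are hypothetical and site-specific;
only the variable names themselves belong to the qmanager interface
described above:

  # Hypothetical user setup (csh); substitute your site's actual paths.
  setenv CODESYSDIR /mount/transp0/pshare/codesys
  setenv CONFIGDIR  /mount/transp0/pshare/config
  setenv QSHARE     /mount/transp0/pshare/qshare
  set path = ($CODESYSDIR/qcsh $CODESYSDIR/wish $path)  # qcsh ahead of csh
  setenv ENQUEUE_TOKYY D3D.97            # optional hint for enqueue
  setenv ENQDIR /u/harry/transp_staging  # optional: single staging dir

Only the PATH, QSHARE, and CONFIGDIR settings are required; the remaining
variables merely customize the behaviour of `enqueue' as described in this
section.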
However, if the user employs multiple staging directories then he/she
should not define ENQDIR, and must cd to the desired staging directory,
containing the run's namelist, prior to giving an `enqueue' or `requeue'
command.

Finally, the user can define which editor (e.g. "emacs" or "vi") to use
when comments are required.  The enqueue script follows the following
procedure:

  (1) if the ENQUEUE_EDITOR environment variable is defined, use this as
      the editor command, otherwise
  (2) if the EDITOR environment variable is defined, use this, otherwise
  (3) use "vi".

Graphical User Interface:
-------------------------

(New 29 Jan 1998).  A new directory, $CODESYSDIR/wish, has been added to
the TRANSP source code distribution.  This contains an executable Tcl/Tk
script, `xlauncher', which provides a GUI that allows users point and
click access to the queue manager commands just described.

A simple setup procedure is required to make effective use of `xlauncher':

  1) Add $CODESYSDIR/wish to the PATH environment variable, creating the
     xlauncher command.

  2) Identify the paths to one or more staging directories to be used for
     preparation of TRANSP runs.  The user should create at least one
     staging directory, or perhaps one staging directory "per project".

  3) Run `xlauncher' and use the options under the `definitions' menu to
     enter the necessary information on staging directories.  This will
     create commands under the `applications' menu which will take the
     user to the staging directory and then allow the user to select runs,
     modify namelists, examine data, enqueue runs, monitor runs, terminate
     runs, etc.

The GUI is built on top of the command interface, uses the same shell
scripts, and provides the same functionality as the command interface, but
with a point and click control interface.

4. Configuration Database.
===========================

An interactive program, `configdb', can be used by the TRANSP "system
administrator" (not an ordinary user) to modify the contents of the TRANSP
system's configuration database, which contains such information as:

  * list of known tokamak-ids.

  * a list of compute servers in descending order of speed, giving each
    server's TRANSPROOT, run-capacity (# of simultaneously executing
    runs), and disk-capacity (# of simultaneously executing runs + aborted
    runs).

  * a list of compute servers that are off-line temporarily due to
    hardware trouble.

  * a list of access restrictions placed on compute servers.

  * the queue server node, where the queue manager runs.  Also the queue
    server runs TRANSP's "makefile generator" once per 24 hour period.

  * the list of nodes where nightly "builds" are run, exactly one node for
    each instance of LOCAL, as described in the section on the common
    filesystem.  The generated makefiles are executed, causing software to
    be recompiled and reloaded in response to source code changes.

The `configdb' program is meant to be self-explanatory.  However, it is
probably worth making some comments about compute server access
restriction options.

Compute servers can be reserved to an OR list of users or tokamak-id's, by
giving a list of names.  For example, reserving node HYDRA to names "jojo
D3D" allows runs by user jojo **or** any D3D run to execute on HYDRA.
Alternatively, certain users or tokamak-id's can be denied access to a
compute server.  The reservation string "~TFTR ~cpuhog" for compute server
USCWS3 states that anyone except "cpuhog" can make a TRANSP run on server
USCWS3, as long as it is not a TFTR run.
Finally, these "reservation" and "denial" clauses can be mixed: the string
"D3D PBXM ~cpuhog" would require the run to be either a D3D or PBXM run,
and it would prohibit any access by user "cpuhog".  Some caution should be
exercised, since combinations may exist which end up denying access to all
users.

5. The Queue Server.
=====================

The `qserver' script and `qmanager' program carry out the servicing of the
run queues.  The qmanager program analyzes the queue and generates a
script which will copy the necessary information to the compute server
request area.  The copied files represent requests to the compute servers;
the following types of requests are possible:

  $REQUESTDIR/<runid>_<tok>.queue   -- execute a TRANSP run
  $REQUESTDIR/<runid>_<tok>.look    -- "trlook" a TRANSP run, i.e. put in
                                       $RESULTDIR/scratch an output
                                       dataset for a TRANSP run that has
                                       not yet completed execution
  $REQUESTDIR/<runid>_<tok>.halt    -- halt (abort) a TRANSP run
  $REQUESTDIR/<runid>_<tok>.archive -- abort and archive a TRANSP run
  $REQUESTDIR/<runid>_<tok>.cancel  -- abort and delete a TRANSP run

The compute server script `cserver' processes these requests; from there,
the compute server takes over (see the next section).

The queue server will also schedule nightly code maintenance jobs.  By
default, these jobs run at 1 am every night.  To select a different hour,
define the BUILDHOUR environment variable at login time, e.g.

  setenv BUILDHOUR 4

to specify that code maintenance jobs should run at 4 am.  Code
maintenance "build" jobs are controlled from the `bserver' script.  For
further details see appendix A.

The queue server also carries out "cleanup" actions, to remove from
$QSHARE information on runs that completed on the previous day or earlier.
The $LOGDIR directory tree is also subject to "cleanup" actions.  By
default, run log files that are older than 30 days are deleted.  To change
the number of days, set the environment variable LOGFILE_RESIDENCY, e.g.

  setenv LOGFILE_RESIDENCY 14

to specify a policy to remove files older than 14 days.

The scripts `qserver', `bserver', and `cserver' will not run automatically
unless the "master daemon" script is scheduled for the system cron daemon.
See section 7, below.

6. Compute Servers.
====================

The compute servers have responsibility for actual execution of user
requests: normal run requests, tr_look requests, tr_archive requests,
tr_halt requests, and tr_cancel requests.  These services are implemented
by periodic execution of the `cserver' script under the "master daemon".
The `cserver' script should *only* be run out of master.daemon, which
takes responsibility for preventing multiple simultaneous executions of
the script (which would have unpredictable consequences).

The `cserver' script goes to considerable lengths to notify users by Email
if problems are detected.  For example, if cserver tries to start a queued
run which contains an error in the namelist or input data such that the
run cannot be started, cserver will dequeue the run and send an Email to
the user.  Similarly, cserver will notify the user if it detects
contradictory requests, e.g. a simultaneous tr_archive and tr_cancel
request on a given run (the tr_archive request wins).  If a user request
"disappears without a trace" (i.e. it is not executed and no Email is
generated) this indicates some kind of system problem.  See the appendix
on troubleshooting.

7. The "master daemon".
========================

The queue server, compute server, and build server scripts are all run out
of the shared script $TRANSPROOT/daemon/master.daemon.  This script
contains "locking mechanisms" to prevent multiple parallel execution of
scripts that could e.g. lead to redundant launchings of a TRANSP run.  At
the same time, the script also checks against "hanging locks" that can be
created on the occasion of system crashes.  These mechanisms are thought
to be reasonably reliable, as they have been in use in the PPPL TRANSP run
production system for several years.  The master.daemon also runs a
"reboot daemon" which will restart TRANSP runs interrupted by a system
crash.
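The locking code itself lives in the distributed master.daemon script.
Purely as an illustration of the general technique (and not the actual
implementation), a cron-driven sh script can serialize itself and recover
from stale locks along these lines; the lock path and details below are
hypothetical:

  #!/bin/sh
  # Illustrative sketch only -- NOT the actual master.daemon source.
  LOCK=$TRANSPROOT/tmp/daemon.lock       # hypothetical lock location

  if mkdir "$LOCK" 2>/dev/null; then
      echo $$ > "$LOCK/pid"              # lock acquired; record our pid
  else
      pid=`cat "$LOCK/pid" 2>/dev/null`
      if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
          exit 0                         # a previous instance still runs
      fi
      rm -rf "$LOCK"                     # stale "hanging lock" from a crash
      mkdir "$LOCK" || exit 1
      echo $$ > "$LOCK/pid"
  fi

  # ... qserver / cserver / bserver work would go here ...

  rm -rf "$LOCK"                         # release the lock on normal exit

master.daemon applies this kind of protection around the qserver, cserver,
and bserver invocations described below.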
Use the `crontab' command to schedule master.daemon for regular execution.
This should be done on *all* machines being used as TRANSP servers.  The
`crontab' command creates crontab table entries; each table entry contains
scheduling information and an sh command specifying the scheduled action.
Here is a typical master.daemon crontab entry at PPPL:

  0,5,10,15,20,25,30,35,40,45,50,55 * * * * /mount/transp0/pshare/daemon/master.daemon >/mount/transp0/pshare/work/log/master.cron.log

This schedules master.daemon to run once every 5 minutes.

The more important scripts executed out of the master.daemon are:

  restart_check      -- restart TRANSP runs (i.e. after a system crash).
  cserver interrupt  -- carry out tr_look, tr_halt, etc.
  bserver setup      -- misc. build server related functions.
  bserver execute    -- run makefile generator; run makefiles.
  cserver start      -- launch transp runs.
  qserver            -- (queue server node only) service the queue.

Note: master.daemon is launched by the system cron daemon, out of `sh'.
The transp account's .login script is **not** automatically source'd by
cron.  Therefore, master.daemon supports various ways to define its
environment.  The proper method of definition will vary from site to site.
For further information, see the first few lines of the
$TRANSPROOT/daemon/master.daemon script, or seek help (Email
dmccune@pppl.gov).

A. Automated Code Maintenance.
===============================

Automated code maintenance eases the installation burden, if local code
changes are made.  Basically, the code's makefiles get checked,
regenerated if necessary, and executed once every 24 hours.  This means
that if a change is installed in the common source code, but built in and
tested only on compute server `A', then the change will also be built in
on all other compute servers within 24 hours.  The queue server will run
the makefile generator and then request makefile execution on each "build
server" defined in the configuration database.  Each build server has
responsibility for a separate set of code binaries.

B. Troubleshooting Hints.
==========================

The basic method of troubleshooting amounts to examination of log files.
Message output from scripts is routed to log files as follows:

general:

  $LOGDIR/<node>_error.log
     ... "unusual" messages from the build server or compute server
         scripts

compute server (general):

  $LOGDIR/<node>_master.log
     ... record of each TRANSP job launched on <node>.

  $LOGDIR/<node>_reboot.log
     ... record of runs restarted (i.e. after reboot) on <node>.

  $LOGDIR/<node>_archive.log
     ... tr_archive requests processed on <node>.

  $LOGDIR/<node>_cancel.log
     ... tr_cancel requests processed on <node>.

  $LOGDIR/<node>_halt.log
     ... tr_halt requests processed on <node>.  This includes "implied"
         requests due to tr_archive or tr_cancel.

  $LOGDIR/<node>_trlook.log
     ... tr_look requests processed on <node>.  Output file translator job
         messages are included here.

compute server (specific runs):

  $LOGDIR/runlog/<runid>_trdat.log
     ... logfile of preprocessing (trdat, link) for run <runid>.

  $LOGDIR/runlog/<runid>_tr.log
     ... logfile of TRANSP execution and post-processing for run <runid>.
  (note that the queue server removes these files after 30 days, or after
  the number of days indicated in the environment variable
  LOGFILE_RESIDENCY).

build server:

  $LOGDIR/<node>_chktok.log
     ... log of jobs to check work subdirectories against list of
         tokamaks.

  $LOGDIR/<arch>/<node>_checkmake.log
     ... log of checkmake job on most recent instance of <arch>.
         checkmake looks for changes in the source code and reruns the
         makefile generator on all components containing changes.

  $LOGDIR/<arch>/<node>_build.log
     ... log of build job on most recent instance of <arch>, on the
         indicated <node> which matches the <node> on which checkmake was
         run.

  $LOGDIR/<arch>/<node>_build2.log
     ... log of build job on most recent instance of <arch>, on the
         indicated <node> which is different from the <node> on which
         checkmake was run.

Errors detected during build jobs are compressed out of the log files
(using the errfilter program) and sent as Email to the address indicated
in the file $CONFIGDIR/csh_mail.address, or, if this file is not found,
the address used is the output of the `whoami` command on the build
server.

queue server:

  $LOGDIR/qserver.log
     ... "unusual" messages from the queue server.

  $LOGDIR/qlog/<month>.log
     ... contains summary qmanager output for the indicated month (most
         recent year only).  A date-time stamp is output each time the
         qmanager program is run.  If in addition the qmanager decides on
         action (i.e. to submit a run) or detects an abnormal condition
         (e.g. a compute server is inaccessible), these actions or
         conditions are reported.

  $LOGDIR/qlog/thisweek/<day>.log
  $LOGDIR/qlog/lastweek/<day>.log
     ... more detailed qmanager outputs on a day by day basis; only the
         current week and preceding week's output are retained.

Suggestions for improving the logging system?  Email dmccune@pppl.gov
(dmc 19 Dec 1997).

C. Miscellaneous Notes...
==========================

1. The qcsh scripts assume an ability to send Email to users, using only
   the output of the `whoami` command as an address.  It is up to local
   system administration to assure that such Email gets sent to the user
   in an appropriate way.  That is, if the queue server wants to send user
   "harry" an Email, this should not arrive in a local Email file on the
   queue server machine, which "harry" will never check.  Email should be
   routed to a central mail server and forwarded in a sensible way.

2. The TRANSP compute server machines also need to be able to send mail to
   users knowing only their username.  The qcsh scripts record the output
   of the `whoami` command in the ".owner" file for each run; this data is
   subsequently used by the TRANSP queue server and compute server
   machines.  Email will be sent in case of error, and when the TRANSP run
   is completed.

3. Email will also be sent when an error occurs during execution of a
   TRANSP build script.  This Email is sent to the address contained in
   the file $CONFIGDIR/csh_mail.address.  If this file is not found, the
   output of `whoami` in the TRANSP account (the account running the
   master.daemon) is used for the address.

D. Possible Future Development.
================================

The queue management system creates a superstructure that allows for
further automation of TRANSP related tasks.  An example of a task that
could be automated (but has not been as of 19 Dec 1997) is the regular
scheduling of machines to be "on-line" or "off-line" for TRANSP batch
processing, depending on the time-of-day and/or day-of-the-week.  Jobs
executing when the machine goes off-line could be suspended, to be
restarted when the machine goes back on-line.

Additional automation ideas are welcome.  Send Email to dmccune@pppl.gov.