<?xml version="1.0"?>

<!--
File: AthASK/doc/job.xml
Author: Wim Lavrijsen (LBNL, WLavrijsen@lbl.gov)
Created: 03/03/04
Last: 12/16/05
-->

<!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
"http://docbook.org/xml/4.4/docbookx.dtd" [

<!ENTITY % entities SYSTEM "AthASK.ent">
<!ENTITY % glossary SYSTEM "glossary/glossary.ent">

%entities;
%glossary;

]>

<sect1 id="sec-job">
<title>Simple jobs</title>

<para>
This section describes a way of running jobs on CERN &lsf;, using &athask;.
The scripts that are used are flexible enough to run all tutorials that are
provided in the appendices.
They do not, however, provide any job provenance, data management, or other
bookkeeping, and are as such unsuitable for real production: this text is
provided for educational purposes only.
</para>

<note><para>
Running a batch job is somewhat complicated at the moment, because access
to the repository is restricted (you cannot get a &kerberos; ticket unless
you are willing to put your password in a file, which isn't really an option).
Thus, if you need packages from the repository, job submission should start
with copying<footnote><para>A further complication is the fact that &cmt; uses
hard-wired absolute paths in its configuration files, which must be countered
by reconfiguring the package after it has been moved.</para></footnote>
over a package that was previously checked out.
</para></note>

<sect2>
<title>Preparation</title>

<para>
The following will assume that the batch machine shares the file system with
the interactive machine that you are using.
This is usually the case when you use &lsf; for batch submission, and work on
a machine such as lxplus that hosts your home directory on &afs;.
</para>

<para>
First, create the following five directories in a work area that is
reachable from the batch machine (for example, your home directory on &afs;):

<screen>
$~> mkdir batch
$~> mkdir scripts
$~> mkdir log
$~> mkdir output
$~> mkdir pool
</screen>

There is nothing special about these names, so you can change them.
However, if you do, make sure that you modify the names in all the scripts
that make use of these directories (the main script allows the location of the
above directories to be changed with a command line argument).
</para>
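
<para>
As a minimal sketch, the same five directories can also be created in one
step under a base directory of your choice.
The <varname>base</varname> variable is an assumption introduced here for
illustration; it falls back to $HOME, which is also the default job directory
of the submission script:
</para>

<programlisting language="shell"><![CDATA[
# hypothetical base directory: taken from JOBDIR if set, otherwise $HOME
base="${JOBDIR:-$HOME}"

# -p creates missing parent directories and does not complain if the
# directory already exists, so the loop can safely be run more than once
for d in batch scripts log output pool; do
    mkdir -p "$base/$d" || echo "could not create $base/$d" >&2
done
]]></programlisting>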

<para>
Then, write an &athask;-based submission script.
A basic example that has most of the needed functionality follows, but it is
recommended to write a script tailored to your own needs (if nothing else, at
least make sure that the directory locations are correct, especially if they
are not located in $HOME):

<programlisting language="shell">
<phrase class="comment">#! /usr/bin/env zsh</phrase>
<phrase class="keyword">local</phrase> jobdir=<phrase class="string">"$HOME"</phrase>
<phrase class="keyword">local</phrase> packages=<phrase class="string">""</phrase>

<phrase class="comment"># parse and verify options; set local variables accordingly</phrase>
<phrase class="keyword">while</phrase> getopts d:p:hH o ; <phrase class="keyword">do</phrase>
   <phrase class="keyword">case</phrase> <phrase class="string">"$o"</phrase> <phrase class="keyword">in</phrase>
      d) jobdir=<phrase class="string">"$OPTARG"</phrase> ;;
      p) packages=<phrase class="string">"$OPTARG"</phrase> ;;
      h|H|[?]) echo <phrase class="string">"Usage: $0 [-d jobdir] [-p packages] script"</phrase>
               exit 1 ;;
   <phrase class="keyword">esac</phrase>
<phrase class="keyword">done</phrase>
<phrase class="keyword">shift</phrase> $((OPTIND-1))

<phrase class="comment"># exactly one positional argument (the &athask; script) must remain</phrase>
<phrase class="keyword">if</phrase> test $# -ne 1 ; <phrase class="keyword">then</phrase>
   echo <phrase class="string">"This job requires exactly one &athask; script ... stop!"</phrase>
   exit 1
<phrase class="keyword">fi</phrase>

<phrase class="comment"># environment variables used in this and the &athask; script</phrase>
<phrase class="keyword">export</phrase> JOBDIR=$jobdir
<phrase class="keyword">export</phrase> MYPOOLFILE=$JOBDIR/pool/PoolFileCatalog.xml

<phrase class="comment"># copy the &athask; script to batch work directory</phrase>
cp -f $JOBDIR/scripts/$1 .
<phrase class="keyword">if</phrase> test $? -ne 0 ; <phrase class="keyword">then</phrase>
   echo <phrase class="string">"Failed to copy script \"$1\" ... stop!"</phrase>
   exit 2
<phrase class="keyword">fi</phrase>

<phrase class="comment"># copy POOL catalog file, if available</phrase>
test -f $MYPOOLFILE &amp;&amp; cp -f $MYPOOLFILE .

<phrase class="comment"># copy all required packages; shwordsplit makes zsh split the</phrase>
<phrase class="comment"># package list on whitespace, as sh would</phrase>
<phrase class="keyword">setopt</phrase> shwordsplit
<phrase class="keyword">for</phrase> p <phrase class="keyword">in</phrase> $packages; <phrase class="keyword">do</phrase>
   cp -r $JOBDIR/batch/$p .
   <phrase class="keyword">if</phrase> test $? -ne 0 ; <phrase class="keyword">then</phrase>
      echo <phrase class="string">"Failed to copy package \"$p\" ... stop!"</phrase>
      exit 2
   <phrase class="keyword">fi</phrase>
<phrase class="keyword">done</phrase>

<phrase class="comment"># setup done, execute &athask; script and return result</phrase>
ask $1
exit $?
</programlisting>

The script takes one argument: the &athask; script that is to be used.
Optionally, you can provide an alternative location for the conventional
directories with the <option>-d</option> option (you can also modify the
default in the script), and a whitespace-separated list of packages that need
to be copied with the <option>-p</option> option.
The script copies the required files into the work directory on the batch
machine, starts &athask;, and has it execute the given script, which can
contain any commands that you are used to issuing on the &athask; CLI.
Note that this includes any and all &python; code.
</para>

<para>
For the tutorials, just copy the above script (call it
<filename>tutorialjob.sh</filename>) into your work directory and make it
executable.
</para>
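
<para>
The option handling of the submission script can be tried out in isolation
before anything is sent to the batch system.
The following is a minimal sketch in portable shell, not part of &athask;:
the function name <function>parse_job_options</function> and the example
values are hypothetical, and only the <option>-d</option> and
<option>-p</option> options of the submission script are mirrored:
</para>

<programlisting language="shell"><![CDATA[
# Hypothetical helper that mirrors the option handling of the submission
# script: -d sets the job directory, -p the list of packages.
parse_job_options() {
    jobdir="$HOME"
    packages=""
    OPTIND=1    # reset, so that the function can be called more than once
    while getopts d:p: o; do
        case "$o" in
            d) jobdir="$OPTARG" ;;
            p) packages="$OPTARG" ;;
            *) echo "Usage: parse_job_options [-d jobdir] [-p packages] script" >&2
               return 1 ;;
        esac
    done
    # drop the options that were parsed; the script name is what remains
    shift $((OPTIND - 1))
    echo "jobdir=$jobdir packages=$packages script=$1"
}

# example call with made-up values
parse_job_options -d /tmp/myjobs -p "PkgA PkgB" myscript.py
]]></programlisting>

<para>
Note that a list of several packages has to be quoted, so that it reaches the
script as the single argument of <option>-p</option>.
</para>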

</sect2><!-- Preparation -->

<sect2>
<title>Submission and monitoring</title>

<para>
The following command will submit the job, with &athask; script
<filename>myscript.py</filename>, into the '1nh' (one normalized hour) batch
queue and will write the log file as <filename>mylog.log</filename> into your
<filename class="directory">log</filename> directory:

<screen>
$~> bsub -o log/mylog.log -q 1nh tutorialjob.sh myscript.py
</screen>

If you are not running at CERN, or need a queue that allows for jobs that run
longer than one hour, issue the <command>bqueues</command> command,
which will show you all available batch queues.
You can retrieve the status of your job with the <command>bjobs</command>
command, and you can look at its output with '<command>bpeek</command> #id'
(where '#id' is the job identifier, as obtained from <command>bjobs</command>;
use '-f' to follow the tail).
Please be considerate, however: if several people issue many of these
monitoring requests to the system, it will grind to a halt and no one will be
able to get any work done.
</para>
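
<para>
One way to keep the number of monitoring requests low is to poll at a fixed,
long interval instead of calling <command>bjobs</command> by hand.
The following is only a sketch under stated assumptions: the function name
<function>wait_for_job</function> and the default interval of five minutes
are made up for illustration, and the status is assumed to be the third
column of the default <command>bjobs</command> output (check this against
your local installation):
</para>

<programlisting language="shell"><![CDATA[
# Poll the status of one job at a low rate until it has finished.
# Usage: wait_for_job jobid [interval-in-seconds]
wait_for_job() {
    jobid="$1"
    interval="${2:-300}"    # assumed default: one query every five minutes
    while :; do
        # assumption: line 1 of the bjobs output is the header, and the
        # status (PEND, RUN, DONE, EXIT, ...) is the third column of line 2
        stat=$(bjobs "$jobid" 2>/dev/null | awk 'NR == 2 { print $3 }')
        case "$stat" in
            DONE)    return 0 ;;
            EXIT|"") return 1 ;;    # failed, or not known to the system
            *)       sleep "$interval" ;;
        esac
    done
}
]]></programlisting>

<para>
For example, 'wait_for_job #id 600' would query the system once every ten
minutes and return as soon as the job is reported as done or failed.
</para>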
</sect2><!-- Submission and monitoring -->

</sect1><!-- Job -->