Scheduler FAQ
First, you have to prepare your XML job description. When your files are ready, you can type:
star-submit jobDescription.xml
where jobDescription.xml is the name of the file.
You can use one of the "cut and paste" examples. Of course, you still have to change the command line and the input files. I am sorry I couldn't prepare the exact file for you... :-)
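For orientation, a minimal job description might look like the sketch below. This is only a sketch: the macro arguments, paths, and catalog conditions are placeholders you must replace with your own, and you should check the manual for the attributes your scheduler version supports.

```xml
<?xml version="1.0" encoding="utf-8" ?>
<!-- Minimal sketch of a job description. All paths and the catalog
     query are hypothetical placeholders, not values to use verbatim. -->
<job maxFilesPerProcess="100">
    <command>root4star -b -q doEvents.C\(9999,\"@$FILELIST\"\)</command>
    <stdout URL="file:/star/u/myuser/out/$JOBID.out" />
    <input URL="catalog:star.bnl.gov?production=P03ia,filetype=daq_reco_MuDst" nFiles="100" />
</job>
```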
Yes, this is normal. For every process, a script and a file list are created. This is something that will be fixed in the final version. You can delete them easily, since their names all start with script* and fileList. Remember, though, to delete them _AFTER_ the job has finished.
Well, you shouldn't panic like that! You can send an e-mail to the scheduling mailing list, and somebody will help you.
In the comment at the beginning of each script there is the bsub command that was used to submit the job. You can copy it, paste it into the command line, and execute it. Be sure you are in the same directory as the script.
You can use the name attribute for job like this:
<job ... name="My name" ... >
In the command section you can put more than one command. You can actually put a csh script. So you can write:
<command> starver SL02i root4star mymacro.C </command>
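Since the command block is executed as a csh script, a multi-line version also works. A sketch, reusing the placeholder names from the example above:

```xml
<!-- The command block runs as a csh script, so multiple lines are fine.
     SL02i and mymacro.C are the placeholder names from the example above. -->
<command>
    starver SL02i
    echo "job starting at `date`"
    root4star -b -q mymacro.C
    echo "job finished at `date`"
</command>
```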
The file catalog is actually a separate tool from the scheduler. When you write a query, the get_file_list.pl command is used. So, the full documentation for the query is available in the file catalog manual. You will be interested in the -cond parameter, which is the one you are going to specify in the scheduler.
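In the job description itself, those -cond style conditions go after the '?' in the input tag's catalog URL. A hedged sketch, with placeholder condition values:

```xml
<!-- The keyword=value pairs after the '?' are the same conditions you
     would pass to get_file_list.pl via -cond; the values here are
     placeholders, not a query to use verbatim. -->
<input URL="catalog:star.bnl.gov?production=P03ia,filetype=daq_reco_MuDst,storage=local"
       nFiles="100" />
```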
If you are asking this question, it's because you have been trying to submit something like:
<command>root4star -b -q doEvents.C\(9999,\"$FILELIST\"\)</command>
This won't work because doEvents interprets $FILELIST as an input file and not as a file list. But if you put @ before the file name, doEvents (and bfc.C, ...) will interpret it correctly. So you should have something like:
<command>root4star -b -q doEvents.C\(9999,\"@$FILELIST\"\)</command>
Before version 1.8.6, the job starts in the default location for the particular batch system.
If you are using LSF, jobs execute in the directory from which you submitted them, which is the same directory where the scripts are created, and also the directory you should be in when resubmitting jobs.
In version 1.8.6 and above, the default directory in which the job starts is defined by the environment variable $SCRATCH. This will most likely be a directory local to the worker node, and the base directory path will be different at every site. The directory and all its contents are deleted as soon as the job finishes. For this reason, do not ever attempt to change this directory, and make sure all files that need to be saved are copied back to some other directory.
You don't, but you can tell the scheduler how long your job is, so that it can choose the correct queue for you. Remember: the scheduler has to work at different sites, where queue names and lengths differ. The scheduler knows about these.
You can look at this example.
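One way to hint at a job's length is sketched below. The filesPerHour attribute is the one described in the scheduler manual for estimating run time; treat the value as a placeholder and check your version's documentation for the attributes it actually supports.

```xml
<!-- filesPerHour tells the scheduler roughly how fast the job processes
     its input, so it can estimate the run time and pick a suitable queue.
     The value 25 and the query below are placeholders. -->
<job filesPerHour="25">
    <command>root4star -b -q mymacro.C\(\"$FILELIST\"\)</command>
    <input URL="catalog:star.bnl.gov?filetype=daq_reco_MuDst" nFiles="100" />
</job>
```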
Check your .cshrc file. First of all, as Jerome says:
***********************************************
**** DO NOT SET LSF ENVIRONMENTS YOURSELF *****
***********************************************
Furthermore, you may want to have a look at this page
You can use get_file_list.pl to find out which values a particular keyword can have. For example:
[rcas6023] ~/> get_file_list.pl -keys filetype
daq_reco_dst
daq_reco_emcEvent
daq_reco_event
daq_reco_hist
daq_reco_MuDst
daq_reco_runco
daq_reco_tags
dummy_root
...
You can do the same for any keyword.
You might want to have a look at the star-submit-template. Here is an example.
This is an XML problem: '<', '&' and '>' are reserved in XML, so you can't use them directly, but you can use an escape sequence in their place. Use the following table for the conversion:
< | &lt;
> | &gt;
& | &amp;
So, for example, if you need to put the following in a query:
runnumber<>2280003&&2280004
you have to write it as:
runnumber&lt;&gt;2280003&amp;&amp;2280004
Yes, it doesn't look so nice... :-( There is unfortunately not much I can do...
I suggest you have a look at this site: it has a lot of tutorials about XML, HTML and other web technologies. It's at an entry level and has a lot of examples.
You can find a list of the installed policies here.
Use $FILEBASENAME as the stdout file name. This only works in version 1.6.2 and up, and only if there is exactly one output file or one file per process.
Consult the manual if you need an example.
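A sketch of the idea, with a hypothetical log directory:

```xml
<!-- $FILEBASENAME expands to the base name of the process's single file
     (version 1.6.2 and up). The log directory path is a placeholder. -->
<stdout URL="file:/star/u/myuser/logs/$FILEBASENAME.out" />
<stderr URL="file:/star/u/myuser/logs/$FILEBASENAME.err" />
```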
Emphasis is always placed on generating detailed and meaningful user feedback at the prompt when errors occur. However, more information can be found in the user's log file. If the scheduler did not crash altogether, it will append to the user's log file, where the most detailed record of its internal workings can be found.
Every user that has ever used the scheduler, even just once, has a log file. It holds information about what the scheduler did with your job. To find out more about reading your log file click here.
Note: This only refers to version 1.7.5 and above, as resubmission was only built-in at this version. If you submitted via an older scheduler version the resubmit syntax will not work and you will not have a session file.
Note: When a job is resubmitted the file query is not redone; the scheduler uses the exact same files as the original job. Be careful when resubmitting old jobs, as the paths to the files may have changed.
When a request is submitted, the scheduler produces a file named [jobID].session.xml. This is a semi-human-readable file that contains everything needed to regenerate the .csh and .list files and to resubmit all or part of your job.
If you wish to resubmit the whole job, the command is (replace [jobID] with your job Id): star-submit -r all [jobID].session.xml
Example: star-submit -r all 08077748F46A7168F5AF92EC3A6E560A.session.xml
If you wish to resubmit a particular job number, the command is (where n is the job number): star-submit -r n [jobID].session.xml
To resubmit all of the failed jobs, the command is: star-submit -f all [jobID].session.xml
Type star-submit -h for help and more options. There are many more options available; this is only a very short overview of the resubmission options.
Note: This only refers to version 1.7.5 and above, as resubmission was only built in at this version. If you submitted via an older scheduler version the resubmit syntax will not work and you will not have a session file. It is also recommended that you read 20. How to resubmit jobs, as there is more information about the session file there.
The command to kill all the jobs in the submission is (replace [jobID] with your job ID): star-submit -k all [jobID].session.xml
If you wish to kill only some of the jobs, replace the word all with a single job number (for job 08077748F46A7168F5AF92EC3A6E560A_4 the number is 4). A comma-delimited list may also be used (example: star-submit -k 5,6,10,22 [jobID].session.xml), or a range (example: star-submit -k 4-23 [jobID].session.xml).
Note: This only refers to version 1.7.5 and above.
This information is stored in the [jobID].report file in a nice neat table (I recommend you turn off word wrap to view this file). We are trying to put more and more information for users in this file with every new version. The file also stores information about queue selection, so it will probably answer questions such as "Why did my job go into the long queue instead of the short queue?"
In a non-grid context, when you ask for data to be moved back from $SCRATCH using a tag like this one:
<output fromScratch="*.root" toURL="file:/star/u/lbhajdu/temp/" />
this is translated into a cp command like the one you see below:
/bin/cp -r $SCRATCH/*.root /star/u/lbhajdu/temp/
If the cp command returns "/bin/cp: no match", it means it did not match anything, because no files were generated by your macro for it to copy.
This can be verified by adding an "ls $SCRATCH/*" to the command block of your job, right after your macro finishes, to list what files it has produced in $SCRATCH.
Examine your macro carefully to see what it's doing. It could be writing your files to a different directory, not writing any files at all, or crashing before it gets a chance to write anything.
In scheduler version 1.10.0c and above there is an option to have the scheduler write the files somewhere other than your current working directory. A basic example of this capability would be to create a subdirectory of your current working directory and have the scheduler copy everything there. To do this, first create the directory (mkdir ./myFiles), then add these tags inside the body of your XML file's job tag (that is, at the same level as the command tag):
<Generator><Location>./myFiles</Location></Generator>
These tags have far more fine-grained options, fully documented in section 5.1 of the manual.
This page was last modified by Levente Hajdu.