How to Run DØ Monte Carlo Generation

Iain Bertram

Version 4.0
July 10, 2003

Before you Start
Software Installation
Creating Jobs
Nikhef Batch Jobs Example
Lancaster Scripts

Before You Start

If you want to run a farm or site that will generate OFFICIAL DØ Monte Carlo events, you will need to be familiar with several packages. The fundamental ones are:

How to install the required software.

A list of all the software currently being run in Monte Carlo Production is given here.

Before you start installing software you will need to decide on a directory structure for all of the farm machines. I suggest locating the executables in the same location on all of your nodes. For example, at Lancaster we have created a /data area on all machines. I will call this location $MC_HOME in the rest of the documentation.
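A minimal sketch of setting this up on one node (the /data/mcprod path below is just an example choice, not a requirement):

export MC_HOME=/data/mcprod   # example location; use the same path on every node
mkdir -p $MC_HOME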

The software is available from d0mino.fnal.gov in the directory: d0mino.fnal.gov:/d0dist/dist/minitar/tarfiles

You will need to copy over the following tarfiles (these are the current mcp13 versions of the code; see http://www-d0.fnal.gov/computing/mcprod/Software/Software.html for the latest versions):

You need to untar these files in $MC_HOME on all of the machines on which you intend to run the code. This will create a directory structure called mcc-dist. Updates to this code will be announced on the mailing list.
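A sketch of copying and unpacking one tarfile (the tarfile name below is only a placeholder; use the actual mcp13 tarfile names from the page above):

cd $MC_HOME
# copy a tarfile from d0mino (placeholder name)
scp d0mino.fnal.gov:/d0dist/dist/minitar/tarfiles/mcc-dist-example.tar .
# unpack it; the tarfiles unpack into the mcc-dist directory structure
tar xvf mcc-dist-example.tar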

The code is now installed and you are ready to run.

Creating Jobs: Some Changes

To create jobs you will need to set up some environment variables before running. From the directory $MC_HOME run the following commands:

. /d0/fnal/local/etc/setups.sh
export SRT_LOCAL=$MC_HOME   # directory containing mcc-dist
cd ${SRT_LOCAL}
cd mcc-dist
source mcc_dist_setup.sh
cd ..
export BFCURRENT=p13.08.00
setup mc_runjob

The output from setup mc_runjob will list the releases you have available in your mcc-dist area:

***mc_runjob***: setup mc_runjob
***mc_runjob***: Supported Available Releases:
***mc_runjob***:  p11.14.00 p11.12.01 p11.13.00 p12.03.00 p13.01.00 p13.00.00 p13.02.00 p13.03.00 p13.04.00                                

You are now ready to run mc_runjob. Recall that you need to understand the mc_runjob documentation (see http://www-clued0.fnal.gov/mc_runjob/mainframe.html).

To run mc_runjob you will need an mc_runjob macro to create the job. For this example I will just look at creating a simple Pythia QCD job (Pythia_Example.macro).

I will not reproduce the whole macro here, but will comment on some of the lines:

attach samglobal

# Global variables
# $MC_HOME/curr will be the directory the job will be created in

cfg samglobal define string CurrentDir ./curr
# The job name will be Pythia-Example-$(A time stamp)
cfg samglobal define string JobName Pythia-Example-
# $MC_HOME/dest is where the output will be found
cfg samglobal define string DestinationDir ./dest

To run the above macro (after initializing the above environment variables), type the following in the directory $MC_HOME:
mc_runjob -macro=Pythia_Example.macro

and you will get the shell script that runs the job as output, for example:

$MC_HOME/curr/Pythia-Example-02023152238/Pythia-Example-02023152238.sh

The resulting directory $MC_HOME/curr/Pythia-Example-02023152238/ contains:

GlobalParams.MDC  Pythia-Example-02023152238.sh      mc_runjob_error_log  td_Pythia-Example-02023152238.conf
MakeMetadata.py pythia

To run the above job, do the following:

#!/bin/sh -f
source /d0/fnal/local/etc/setups.sh
export SRT_LOCAL=/system/d0software/p07-p08/
cd $SRT_LOCAL
cd mcc-dist
source mcc_dist_setup.sh
cd /data/p07-p08/
setup fcp
. /data/p07-p08/curr/Pythia-Example-02023152238/Pythia-Example-02023152238.sh /data/p07-p08/curr/Pythia-Example-02023152238/ 

After running, the directory will look like:

GlobalParams.MDC  Pythia-Example-02023152238.sh      mc_runjob_error_log  td_Pythia-Example-02023152238.conf
MakeMetadata.py Pythia-Example-02023152238.sh.log pythia

In the directory $MC_HOME/dest/ the following files will have been created (the output of the job):

ls dest/*

d0gstar:
d0g-p13.04.00_CAEP-off_Geo-plate_Iain-Bertram_algo_recocert_lancs_2404_02347172401
import_kw_d0gstar_d0g-p13.04.00_CAEP-off_Geo-plate_Iain-Bertram_algo_recocert_lancs_2404_02347172401.py

d0reco:
import_kw_d0reco_reco-p13.04.00_Iain-Bertram_algo_recocert_lancs_2404_02347172401.py
import_kw_d0reco_tmb-p13.04.00_Iain-Bertram_algo_recocert_lancs_2404_02347172401.py
reco-p13.04.00_Iain-Bertram_algo_recocert_lancs_2404_02347172401
tmb-p13.04.00_Iain-Bertram_algo_recocert_lancs_2404_02347172401

d0sim:
import_kw_d0sim_sim-p13.04.00_Noise-on_NMB-0.0_MB-Fixed_Iain-Bertram_algo_recocert_lancs_2404_02347172401.py
sim-p13.04.00_Noise-on_NMB-0.0_MB-Fixed_Iain-Bertram_algo_recocert_lancs_2404_02347172401

generator:
gen-pythia-p13.04.00_Dec-incl_NumEv-1_Prod-qcd_Iain-Bertram_algo_recocert_lancs_2404_02347172401
import_kw_pythia_gen-pythia-p13.04.00_Dec-incl_NumEv-1_Prod-qcd_Iain-Bertram_algo_recocert_lancs_2404_02347172401.py

The metadata (the import_kw_*.py files) is stored alongside the output file from each stage.

Nikhef Farm Batch System

Nikhef has one of the most automated systems. What follows are its job scripts. While it uses FBS, it does have appropriate scripts.

FBSNG (Farm Batch System New Generation, developed at FNAL) is used as the batch system.
FBSNG is installed with upd.
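A sketch of installing it with upd (the product name and exact upd invocation may differ at your site; check the FBSNG documentation):

. /usr/products/etc/setups.sh
setup upd
upd install fbsng <version>   # replace <version> with the release your site needs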
A batch job is specified with a jdf (job description file):

batch.jdf
SECTION      mcc
EXEC=/d0gstar/curr/qcdJob128131611/batch
NUMPROC=1
QUEUE=FastQ
STDOUT=/d0gstar/curr/qcdJob128131611/stdout
STDERR=/d0gstar/curr/qcdJob128131611/stdout
SECTION rcp
EXEC=/d0gstar/curr/qcdJob128131611/batch_rcp
NUMPROC=1
QUEUE=IOQ
DEPEND=done(mcc)
STDOUT=/d0gstar/curr/qcdJob128131611/stdout_rcp
STDERR=/d0gstar/curr/qcdJob128131611/stdout_rcp
SECTION sam
EXEC=/d0gstar/curr/qcdJob128131611/batch_sam
NUMPROC=1
QUEUE=IOQ
DEPEND=done(rcp)
STDOUT=/tmp/stdout_sam_qcdJob128131611
STDERR=/tmp/stdout_sam_qcdJob128131611
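Once the jdf and the scripts below are in place, the job is submitted with the FBSNG client. A sketch (the jdf path is assumed here, and the fbs syntax should be verified against your FBSNG installation):

. /usr/products/etc/setups.sh
setup fbsng
fbs submit /d0gstar/curr/qcdJob128131611/batch.jdf   # submits the mcc, rcp and sam sections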

batch
#!/bin/sh
# set up the D0 environment and the mcc-dist release
. /usr/products/etc/setups.sh
cd /d0gstar/mcc/mcc-dist
. mcc_dist_setup.sh
# copy the job database and the job directory from the shared area to local disk
mkdir -p /data/curr/.database
cp -p /d0gstar/curr/.database/* /data/curr/.database
mkdir -p /data/curr/qcdJob128131611
cd /data/curr/qcdJob128131611
cp -r /d0gstar/curr/qcdJob128131611/* .
# record which node the job runs on, run the job, then mark completion and check it
touch /d0gstar/curr/qcdJob128131611/.`uname -n`
sh qcdJob128131611.sh `pwd` > log
touch /d0gstar/curr/qcdJob128131611/`uname -n`
/d0gstar/bin/check qcdJob128131611

batch_rcp
#!/bin/sh
# only copy results back if the job finished successfully
if [ ! -f /d0gstar/curr/qcdJob128131611/OK ];then exit; fi
mkdir -p /data/disk1/sam_cache/qcdJob128131611
cd /data/disk1/sam_cache/qcdJob128131611
# find out which worker node ran the job from the marker file it left behind
node=`ls /d0gstar/curr/qcdJob128131611/node*`
node=`basename $node`
i=qcdJob128131611
# pull the reco output and the metadata back to the SAM cache area
rcp -pr $node:/data/curr/$i/Metadata/reco* .
rcp -pr $node:/data/curr/$i/Metadata/*.params .
rcp -pr $node:/data/curr/$i/Metadata/*.py .
touch /d0gstar/curr/qcdJob128131611/RCP

batch_sam
#!/bin/sh
# run the SAM storage step on the SAM-enabled node (schuur) as user willem
/usr/bin/rsh -n -l willem schuur /d0gstar/curr/qcdJob128131611/run_sam

run_sam
#!/bin/sh
# locate: succeed if the file named in the import metadata already has a location in SAM
locate(){
file=`grep "import =" import_${1}_${job}.py | awk -F \" '{print $2}'`
sam locate $file | fgrep -q [
return $?
}
. /usr/products/etc/setups.sh
setup sam
SAM_STATION=hoeve
export SAM_STATION
job=qcdJob128131611
# do nothing until the rcp step has finished
if [ ! -f /d0gstar/curr/${job}/RCP ];then exit;fi
cd /data/disk1/sam_cache/${job}
# declare the metadata for each stage, retrying every minute until SAM can locate the file
list='pythia d0gstar d0sim'
for i in $list
do
until locate $i || (sam declare import_${i}_${job}.py && locate ${i})
do sleep 60; done
done
# store the reco output, then record success or failure from the SAM log
sam store --descrip=import_reco_${job}.py --source=`pwd`
if grep -q ERROR /tmp/stdout_sam_${job}
then
touch /tmp/${job}_SAM_FAILED
else
touch /tmp/${job}_SAM_OK
fi

Lancaster Software

All of this software will be updated once we have released the SAM-Grid...

Getting The Request

Running the Request

I use the fbs batch system and run from a script called run_MC.sh, which requires run_template.txt, batch.base, and batch_rcp.base. This creates scripts, submits them to the fbs batch system, and sets up a job to copy files to a central SAM-enabled node. This will of course be included in SAM-Grid ;-). In the same directory you also require: sam_jdf.base, run_sam_d0g.base, and submit.sh.
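The .base files are site-specific templates and are not reproduced here. Conceptually the wrapper fills them in per job and submits the result; the sketch below only illustrates that flow, with a made-up %JOB% placeholder and file layout rather than the actual Lancaster templates:

#!/bin/sh
# illustration only: substitute the job name into the templates and submit to FBS
JOB=Pythia-Example-02023152238          # example job created by mc_runjob above
mkdir -p $JOB
sed "s|%JOB%|$JOB|g" batch.base     > $JOB/batch      # %JOB% is a hypothetical placeholder
sed "s|%JOB%|$JOB|g" batch_rcp.base > $JOB/batch_rcp
sed "s|%JOB%|$JOB|g" sam_jdf.base   > $JOB/batch.jdf
fbs submit $JOB/batch.jdf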

Storing Files

One file is needed in the sam account for storing the files (again, this is being integrated): import_files_d0g.sh
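import_files_d0g.sh itself is not reproduced here. A minimal sketch of the kind of SAM storing step it performs, modeled on the commands in the Nikhef run_sam script above (the station name and directory are assumptions):

#!/bin/sh
# sketch only: declare the d0gstar metadata and store the d0g file into SAM
. /usr/products/etc/setups.sh
setup sam
SAM_STATION=your-station                # assumption: use the local SAM station name
export SAM_STATION
cd /path/to/dest/d0gstar                # assumption: directory with the d0g file and its import .py
meta=`ls import_kw_d0gstar_*.py`
sam declare $meta                          # register the metadata with SAM
sam store --descrip=$meta --source=`pwd`   # copy the file into SAM-managed storage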