Edited by AR 20.07.2002

##############################################################################
#  Tools to run GEANT3 simulation production as a Virtual Data Derivation   #
##############################################################################

You can find here a tar file with a complete ATLAS GEANT3 simulation
production toolkit. This setup is able to run on any Linux system without
any additional software (you will, however, need input data).

All production done with this script should first be registered in the NOVA
VDC database. Go to

   http://atlassw1.phy.bnl.gov/NOVA/VDC/phpMyAdmin/index.php3

(standard ATLAS web password) and click on the Dataset icon in the upper
left corner to see the existing dataset descriptions. If you need other
datasets, please contact nevski@bnl.gov or vaniachine@anl.gov to register a
new entry. When the production parameters are taken from the NOVA VDC, we
guarantee the correctness of the simulated data.

You have to perform the following steps to get the production system running:

step 1: Installation
=======

 - Create a production directory which will keep all codes needed to run the
   program (you will need a minimum of 120 MB there).

 - Create a run directory (the one from which you will submit jobs). We
   recommend that it be different from the production directory, to keep the
   generated data separated from the original production codes.

 - Un-tar the distribution tar-file into the production directory, directly
   from CERN (if you have afs) or from your local copy. You can also
   download the tar-file from http://www.usatlas.bnl.gov/~nevski/localcache

 - Copy the example production script "atlsim_prod.job" from the production
   directory into the run directory.

This step may look like:

   cd somewhere
   mkdir prod run
   tar xfvz /afs/cern.ch/user/n/nevski/public/adist/3.2.1.tz -C prod
   cp prod/atlsim_prod.job run/atlsim_prod.job

step 2: Customizing
=======

Now you have to customize your copy of the production script
"atlsim_prod.job" in your "run" directory. Leave the original script as it
is - it can be used as a test to generate 50 short (2 events per job) test
jobs with known input (Higgs to 4e) and output. We recommend that you create
a separate copy for each new dataset you simulate.

1) Select a dataset from the existing database entries, for example:

   - dataset simul_001000 is a test sample. The input for this dataset is
     distributed in the same tar-file as the toolkit, so you can run a few
     test jobs with it immediately after step 1 is done.

2) Describe your storage layout in the environment variables (an
   illustrative sketch is given after this list):

   - PRODDIR (mandatory): directory where you have the atlsim production
     kit installed, i.e. "somewhere/prod" as used in step 1 of the example
     above.

   - RUNDIR (mandatory): directory which contains this customized
     production script and from which you will submit batch jobs, i.e.
     "somewhere/run" as used in step 1 of the example above.

   - INPUTDIR (optional): directory (tree) where you keep the input EVGEN
     files, or links pointing to their real places. If undefined, RUNDIR
     will be searched for existing input files. If a requested input file
     is not found, the script will try to stage in a new input; when
     INPUTDIR is undefined, this staging is done in the job working
     directory.

   - LOGDIR (optional): where you want to keep the production log files; by
     default they will be created in the RUNDIR.

   - JOBDIR (optional): where you want to keep the simulation output files
     (2 MB per event is needed on average). If undefined, the job working
     directory will be used to keep the zebra and histogram outputs.
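
   For illustration only, the storage layout of a customized script might
   look like the sketch below. It is written in POSIX-shell syntax with
   hypothetical sub-directory names; only PRODDIR and RUNDIR are mandatory,
   and the exact assignment syntax should follow whatever shell
   atlsim_prod.job actually uses:

      # hypothetical storage layout, reusing the step 1 example paths
      export PRODDIR=somewhere/prod    # production kit location (mandatory)
      export RUNDIR=somewhere/run      # job submission directory (mandatory)
      export INPUTDIR=$RUNDIR/evgen    # input EVGEN files or links (optional)
      export LOGDIR=$RUNDIR/log        # production log files (optional)
      export JOBDIR=$RUNDIR/data       # zebra and histogram outputs (optional)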
3) In addition, if your site has a mass storage system (CASTOR, HPSS,
   etc.), you can activate the post-production output archiving by also
   defining:

   - STORE:  mass storage for output files to be copied using RFIO
   - SPARE:  reserve area for data saving in case of STORE failure or
     absence
   - HPSSIN: mass storage accessible using the stage_in script (stagein at
     CERN)

More examples are given as comments in the original atlsim_prod.job.

step 3: Testing
=======

If you want to run a test job interactively, type:

   cd run
   emacs atlsim_prod.job   (replace "PRODDIR `pwd`" by "PRODDIR yourpath/prod")
   atlsim_prod.job

This will create the dataset "simul_001000", which points to the input data
file "dc1.001000.evgen.0001.test.pythia_100h_4e.zebra" in the directory
"pythia_100h_4e", which is provided with the same distribution. This file
contains Higgs events with a mass of 100 GeV decaying into four electrons,
in GENZ format, generated with the PYTHIA event generator.

To start a batch job at CERN on LSF:

   bsub -q 8nh atlsim_prod.job

To monitor the job use "bjobs" or "xlsf".

To start a batch job at Lyon on BQS:

   qsub -l t=138:00:00,M=256MB,hpss,platform=betaLINUX,scratch=2000mb \
        -V -q T atlsim_prod.job

To monitor the job use "qjob" and "qcat".

If the test job runs normally, you will find the logfile in

   simul_001000/atlas.0001.log

and the simulated events and histograms in

   simul_001000/atlas.0001.zebra
   simul_001000/atlas.0001.his

step 4: Running production
=======

The only action required is to put the correct dataset name into your
production script and to submit the batch jobs (a submission sketch is
given at the end of this file). The correctness of the production is
checked by inspecting the logfiles in $LOGDIR, by testing the histograms in
$STORE/dataset/his, and by reconstructing the events in
$STORE/dataset/zebra.

# A few comments

 - For the moment we support only the 3.2.1 distribution, so there is no
   need to bother with the signature parameter.

 - If you let us add your site description to the NOVA VDC database, we can
   provide more support for you in the future. Otherwise you can always use
   the "default" site parameter.

##############################################################################
For all questions please contact Pavel Nevski (nevski@bnl.gov)   15-July-2002
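
Appendix: an illustrative mass-submission sketch for step 4. It assumes LSF
at CERN as in step 3, and it assumes that atlsim_prod.job assigns partition
numbers to the jobs by itself (as the 50-job test sample of step 2
suggests); the job count and queue name are placeholders to adapt to your
site:

   # submit 50 identical production jobs to the LSF 8nh queue
   i=1
   while [ $i -le 50 ]; do
      bsub -q 8nh atlsim_prod.job
      i=`expr $i + 1`
   done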