Summary of CAF Specs


In the following we summarize the numbers provided in the CAF specs, as detailed in CDF Note 4072 and CDF Note 4100, as well as our present best guess at the datasets, as presented in CDF Note 4718 (trigger proposal, see p.107 for the summary table) and CDF Note 5565 (datasets and streams). For the latter, we updated the information to be consistent with the present summary table in CDF Note 4718.


CAF specs

The intention of the CAF specs from 1997 seems to have been to define central analysis facilities that would provide a level of "user satisfaction" similar to Run I.
We'll need to get some info on what level of satisfaction there was in Run I, preferably something quantitative like "prior to the summer conferences of dddd it took X hours to run a user analysis skim on dataset Y".

     Summary of CDF 4072 and 4100:
     -----------------------------

     1200 MIPS sec for full reconstruction of an average event

     56,000 MIPS CPU power for reconstruction
                     (that's the PC farm, I suppose, and is thus
                      irrelevant for us here.)
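
     (For orientation only, one can spell out the reconstruction rate
      implied by the two numbers above; this is our own back-of-the-envelope
      sketch in Python, not a number from the specs, and the variable names
      are ours:)

          mips_per_event = 1200.0    # MIPS*sec to fully reconstruct an average event
          farm_mips      = 56000.0   # reconstruction CPU power quoted above
          # sustained reconstruction rate implied by these two numbers
          print(farm_mips / mips_per_event)   # ~47 events/sec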
                                               

     90,000 MIPS CPU power for analysis (that's central facilities, I suppose)
            This was obtained by scaling the Run I capability by a factor of 25.
            For comparison, the luminosity ratio Run II/Run I is 20.
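
     (A naive cross-check of that factor, done by us rather than taken from
      the specs: the implied Run I analysis capacity is 90,000/25 = 3,600 MIPS,
      and scaling that by the luminosity ratio of 20 alone would give
      72,000 MIPS rather than 90,000 MIPS. In Python:)

          run1_analysis_mips = 90000.0 / 25.0   # implied Run I analysis capacity: 3600 MIPS
          lumi_ratio         = 20.0             # Run II / Run I luminosity ratio quoted above
          # scaling the Run I capacity by the luminosity ratio alone:
          print(run1_analysis_mips * lumi_ratio)   # 72000 MIPS, vs. the 90,000 MIPS spec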

     160TB of PAD data. For kicks, you can arrive at a very similar
                         estimate by taking 75Hz L3 output @ 1e32 cm^-2 s^-1 = 750nb
                        750nb * 2fb-1 = 1.5e9 events
                        1.5e9 events @ 100kB/event = 150TB of PAD output.
                        The actual estimate was done in a much less
                        naive fashion.
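
     (The same back-of-the-envelope arithmetic spelled out in Python; this is
      our own sketch of the numbers quoted above, with variable names of our
      choosing:)

          l3_rate_hz = 75.0                    # L3 output rate at L = 1e32 cm^-2 s^-1
          inst_lumi  = 1e32                    # instantaneous luminosity [cm^-2 s^-1]
          xsec_cm2   = l3_rate_hz / inst_lumi  # 7.5e-31 cm^2 = 750 nb
          int_lumi   = 2e39                    # 2 fb^-1 expressed in cm^-2
          n_events   = xsec_cm2 * int_lumi     # 1.5e9 events
          pad_bytes  = n_events * 100e3        # at 100 kB/event
          print(pad_bytes / 1e12)              # ~150 TB of PAD output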

     18% of PAD = 28TB desired to be disk resident @ central facility
                       This was chosen as it is the same fraction as Run I.

     4TB data skimming/serving per day on central facility
          which translates into a 50MB/s I/O bandwidth requirement.
           This number was arrived at as follows: in Run I, 5% of events
           were part of the official data sets; assuming the same fraction
           in Run II, and that we want to be able to run through 1/2 of
           that in one day => 5% * 0.5 * 160TB = 4TB.
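
     (Again our own sketch of the arithmetic in Python, using only the
      numbers quoted above:)

          pad_total_tb   = 160.0
          run1_fraction  = 0.05     # fraction of events in official data sets (Run I)
          daily_fraction = 0.5      # want to run over half of that per day
          skim_tb_day    = run1_fraction * daily_fraction * pad_total_tb   # 4 TB/day
          io_mb_s        = skim_tb_day * 1e6 / 86400.0   # ~46 MB/s, i.e. ~50 MB/s
          print(skim_tb_day, io_mb_s)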

     Apart from these numbers, CDF/DOC/COMP_UPG/PUBLIC/4100 also has a
     variety of more or less detailed use cases for actual analyses
     that were done in Run I.

     fkw notes:
     ----------
     Some of us should take a look at the use cases they
     report in 4100 in some detail, and figure out if the 4TB/day is
     sufficiently well justified. E.g., one might naively expect that in
     Run II the fraction of useful data is much larger than 5%, given
     the change in the Level 3 trigger!?!


CDF 4718 and 5565 summary

CDF 5565 describes 9 streams:
-----------------------------

Stream           brief description           Xsection [nb] out of L3
---------------------------------------------------------------------
1         express: J/psi,W,Z,zero-bias             ~20
2         High-Et leptons et al.                   ~60
3         photon triggers (mostly High-Et)         ~50
4         Di-tau and alike (mostly High-Et)        ~70
5         zero-bias & diffraction                  ~33
6         missing Et no leptons                    ~80
7         QCD jets                                 ~50
8a        hadronic 2-track                         ~110
8b        lepton+track b-trigger                   ~50
9         J/psi and other di-lepton                ~40

Note: fkw split stream 8 into 8a and 8b because 5565 is missing the 100nb
      hadronic 2-track triggers that are included in the latest version of 4718.
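
(For a rough total, one can add up the per-stream cross sections above and
convert them to an L3 output rate at 1e32 cm^-2 s^-1. This is our own
arithmetic, not a number quoted in 5565 or 4718; a minimal sketch in Python:)

     # approximate per-stream cross sections [nb] out of L3, from the table above
     xsec_nb = {'1': 20, '2': 60, '3': 50, '4': 70, '5': 33,
                '6': 80, '7': 50, '8a': 110, '8b': 50, '9': 40}
     total_nb = sum(xsec_nb.values())     # ~563 nb in total
     rate_hz  = total_nb * 1e-33 * 1e32   # 1 nb = 1e-33 cm^2, at L = 1e32 cm^-2 s^-1
     print(total_nb, rate_hz)             # ~563 nb, i.e. ~56 Hz out of L3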

4718 summary table


Miscellaneous other numbers and comments


Modified: Mon Aug 20 12:07:52 CDT 2001 Frank Würthwein