The Level-3 Data Model
Loosely speaking, the data model is what D0 will use to access its data. The
offline group will clearly use a full-blown object-oriented method to get at
the data. The Level-3 environment, however, puts some unique requirements on
the data model that do not exist in the offline environment.
Data Model Requirements
In Level-3, speed is paramount: as much time as possible must be devoted to
the algorithms rather than to careful bookkeeping and other overhead.
Nonetheless, it would be desirable to use the same offline data model as the
rest of D0:
- We won't have to maintain two implementations of the data model.
- Code can be moved between the offline and online environments much more
simply. Perhaps more importantly, programmers used to writing offline code
will have a relatively easy time retooling themselves to write tools for the
online system.
- If one data model is used throughout the system, there will be no need to
translate between two data models; a translation layer is just one more
chance for bugs and a further maintenance headache!
What follows is a partial list of the extra requirements the Level-3
environment might put on the data model. Note that a data model that is
faster in the Level-3 environment is also faster in the offline environment.
Technical
- Don't copy data. There is no better way to eat up CPU time than making
multiple copies of the same data as it moves through the system. It is best
to use things like reference-counted objects (see the first sketch after
this list).
- It should be possible to add arbitrary blocks of memory to a data store
without copying them. The data from the detectors arrives in the form of 4
or 8 DMA transfers from a bus-adaptor card. Tool code will want to access
this data directly; there is no need to shuffle it around in memory beyond
getting it in.
- When adding these arbitrary blocks of data, header words around the blocks
should not be needed, or should be kept to a minimum. When the DMA is done,
we are left with 4 or 8 blocks of data. One thought is to split those blocks
up by crate (67 of them in all) and have an object in the data store for
each crate. As can be seen, the control we have over the bytes surrounding
each crate is relatively minor, and, besides, we may want to save that
data... so there is no room for extra control words the data model may
require.
- Data object lookup must be really fast. As mentioned above, we need to
spend time triggering, not getting at the data. The key format should be
flexible enough to handle similar results built by different parameter sets
(i.e., the same tool run with different parameter sets); see the second
sketch after this list.
- Linkage to other parts of the offline code must be minimal. It does us no
good if we have to link in the entire offline system just to use the DMG;
our computers won't have the memory for that!
- The same goes for runtime access: the data model should not require
extensive off-machine access. It may be possible to get at some items on the
local disk (but what if the trigger computers are diskless?). In particular,
regular event-by-event running should require no access outside of memory.
- Control-word space requirements should be small. The online system must
ship all the data it triggers on up to the host system and between many of
the host systems (examines, etc.). A small byte count helps improve
efficiency!
- Thread safety.
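
To make the no-copy requirements concrete, here is a minimal C++ sketch of
how a data store might adopt the DMA blocks directly. The class names
(RawEventBuffer, CrateBlock) are invented for illustration and are not part
of any existing D0 package; the point is that one reference-counted object
owns each DMA block, and a crate is just an offset/length view into it.

    #include <cstddef>
    #include <cstdint>
    #include <memory>

    // One DMA block, filled in place by the bus-adaptor card.  It is owned
    // by a single ref-counted handle; nothing below ever copies the payload.
    class RawEventBuffer {
    public:
        RawEventBuffer(const std::uint32_t* data, std::size_t nWords)
            : m_data(data), m_nWords(nWords) {}
        const std::uint32_t* data() const { return m_data; }
        std::size_t size() const { return m_nWords; }
    private:
        const std::uint32_t* m_data;   // written directly by the DMA
        std::size_t          m_nWords;
    };

    // A per-crate object in the data store: no header words of its own,
    // just an offset and a length into the shared buffer.  Copying a
    // CrateBlock copies two integers and bumps a reference count.
    struct CrateBlock {
        std::shared_ptr<const RawEventBuffer> buffer; // keeps block alive
        std::size_t offset;  // first word of this crate's data in the block
        std::size_t nWords;  // crate payload length in words

        const std::uint32_t* begin() const { return buffer->data() + offset; }
        const std::uint32_t* end()   const { return begin() + nWords; }
    };

Splitting the 4 or 8 DMA blocks into the 67 crate objects then means
creating 67 CrateBlocks, none of which puts control words inside the raw
data itself.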
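
Likewise, for fast lookup keyed by result type and parameter set, a sketch
(again with invented names):

    #include <cstdint>
    #include <map>
    #include <string>

    // Hypothetical key type: the same tool run with two different parameter
    // sets produces two distinct store entries, distinguished by paramSet.
    struct DataKey {
        std::string   type;      // kind of result, e.g. "CalClusters"
        std::uint32_t paramSet;  // which parameter set built the result

        bool operator<(const DataKey& rhs) const {
            if (type != rhs.type) return type < rhs.type;
            return paramSet < rhs.paramSet;
        }
    };

    // Lookup is then a single map probe rather than a scan over all
    // objects (ObjectHandle is a placeholder for the store's handle type):
    //     std::map<DataKey, ObjectHandle> store;
    //     store.find(DataKey{"CalClusters", 2});

With a key like this, the same tool run twice with different parameter sets
simply yields two separate entries in the store.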
Tools And Scripts
- If automatic generation of data is part of the data model, we should be
able to disable it easily.
- Code management must be such that we can easily understand which
algorithms are loaded in any trigger and, further, which versions of those
algorithms. This may not bear directly on the design of the data model, but
it does have everything to do with how the physical project is put together
(see the sketch below).
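
One way to meet this last requirement is to have every loaded algorithm
register its name and version at load time, so any trigger configuration can
be audited. A sketch of such a registry (the ToolRegistry class is invented
for illustration, not an existing interface):

    #include <cstddef>
    #include <iostream>
    #include <string>
    #include <utility>
    #include <vector>

    // A process-wide list of (tool, version) pairs, filled in as each
    // algorithm is loaded into the trigger.
    class ToolRegistry {
    public:
        static ToolRegistry& instance() {
            static ToolRegistry theRegistry;  // one per trigger process
            return theRegistry;
        }
        void add(const std::string& name, const std::string& version) {
            m_tools.push_back(std::make_pair(name, version));
        }
        // Dump every loaded algorithm and its version, e.g. into the
        // run record.
        void dump(std::ostream& os) const {
            for (std::size_t i = 0; i < m_tools.size(); ++i)
                os << m_tools[i].first << "  " << m_tools[i].second << "\n";
        }
    private:
        std::vector< std::pair<std::string, std::string> > m_tools;
    };

    // Usage at tool load time (names are invented examples):
    //     ToolRegistry::instance().add("CalClusterTool", "v1.3");
    //     ToolRegistry::instance().dump(std::cout);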
The Brown Level-3 Group (3/19/97)