Annotated Event Data Model WBS 2000
Rob Kennedy

ChangeLog
---------
Original: RDK 21 March 2000
Revised:  RDK 23 March 2000 - Add comments, time estimate to 2.3.2.6

Content
-------
2.3 Event Data Model
  2.3.1 TRYBOS - still in use, but package is in maintenance-only mode
      * Rob K., began in pre-history, on-going
  2.3.2 CDFEDM2 - policies, software defining how event data is manipulated
    2.3.2.1 - Management and Support
      2.3.2.1.1 - coordinate software development related to CDFEDM2
          * Rob K., help from Liz S-K, on-going
      2.3.2.1.2 - follow-up on build and validation failures
          * Rob K., help from Liz S-K and others, on-going
      2.3.2.1.3 - maintain web page
          * Rob K., on-going
      2.3.2.1.4 - consult with users on CDFEDM2, C++, etc.
          * Rob K., help from Liz S-K, Rick S, Chris G, on-going
    2.3.2.2 - Documentation
      2.3.2.2.1 - write first draft of EDM-for-novices note, post for
                  comments by mgmt and EDM-WG, but do not announce yet
          * Rob K., began 3/1, complete 3/24, current task
          * EDM-for-novices note is Second Top Priority at present.
      2.3.2.2.2 - create basic set of tools for CDFEDM2 and document
          - in conjunction with EDM-for-novices notes
          * Rob K., began 3/1, complete 3/24, current task
      2.3.2.2.3 - incorporate examples, datafiles, put into validation
          - in conjunction with EDM-for-novices notes
          * Rob K., David Dagenhart, Ken Bloom, and others
          * began 3/1, complete 4/1, current task
      2.3.2.2.4 - later: revise note per comments, fill in missing pieces,
                  test tools on all OSs, announce publicly
                  (This will become "maintain EDM-for-novices on all OSs")
          * Rob K., begin 4/1, on-going from then
      2.3.2.2.5 - create programmer reference guide
          * Rob K., begin post-MDC2
      2.3.2.2.6 - maintain programmer reference guide, test on all OSs
          * Rob K., begin after creation, on-going from then
    2.3.2.3 - Build procedures
      2.3.2.3.1 - Fix arch_spec_rootcint.mk to handle indirect depends
          * Rob K., Liz S-K, input from Jim Amundson, began 3/1
          * Liz S-K to test proposed fix by 3/24 after minimal drop
          * I/O list task completed, release by 4/1
          * Release of EDM optimizations is waiting on this
          * Fixing rootcint.mk is Third Top Priority at present.
      2.3.2.3.2 - Adapt official arch_spec_root*.mk to ease transition by
                  some away from RootObj variants of arch_spec files
          * Rob K., input from Pasha Murat, began 3/1
          * Release adaptation after indirect depends fixed
    2.3.2.4 - Optimize EDM2 implementation for sequential I/O
      2.3.2.4.1 - use profiling, benchmark programs on MDC1 data to find
                  more bottlenecks, "slow code", inefficient algorithms
          * Rob K., Philippe Canal, Rick S., Chris G., and others
          * began 3/1 when local MDC-1 data samples used
          * on-going process, need MDC-2 goals to be met
          * Strictest goal appears to be 20 MB/sec in Splitter
      2.3.2.4.2 - rework postread() to use an object id-to-address map
          * Rob K., begin 4/15, complete 4/22
          * Should yield large reduction in CPU during post-read.
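
To make the idea in 2.3.2.4.2 concrete, here is a minimal sketch of an object id-to-address map. It assumes that the post-read step has to turn stored object ids back into in-memory addresses; this WBS does not describe what postread() actually does. The names ObjectId, StorableObject, IdAddressMap and this postread() signature are hypothetical, chosen only for illustration, and are not CDFEDM2 code.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-ins for illustration only; not CDFEDM2 classes.
using ObjectId = std::uint32_t;

struct StorableObject {
    ObjectId id;              // id of this object within the event
    ObjectId linkedId;        // persistent form of a link: the target's id
    StorableObject* linked;   // transient pointer, filled in after reading

    StorableObject(ObjectId i, ObjectId l) : id(i), linkedId(l), linked(nullptr) {}
};

// Maps object ids to addresses so each link is resolved by one hash lookup
// instead of a scan over every object in the event.
class IdAddressMap {
public:
    void add(StorableObject* obj) {
        assert(obj != nullptr);
        table_[obj->id] = obj;
    }

    // Returns nullptr when the id is not present (e.g. the object was dropped).
    StorableObject* resolve(ObjectId id) const {
        auto it = table_.find(id);
        return it == table_.end() ? nullptr : it->second;
    }

private:
    std::unordered_map<ObjectId, StorableObject*> table_;
};

// Assumed post-read step: convert the stored id into an address.
inline void postread(StorableObject& obj, const IdAddressMap& map) {
    obj.linked = map.resolve(obj.linkedId);
}
```

In this sketch the map is filled once per event while the objects are read, and each object's postread() then costs one lookup per link. Whether that matches the bottleneck actually seen in profiling is not something this sketch can confirm.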
      2.3.2.4.3 - use more sophisticated internal event data structure to
                  speed up object selection, iteration, dropping objects
          * Rob K., begin 4/22, complete 5/1
          * Should yield moderate reduction in CPU used.
      2.3.2.4.4 - use memory managers once available
          * Rob K., interest shown by Rick S. also
          * requires ZOOM or Jim K/Marc P to produce memory mgrs
          * desirable to begin testing in late April
      2.3.2.4.5 - consider making postread() calls optional or on-demand
          * Rob K., unknown begin and completion dates
          * this may not be required by Splitter program
    2.3.2.5 - Optimize EDM2 for Data Logger = Write(TBuffer)
        - Provides desired alternative to format XXX without
        - sacrificing rate in Data Logger (writes BLOB to TBranch)
        * Rob K., begin 3/23 when Philippe Canal releases RH5.X
        * version of ROOT with this capability (RH6.1 exists)
        * I expect 1-2 weeks for completion of tests and integration.
        * Write(TBuffer) is First Top Priority at present.
    2.3.2.6 - Optimize EDM2 for Splitter = Read(TBuffer), etc.
        - Allows splitter to read most of event without allocating
        - memory for individual objects in event. Coupled with the use of
        - multi-branch events, this may be needed for MDC-2.
        * Rob K., begin when Philippe Canal/ROOT team releases version
        * of ROOT with this capability. I expect 1-2 weeks of testing,
        * integration after that. More difficult than Write(TBuffer).
        * Desirable, but not required, for MDC-2
        # 3-23-2000: Discussed with Miro. Tentative estimate for
          availability is end of April, depending on delivery of feature
          by ROOT team. He is eager to apply this to the Farms
          Concatenator program, which does not need multiple branches
          (all information needed is external to event), but has a more
          severe I/O requirement since there is only one process treating
          all the data. Adding Read(TBuffer)/Write(TBuffer) functionality
          to the existing Sequential Root event model should yield a
          great improvement in the Concatenator's processing speed.
          Because it is simpler and applies to a more severe I/O
          requirement, this should be tried before exploration of
          multiple branches and the more complicated mixed
          Read(TBuffer)/Read(TBranch), which means reading and
          interpreting one TBranch while only reading into TBuffers the
          byte streams for other TBranches, as may be done in the Farms
          Splitter program.
    2.3.2.7 - Extend EDM2 to support I/O drop/keep lists
      2.3.2.7.1 - initial minimal implementation
          * Liz S-K, began about 3/15, completion 3/24
      2.3.2.7.2 - more general API and sophisticated implementation using
                  optimized internal event data structure
          * Rob K. and Liz S-K, begin sometime after MDC-2,
          * completion in about one week, but depends on completion
          * of RCP integration.
    2.3.2.8 - Extend EDM2 to support object compression in various modes
        - This is a broad topic, including the approach to summary
        - objects, DATSQZ-like approaches, and ROOT branch compression.
        * Rob K., Rick S., and others interested. First need to define
        * an overall plan as to what will be supported. Unknown begin
        * and completion dates. Not needed for, but may be desired by,
        * MDC2. Sometime in early April, we should have Henry Frisch's
        * High-Level objects group, EDM working group, and other
        * interested parties (Pierre, Avi, ...) to flesh out this topic.
    2.3.2.9 - Extend EDM2 to maintain list of class names in data file
        - Class name list is saved in a separate branch in data file.
        * Rob K. May be only a client task of EDM, or may require
        * modifications to the EDM in order to be CPU efficient.
        * Depends on completion of first multi-branch data file class.
        * Should interface with the Production database in a
        * space-efficient fashion. Unknown begin and completion dates.
        * Desirable, but not required, by Run2.
    2.3.2.10 - Extend EDM2 to support templated find and append methods
        - Permits class specific information to be used in find.
        * Rob K., unknown begin and completion dates.
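
As an illustration of what 2.3.2.10 could look like, here is a minimal sketch of templated find and append methods, written in present-day C++. It assumes that "class specific information" means a caller-supplied test on the typed object, which this WBS does not spell out. Event, StorableObject, Track and Jet are hypothetical stand-ins and do not reproduce the CDFEDM2 interfaces.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; not CDFEDM2 classes.
struct StorableObject {
    virtual ~StorableObject() = default;
};
struct Track : StorableObject { int charge = 0; };
struct Jet : StorableObject { double et = 0.0; };

class Event {
public:
    // Templated append: the caller keeps a typed pointer to what it added.
    template <class T>
    T* append(std::unique_ptr<T> obj) {
        T* raw = obj.get();
        objects_.push_back(std::move(obj));
        return raw;
    }

    // Templated find: returns the first object of type T for which the
    // predicate holds, so class-specific members can be used in the search.
    template <class T, class Pred>
    const T* find(Pred pred) const {
        for (const auto& o : objects_) {
            const T* typed = dynamic_cast<const T*>(o.get());
            if (typed != nullptr && pred(*typed)) {
                return typed;
            }
        }
        return nullptr;
    }

private:
    std::vector<std::unique_ptr<StorableObject>> objects_;
};
```

A call such as ev.find<Track>(pred), with pred testing the track's charge, would then return a typed pointer without the caller casting from a generic base class.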
    2.3.2.11 - Extend EDM2 to support "indexedLink" classes which provide
        - persistent pointers to Streamable (rather than Storable)
        - Object classes within Storable Containers
        * Initial prototypes by Armin Reichold and Pierre Savard
        * General case integrated into EDM by Rob K.
        * Begin 3/20, no specific completion date yet.
        * Desirable, but not required, before MDC-2.
    2.3.2.12 - Adapt EDM2 to new RCP system
        - adds ability to drop/find by creating module, parameter set
        * Liz S-K and Rob K., should take only a few days to complete
        * the EDM-specific portion of this task.
    2.3.2.13 - Adapt EDM2 code to use ErrorLogger (rather than std::cerr)
        * Does not require an expert for most of the work,
        * only for final tweaking of error reporting levels.
        * unknown begin date, should only take a few expert days.
    2.3.2.14 - Improve linking to new objects in production output stream
        - General utilities now have no automatic way to recognize
        - new objects put out by production unless hand editing is done.
        * Desirable for Run2
    2.3.2.15 - Improve robustness, close loopholes (esp. collections)
        * Rob K., on-going work
        * A few known cases now may need to be looked into
        * before the examples disappear in development
    2.3.2.16 - Improve error handling, adapting ROOT errors to ErrorLogger
        * Rob K., with some input from ZOOM w.r.t. ErrorLogger code
        * being compatible with ROOT error-handling scheme.
        * Perhaps we can request the ZOOM work begin after ROOT-related
        * optimization work begins to slow down.
        * Desirable, but not required, for MDC-2.
  2.3.3 ROOT Event Model - organization of objects in event on disk
    2.3.3.1 - Implement first sequential event model
        * Rob K., COMPLETED
    2.3.3.2 - Explore usefulness of ROOT data browsing/interaction and
              implications for event model, CDFEDM2 code, and rootcint.
              Is an n-tuple look-and-feel data file possible with CDFEDM2?
        * Liz S-K and Philippe Canal to look into this soon.
        * Not required for MDC-2.
    2.3.3.3 - Propose, implement first multi-branch event model
        * Rob K., begin soon, may take one to two weeks to implement
        * This may be needed by MDC2 to improve rate in Splitter
    2.3.3.4 - Optimize multi-branch event model for production use
        * Rob K., unknown begin date, estimate 2 weeks to implement
        * Not required for MDC-2, should be done before Engineering Run
    2.3.3.5 - Create mechanism for groups to define their own branches
              and class-branch assignments, and record these as meta-data
        * Rob K. and others, with input from physics groups and DH.
        * Not required before Run II begins, but desirable
    2.3.3.6 - Determine how/where we can take advantage of Split-Root files
        * Generally unknown how well we can match our DH system with
        * this approach to treating ROOT data. ROOT team knows about
        * this and is open to suggestions to improve our ability to
        * co-ordinate files containing different pieces of the same
        * events given our DH system may deliver files "out-of-order".
        * Not required before Run II begins, but desirable
  2.3.4 Event Data Format - format of data in objects
    2.3.4.1 - Management and Support
      2.3.4.1.1 - Determine which objects lack support and are needed,
                  when, and by whom
          * Rob K., David Dagenhart, On-line and Off-line managers
          * needed as soon as possible
      2.3.4.1.2 - Co-ordinate definition and implementation of objects
          * Rob K., David Dagenhart, On-line and Off-line managers
          * needed by Engineering Run
      2.3.4.1.3 - Consult on implementation related to CDFEDM2, C++, etc.
          * Rob K., David Dagenhart, Rich Glosson, David Waters,
          * and others. Effort is on-going.
    2.3.4.2 - Header Banks/objects (LRIH, EVCL, etc.)
        * Rob K., Liz S-K, and Rick S.
        * Run I bank classes done, new LRIH done
        * Major overhaul of what was stored in EVCL bank expected;
        * the details have not been worked out, nor a person assigned.
    2.3.4.3 - Raw Data, Calibration, Pedestal Banks (D, C, P banks)
      2.3.4.3.1 - CDF 4152 maintenance and support
          * David Dagenhart, on-going, with slow-down by 5/12
      2.3.4.3.2 - Bank accessor functions
          * David Dagenhart, Rich Glosson, and
          * Simona Rolli (for Trigger banks)
          * began in pre-history, to be completed by 5/12
      2.3.4.3.3 - Bank simulation functions
          - Simplify Simulation's task of filling raw data banks
          - Includes add_block() and commit() methods
          * David Dagenhart has installed constructor(array)
          * Related to adapting Simulation software to use
          * StorableBanks (now use Generic Banks for some banks).
          * unknown who will complete this
      2.3.4.3.4 - Bank channel map functions
          - Map logical ids to readout element or detector element
          - Perhaps should be separated from Bank classes and
          - treated something like Geometry with a database behind it.
          * David Dagenhart documented in CDF 4152
          * Calorimeters use CalorKey, but other detectors?
          * unknown who will implement this
    2.3.4.4 - Trigger Banks (T banks, trigger readout banks)
        * Simona Rolli (L1/L2) and Kevin McFarland (L3)
        * Some work is progressing in parallel with raw data banks
        * For status, see WBS Section 2.12: Trigger Simulation and
        * WBS Section 2.2.5: Level 3 Modules
        * Some needed by MDC-2, others no later than Engineering Run
    2.3.4.5 - Simulation objects (HEPG, OBSP, OBSV, M banks)
        * Pasha Murat
        * First version of nearly all Simulation ('M' banks) now
        * consistent with Offline.
        * Missing StorableBank classes for MCOT and MVTX.
        * For status, see WBS Section 2.11: Detector Simulation
        * Needed by when???
    2.3.4.6 - Reconstruction objects (Track, Jet, Electron, Photon, Muon)
        * Rick S., Pierre S., Muon Group
        * Most already exist, improvement and support on-going
        * For status, see WBS Section 2.2: Event Reconstruction
        * Needed now for algorithm development
    2.3.4.7 - Higher-Level Physics objects
        * Henry Frisch and Simona Rolli
        * First prototypes appearing in Offline.
        * For status, see WBS Section ???
        * Needed into Run2