Minutes of the Event Data Model Working Group Meeting
02 February 1999
Rob Kennedy, for the CDF Run II Event Data Model Working Group

Attending: Rob Kennedy, Pasha Murat, Dmitri Litvintsev, Rick Snider, Liz Sexton-Kennedy, Chris Green, Jim Kowalkowski, Jim Loken
By Video: Mark Lancaster, Paolo Calafiura
By Phone: Betsy Hafen

I) ROOT Output Module - Pasha Murat

The following is an ASCII transliteration of Pasha's slides, followed by some of the associated discussion.

-----------------------------------------
1) ROOT Output Module For AC++
   P. Murat
   Feb 02 1999

 - Requests for Output Module Writing Out MultiBranch Trees:
     - OSU (several months ago)
     - Online Consumers
     - Top Group (Vecbos Pre-RunII production)
 - Code: RootMods/TRootOutputModule.cc + References Therein

-----------------------------------------
2) Dealing with the TRYBOS Banks
   -----------------------------

 - TRYBOS_BANK: public TRY_Generic_Bank

     +----------------------------+
     | TObject | TRY_Generic_Bank |
     +----------------------------+

   (RDK: This is done in part to gain access to protected data in TRY_Generic_Bank, p_type)

 - Add such an object to the event in a regular way
 - On output: Stream it out
     2 Strategies
     - As a Generic Bank Monotype I4    Done
     - Overload Streamer for each Named Bank ...
     Mix 2 strategies above
 - On input: Read it in,
     - If TRYBOS record is defined, create YBOS part of in [sic] on the record
     - If not, on the Head

-----------------------------------------
3) TRootOutputModule: public APPFileOutputModule

                   +--------+
                   | Module |
                   +--------+

     +-------+     +-------+         +-------+
     |Stream1| --- |Stream2|   ...   |Stream3|
     +-------+     +-------+         +-------+
         |             |                 |
     +-------+     +-------+         +-------+
     | File 1|     | File 2|         | File 3|
     +-------+     +-------+         +-------+

 - The same list of streams
 - Streams are different from APPStream, but not much

-----------------------------------------
4) TRootStream: public APPStream

     TFile* fFile ;
     TTree* fTree ;
     ------
     GetFile() ;
     GetTree() ;
     CreateTree() ;
     TRootStream(char* name, char* filename)

 - How does the Stream get initialized?
     - Filename - AC++ Tools
     - Tree: Use default constructor, add branches on the fly
 - 1st Implementation: All the output streams have the same Tree structure

-----------------------------------------
5) Each "Worker" Module

 - Has a list of NAMED output objects (PREDEFINED)
 - In the end of the event entry point, module loops over the output objects and adds them to the Event

     +------------------------------+
     | Add Object "A" to Branch "B" |
     +------------------------------+

 - In the beginning of the job, User talks to the module and defines (if necessary) which object goes to which branch
 - There should be reasonable defaults

-----------------------------------------
6)
     +---------------+
     |     Event     |-------------+
     +---------------+             |
            |                      |               +-------------+
            V                      |               | Output List |
     +-------------+               |               +-------------+
     | List of all |               |                      |
     | the objects |               |                      |
     +-------------+               |  +--------+         /
                                      | Branch |        /
                                      | Record |  +----/
                                      +--------+  V
                                          |      +----------+
                                          |      |  Trybos  |
                                      +--------+ |  Record  |
                                      | Branch | +----------+
                                      | Record |
                                      +--------+
                                        . . .
                                      +--------+
                                      | Branch |
                                      | Record |
                                      +--------+

 - At Begin Run (Job?) Output Module looks at the output list and for each stream initializes its Tree (adds Branches to it)
 - How the output list gets filled?

-----------------------------------------
7)
     +-----------------------+
     |     Branch Record     |
     +-----------------------+

     +---------+   +----------------------+
     | Event # |   |    List of Things    |
     | Run #   |   |                      |
     +---------+   +----------------------+
                              |
                           +-----+
                           |  A  |
                           +-----+
                              |
                           +-----+ +-----+ +-----+ +-----+
                           |  B  |-| B1  |-| B2  |-| B3  |...
                           +-----+ +-----+ +-----+ +-----+
                              |
                           +-----+
                           |  C  |
                           +-----+

     "B" is a list of things itself

-----------------------------------------
8) Conclusions
   -----------

 - We have code for ROOT output module, which writes out ROOT Trees
 - Trees are configurable at Run-time
 - Structure of Branches is also configurable at Run-time
 - TRYBOS Banks are handled transparently, so no modification of user code is necessary

     +-------------------------------------+
     | Completely Transparent for the User |
     +-------------------------------------+

 - Input module is on its way
 - Working with the "online consumers" (RDK: Kaori Maeshima and Hans Wenzel)
 - Write out VECBOS output with VECBOS configuration

-----------------------------------------

Associated ROOT Output Module Discussion:

The scripting involved in this prototype uses cint as the interpreter. Pasha indicated after the meeting that this was simply a convenience choice for prototype work, and that using AC++ and TCL would be feasible as well.

One of the main points of agreement reached was that we need to consider carefully the new degrees of freedom introduced by the ROOT Tree/Branch structure. We do not want to present users with too much flexibility (Branch organization of objects becomes chaotic), but we do want to permit some run-time configurability for the sake of unanticipated user requirements.

RDK stated that he had some interest in Pasha's work as a possible early prototype for the composite record class which supports Banks and more general objects. In this case, the C++ objects that would be storable would in fact be ROOT objects. If rootcint (or d0cint, with caveats) were to progress to the level of handling our CDFTrack and CDFTrackCollection classes cleanly, then we might prefer that approach to developing persistent representations using auxiliary files (DDL input) manually kept in synchronization with C++ headers and developing generator software.
In the meantime, the recipe can be developed for generating the ROOT functions for persistence by hand or translating pieces of C++ rootcint does not understand. Philippe Canal and RDK are discussing the tests needed to prove this can be done with an acceptable transition path.

Chris stated that he was concerned that we might be buying into more of ROOT in our infrastructure than originally planned or approved, more of ROOT which would have to be maintained in an optimized, robust state. All generally agreed this was a concern, with whatever implementation we choose for storage of more general objects, but especially with ROOT where we (via PasFrg) have concerns about ROOT's modularity. In this particular application, though, RDK felt that we would not be requiring the use of rootcint in order to read/write our data, only making it available as a tool to generate the Streamer and associated methods that support a persistent representation. There is still some exploratory work to be done to be sure that rootcint can be separated out as a mere tool rather than as a required element of object I/O.

II) General Discussion

RDK said he would make snapshots of the EDM Upgrade Proposal available every few days, and that these would not be polished at first. Betsy mentioned that she knew of several grammatical and typographical errors in the posted version, and volunteered to help proofread the drafts as they become available when she arrives at FNAL next week. RDK invited any contributions folks would be willing to make.

RDK indicated that, after considering Al Lee's comments by e-mail, he would include treatment in the proposal of storing objects other than Banks in the record, even though there is no specific design sketch for how to do this yet. (One is expected soon based on a composite record, but perhaps not before this proposal will be complete.)
In several instances, as RDK has written up use-cases for the upgrades related to bank number uniqueness and robustness, he found that unexpected consequences resulted which should be documented publicly as soon as reasonably possible in order to generate discussion. For instance, if bank numbers are to be unique for a particular bank name throughout the record's lifetime, then the record must dole out the bank numbers. If a user then creates a bank on the heap, there can be no meaningful bank number assigned since the record is not consulted. Adding the record as an argument to the on-heap bank constructor makes the constructor look just like the in-record constructor. There is nothing fatal about this consequence, but it does require more thought to ensure that users are given a clear set of instructions, and that the transition does not lead to significant code changes.

The question was raised whether we can permit some objects to be storable (passed between modules in the event record), but not persistable. Clearly we do not want this to be permitted in general since we require that a complete snapshot of the event record can be saved to disk at any module boundary. Realistically, though, while we are in transition to the new EDM, we might permit some objects to be storable but not persistable, so long as there is a commitment to make them persistable within a reasonable timeframe. Each object will have methods is_storable() and is_persistable(). Managers can then quickly and automatically list which objects in the event record are not yet persistable by inserting a little audit module at each module boundary in the production executable. This audit module would list each non-persistable class of object found and the module boundary at which it was first found to occur in the event record.
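
As an illustration only, the audit idea might be sketched in C++ roughly as follows. Only the method names is_storable() and is_persistable() come from the discussion above; StorableObject, TrackCollection, ScratchHits, EventRecord and PersistencyAudit are hypothetical names invented for this sketch, and the event record is reduced to a plain list of pointers rather than the real EDM record.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical base class for anything placed in the event record.
// Only is_storable() and is_persistable() are named in the minutes.
class StorableObject {
public:
    virtual ~StorableObject() = default;
    virtual std::string class_name() const = 0;
    virtual bool is_storable() const { return true; }
    virtual bool is_persistable() const = 0;
};

// Hypothetical class that already has a persistent representation.
class TrackCollection : public StorableObject {
public:
    std::string class_name() const override { return "TrackCollection"; }
    bool is_persistable() const override { return true; }
};

// Hypothetical transitional class: storable, but not yet persistable.
class ScratchHits : public StorableObject {
public:
    std::string class_name() const override { return "ScratchHits"; }
    bool is_persistable() const override { return false; }
};

// Stand-in for the event record: a list of non-owning pointers (assumption).
using EventRecord = std::vector<const StorableObject*>;

// Audit helper meant to be called at each module boundary.
class PersistencyAudit {
public:
    // 'boundary' names the module that has just finished running.
    void check(const std::string& boundary, const EventRecord& record) {
        for (const StorableObject* obj : record) {
            assert(obj != nullptr);
            if (obj->is_persistable()) continue;
            // emplace() leaves an existing entry untouched, so only the
            // first boundary at which a class appears is remembered.
            first_seen_.emplace(obj->class_name(), boundary);
        }
    }

    // Maps each non-persistable class name to the boundary where it first appeared.
    const std::map<std::string, std::string>& report() const { return first_seen_; }

private:
    std::map<std::string, std::string> first_seen_;
};
```

In this sketch a production job would call check() once after each module, and a manager would read report() at the end of the job to see which classes still lack a persistent representation and where they first entered the record.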
In a similar vein, it was suggested that we require, as a transitional compromise, that all singletons being used to pass event data between modules must have the same method name to return a pointer to an instance, the consensus choice being ``instance()''. This will help us create an audit of how many known singletons are still in use. It is important to note that we cannot discover singletons by any simple means, and must rely on detailed surveys to uncover all that are in use. We would only be able to track already known singletons this way.

.the end.