Revised EDM Model Notes
10-11 February 1999
Revised: 16 February 1999
Rob Kennedy

The base class for all "storable" objects is StorableObject. A StorableObject
contains a StorableLink, which consists of a StorableObjectName (string) and a
StorableObjectNumber (int). StorableLink is similar to the TRY_Bank_Key class.

(Sidebar::: D0 EDM roles: Chunk ~~ StorableObject, ChunkID ~~ StorableLink.
I chose "StorableXXX" for names since most people do not know what a Chunk is,
or whether a particular class is a kind of Chunk. We can worry about the length
of the resulting class names later. One draft of this used the name Hunk
instead of StorableObject, but I think the result was somewhat non-intuitive
text and code. I avoided using "StorableKey" because some groups prefer the
search keys used to find objects in the event record to be more elaborate than
(name, number).)

The StorableObjectName uniquely identifies the class of the StorableObject.
Once stored, the StorableObjectNumber is unique for the given
StorableObjectName within the EventRecord in which it is stored, for the
lifetime of that EventRecord. If the object is not stored, the
StorableObjectNumber is undefined.

A StorableObject can contain as data members: char, uchar, short, ushort, int,
uint, float, double, bool, string, and any kind of StorableObject. Support for
enums and complex<> is available only on approved request.

A StorableObject cannot contain pointers. Instead, pointers are replaced by
StorableLinks.

A StorableObject cannot contain static data members which affect its logical
state.

A StorableObject must contain the following methods:
  a) is_persistable()
  b) stream_read()  = read persistent format, translate to storable format
  c) stream_write() = translate to persistent format, write persistent format
  d) conversion constructors = translate transient to storable format

-----------------------------------

An EventRecord is a StorableRecord with some added data members for
bookkeeping and internal maintenance.
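As an illustration of the bookkeeping this implies, here is a minimal sketch of how StorableObjectNumbers could be handed out so that each (name, number) pair is unique within one record. The names LinkAllocator, assign() and kInvalidNumber are hypothetical, as are the choices to start numbering at 0 and to encode "undefined" as -1; none of these is fixed by the notes.

```cpp
#include <map>
#include <string>

// Assumed encoding of the "undefined" number of an object not yet stored.
const int kInvalidNumber = -1;

// Hypothetical reduction of StorableLink to its two parts.
struct StorableLink {
    std::string name;    // StorableObjectName: identifies the class
    int         number;  // StorableObjectNumber: unique per name once stored
};

// Hypothetical bookkeeping helper: hands out numbers per name and never
// reuses one for the lifetime of the allocator (i.e. of the record).
class LinkAllocator {
public:
    StorableLink assign(const std::string& name) {
        StorableLink link;
        link.name   = name;
        // std::map::operator[] value-initializes a missing counter to 0.
        link.number = _next_number[name]++ ;
        return link;
    }
private:
    std::map<std::string, int> _next_number;
};
```

With this scheme two objects of the same class never share a link, while objects of different classes may carry the same number.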
A StorableRecord is a kind of StorableObject. It is an ordered collection with
random-access retrieval and deletion, but insertion at end only. No duplication
of StorableLinks (associated with the contained StorableObjects) is permitted.

The StorableRecord design is driven by backwards compatibility with existing
software. Ideally, the StorableRecord would be an unordered collection (a set)
to decouple its structure from assumptions made in algorithms. We have too many
algorithms, however, which assume that objects in the event will remain in the
order in which they were inserted into the event. It may take a substantial
retrofit to change these algorithms to use an unordered collection of objects.

All StorableObjects in the EventRecord are read-only for clients.
StorableObjects outside of the EventRecord are read-write. Since
StorableCollections are simply specializations of StorableObject, their
read/write status is treated in the same manner. That is, collections and the
elements of the collection all become read-only once they are stored.

Input filtering: The entire EventRecord (at least the selected branches) is
read in. Then all objects and classes of objects referred to in the ``drop
upon input'' ObjectList are removed from the EventRecord by AC++. The ``class
input'' ObjectList is then created, which lists all the classes which were
input.

Output filtering: Only objects referred to in the ``object output'' ObjectList
are written to a particular stream. Each output stream has its own such list.
Each list is initialized with the ``class input'' ObjectList. References to
objects or classes of objects in a ``drop upon output'' list are then removed
from the ``object output'' list. References to objects or classes of objects
in a ``keep upon output'' list are then added to the ``object output'' list.
Only those objects or classes of objects that remain in that list are written
to the output stream.
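The output filtering rule above amounts to three set operations applied in a fixed order. A minimal sketch follows, assuming an ObjectList can be reduced to a set of class names (the real lists may also refer to individual objects); the function name make_output_list is hypothetical.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for ObjectList: just a set of class names.
typedef std::set<std::string> ObjectList;

// Build the "object output" list for one output stream:
//   1. start from the "class input" list,
//   2. remove everything named in the "drop upon output" list,
//   3. add everything named in the "keep upon output" list.
ObjectList make_output_list(const ObjectList& class_input,
                            const ObjectList& drop_on_output,
                            const ObjectList& keep_on_output)
{
    ObjectList output(class_input);
    for (ObjectList::const_iterator it = drop_on_output.begin();
         it != drop_on_output.end(); ++it) {
        output.erase(*it);   // erasing a name that is absent is harmless
    }
    output.insert(keep_on_output.begin(), keep_on_output.end());
    return output;
}
```

Because the keep list is applied after the drop list, a name that appears in both ends up in the output list. A class that was not input but is created during processing reaches the stream only through the keep list.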
Users can manipulate the content of these ObjectLists, but only I/O modules
perform the filtering.

-----------------------------------

EDM classes
-----------

StorableObject              the base class for all Storables
StorableLink                (name,number), similar to TRY_Bank_Key
StorableVector              like std::vector<>
StorableList                like std::list<>
StorableSet                 like std::set<>
StorableRecord              collection to support EventRecord
StorableCollectionSummary   contains a StorableVector
StorableAssociationSet      contains a set of StorableLink
StorableObjectList          contains StorableNames and StorableLinks
EventRecord                 contains a StorableRecord and a hashtable from
                            which to assign StorableLinks

Consider: StorableMultiset<>, StorableMap<>, StorableMultimap<>,
          StorableHashTable<>

---------------------------------------

Transition Issues
-----------------

1) Bank number values are currently defined by the user and can be re-used.
   Some code explicitly expects to find its favorite bank with a specific bank
   number. No code now expects to receive the bank number as a return value
   from an insert().

2) Some of the policies and examples would require a fair amount of redesign
   of algorithms or of classes. The amount and value of immediate adaptation
   versus gradual transition need to be determined.

3) <<>>

---------------------------------------

Situations to describe below:

1) Self-contained object
   a) Storable as-is
   b) Uses adapter class
2) Object with associations
   a) Unidirectional association
   b) Bidirectional association
3) Collection of objects
   a) Homogeneous collections
   b) Heterogeneous collections
4) Choice of depth of object collections

------

1a) TotalMissingE is a (hypothetical) class that does derive from
StorableObject. It contains a few data members implemented by builtin data
types, and so is compatible with the restrictions on StorableObjects. The user
can create the object, store it in EventRecord, find it in EventRecord, and so
on without having to perform a format translation.
We might consider recommending a naming convention to make it obvious to users
of the class that no transient to/from storable format translation is
required. Perhaps StorableTotalMissingE is too long a name, though.

// Create and Store a StorableObject
// =================================
bool event_AAA(EventRecord* p_record) {
    float Et_mag = 0.0 ;
    float Et_eta = 0.0 ;
    float Et_phi = 0.0 ;
    TotalMissingE myMissECalc(Et_mag, Et_eta, Et_phi) ;
    // Note: Add an argument identifying *who* created this StorableObject
    // Note: Add an argument identifying *what* were parameters used
    StorableLink myMissEID = p_record->insert(myMissECalc) ;
    if (myMissEID.is_invalid()) { /* ERROR: */ }
    return(true) ;
}

// Retrieve a StorableObject
// =================================
bool event_BBB(EventRecord* p_record) {
    RecordIterator otherIter = p_record->find_first("TotalMissingE") ;
    if (otherIter.is_invalid()) { /* ERROR: object is not in event */ }
    TotalMissingE otherMissECalc(otherIter) ;   // OBJECT MARKED AS READ-ONLY
    StorableLink otherMissEID = otherMissECalc.storableLink() ; // unnecessary
    float Et_mag = otherMissECalc.et_mag() ;
    float Et_eta = otherMissECalc.et_eta() ;
    float Et_phi = otherMissECalc.et_phi() ;
    otherMissECalc.set_et_mag(9.0) ;            // ERROR: object is read-only
    return(true) ;
}

// Update a StorableObject
// =================================
bool event_CCC(EventRecord* p_record) {
    RecordIterator otherIter = p_record->find_first("TotalMissingE") ;
    if (otherIter.is_invalid()) { /* ERROR: object is not in event */ }
    TotalMissingE otherMissECalc(otherIter) ;   // OBJECT MARKED AS READ-ONLY
    StorableLink otherMissEID = otherMissECalc.storableLink() ;
    TotalMissingE myMissECalc(otherMissECalc) ; // read-write copy
    // Note: Add an argument identifying *who* created this StorableObject
    // Note: Add an argument identifying *what* were parameters used
    myMissECalc.set_et_mag(9.0) ;
    myMissECalc.set_et_eta(1.0) ;
    myMissECalc.set_et_phi(3.0) ;
    StorableLink myMissEID = p_record->insert(myMissECalc) ;
    // myMissEID != otherMissEID
    if (myMissEID.is_invalid()) { /* ERROR: insert failed */ }
    return(true) ;
}

// Delete a StorableObject
// =================================
// Only AC++ actually deletes objects. A user can add a particular object or
// a class of objects to a ``drop on output'' object list.
// How do we adapt to this change?

IMPLEMENTATION NOTES
====================

a)  TotalMissingE myMissECalc(Et_mag, Et_eta, Et_phi) ;
    StorableLink myMissEID = p_record->insert(myMissECalc) ;

The EventRecord is a StorableSet. Since we are not using a handle layer in
StorableObjects, however, I cannot easily use reference counting on
StorableObjects to minimize the memory allocation and data copying associated
with storing an object. The insert() will allocate and construct a new
TotalMissingE instance using myMissECalc to initialize it. A pointer to the
new instance will then be put into the StorableSet. Users may be able to make
use of reference counting themselves, but I do not see a grand infrastructure
solution.

b)  RecordIterator otherIter = p_record->find_first("TotalMissingE") ;
    TotalMissingE otherMissECalc(otherIter) ;   // OBJECT MARKED AS READ-ONLY
    otherMissECalc.set_et_mag(9.0) ;            // ERROR: object is read-only

The find_first operation will, at the primitive level, return a pointer to a
StorableObject, which I call RecordIterator here. This corresponds to the int*
in use in Trybos at present. Note that the constructor is intended to refer to
an object that is stored elsewhere. I use the int* pointer now with a handle
jacket to accomplish this... which I lack in this model. I need to copy the
data member values out of the stored object and into a heap object. This
implements the read-only "feature" in a sense, but we need to lock the heap
object anyway since existing code assumes the stored object is what will be
altered by the set method, not a copy on the heap.
c)  // Update a StorableObject
    // =================================
    bool event_CCC(EventRecord* p_record) {
        RecordIterator otherIter = p_record->find_first("TotalMissingE") ;
        if (otherIter.is_invalid()) { /* ERROR: object is not in event */ }
        TotalMissingE otherMissECalc(otherIter) ; // OBJECT MARKED AS READ-ONLY
        StorableLink otherMissEID = otherMissECalc.storableLink() ;
        TotalMissingE myMissECalc(otherMissECalc) ;
        // Note: Add an argument identifying *who* created this StorableObject
        // Note: Add an argument identifying *what* were parameters used
        myMissECalc.set_et_mag(9.0) ;
        myMissECalc.set_et_eta(1.0) ;
        myMissECalc.set_et_phi(3.0) ;
        StorableLink myMissEID = p_record->insert(myMissECalc) ;
        // myMissEID != otherMissEID
        if (myMissEID.is_invalid()) { /* ERROR: insert failed */ }
        return(true) ;
    }

1b) RecordHeader is a class that does not derive from StorableObject. An
adapter class, StorableRecordHeader, publicly derives from both RecordHeader
and StorableObject. The developer must integrate the translations between
RecordHeader and StorableRecordHeader. The user must cause the translations to
occur in constructors, assign methods, etc. Note that one can create a
StorableObject on the heap.
// Create and Store a non-Storable Object
// ======================================
bool event_AAA(EventRecord* p_record) {
    RecordHeader myHeader ;              // Create non-storable object
    myHeader.set_run_number(12345) ;     // and then set its data values
    myHeader.set_event_number(79) ;
    myHeader.set_experiment_type(EventHeaderConstants::run2_monte_carlo) ;
    StorableRecordHeader myStoreHeader(myHeader) ; // TRANSLATE CLASS AND FORMAT
    // Note: Add an argument identifying *who* created this StorableObject
    // Note: Add an argument identifying *what* were parameters used
    StorableLink myStoreHeaderID = p_record->insert(myStoreHeader) ;
    if (myStoreHeaderID.is_invalid()) { /* ERROR: */ }
    return(true) ;
}

// Retrieve a StorableObject
// =================================
bool event_BBB(EventRecord* p_record) {
    RecordIterator otherIter = p_record->find_first("RecordHeader") ;
    if (otherIter.is_invalid()) { /* ERROR: object is not in event */ }
    StorableRecordHeader otherStoreHeader(otherIter) ; // OBJECT MARKED AS READ-ONLY
    StorableLink otherHeaderID = otherStoreHeader.storableLink() ;
    int run_number      = otherStoreHeader.run_number() ;
    int event_number    = otherStoreHeader.event_number() ;
    int experiment_type = otherStoreHeader.experiment_type() ;
    otherStoreHeader.set_run_number(9) ;   // ERROR: object is read-only
    return(true) ;
}

// Update a StorableObject
// =================================
bool event_CCC(EventRecord* p_record) {
    RecordIterator otherIter = p_record->find_first("RecordHeader") ;
    if (otherIter.is_invalid()) { /* ERROR: object is not in event */ }
    StorableRecordHeader otherStoreHeader(otherIter) ; // OBJECT MARKED AS READ-ONLY
    StorableLink otherHeaderID = otherStoreHeader.storableLink() ;
    RecordHeader myHeader(otherStoreHeader) ;  // TRANSLATE TO TRANSIENT CLASS
    myHeader.set_run_number(12345) ;
    myHeader.set_event_number(79) ;
    myHeader.set_experiment_type(EventHeaderConstants::run2_monte_carlo) ;
    StorableRecordHeader myStoreHeader(myHeader) ;  // AVOID set METHODS
    // Note: Add an argument identifying *who* created this StorableObject
    // Note: Add an argument identifying *what* were parameters used
    StorableLink myHeaderID = p_record->insert(myStoreHeader) ;
    // myHeaderID != otherHeaderID
    if (myHeaderID.is_invalid()) { /* ERROR: insert failed */ }
    return(true) ;
}

// Delete a StorableObject
// =================================
// Only AC++ actually deletes objects. A user can add a particular object or
// a class of objects to a ``drop on output'' object list.
// How do we adapt to this change?

2a) Suppose Child contains a unidirectional association to Parent, which is
implemented as a pointer. StorableChild implements this association as a link
to StorableParent. A link contains the StorableLink name of the object class
and a StorableLink number unique to that name.

When StorableChild is created from a Child, what does it do about the pointer
in Child? If Parent has not already been stored as a StorableParent, then
there is no StorableLink to store in StorableChild. The module which has
transient Child and Parent objects must translate to and store the
StorableParent object first, put its StorableLink into the StorableChild, and
then store the StorableChild.

Suppose Child contains a unidirectional association to Parent, but that
association is considered unnecessary for StorableChild. Some other means of
fully restoring the state of the Child using the StorableChild data must be
defined by the user. If it is desirable, one can store the association in a
separate object, StorableChildParentAssociation, which can be used if required
to restore the association in Child. If some means of restoring the
association is not defined, the developer should remove the association from
Child.

IMPLEMENTATION NOTES
====================

i) Create and Store

    Parent aParent(...) ;
    Child aChild(...) ;   // NULL link to parent, could take &aParent arg
    aChild.setPointer(&aParent) ;            // state now fully defined

    StorableParent aStorableParent(aParent) ;
    StorableChild aStorableChild(aChild) ;   // no link to parent

    StorableLink parentLink = p_record->insert(aStorableParent) ;
    aStorableChild.setLink(parentLink) ;
    StorableLink childLink = p_record->insert(aStorableChild) ;

ii) Retrieve

    RecordIterator childIter = p_record->find_first("Child") ;
    StorableChild aStoredChild(childIter) ;     // OBJECT MARKED AS READ-ONLY
    StorableLink parentLink = aStoredChild.parentLink() ;

    RecordIterator parentIter = p_record->find(parentLink) ;
    StorableParent aStoredParent(parentIter) ;  // OBJECT MARKED AS READ-ONLY

    Parent aParent(aStoredParent) ;  // Transient parent instance
    Child aChild(aStoredChild) ;     // Parent pointer left NULL inside aChild
    aChild.setPointer(&aParent) ;    // Parent pointer restored inside aChild
    // OR
    Child aChild(aStoredChild, &aParent) ;  // Parent pointer restored

This shows the process to restore an association (set the correct value for a
pointer data member). This cannot be done automatically since the transient
instance of Parent cannot be "found" using the parentLink. The code author
must generate the transient instance of Parent, then inform the transient
instance of Child where the Parent is. Note that this permits the code author
to avoid restoring the association if it is not needed. In this case, the
instance of Child is not fully restored to its former state.

2b) Suppose Brother and Sister contain a bidirectional association. Each has a
pointer to the other. Either this association is not stored (cannot store
circular link chains), or the association is captured in its own object. The
storable classes are StorableBrother, StorableSister, and
StorableAssociationSet.

IMPLEMENTATION NOTES
====================

i) Create and Store

    Brother aBrother(...) ;            // NULL link to sister
    Sister aSister(...) ;              // NULL link to brother
    aBrother.setPointer(&aSister) ;    // state now fully defined
    aSister.setPointer(&aBrother) ;    // state now fully defined

    StorableBrother aStorableBrother(aBrother) ;  // no link to sister
    StorableSister aStorableSister(aSister) ;     // no link to brother

    StorableLink brotherLink = p_record->insert(aStorableBrother) ;
    StorableLink sisterLink = p_record->insert(aStorableSister) ;

    StorableAssociationSet relationship(brotherLink, sisterLink) ;
    StorableLink relationshipLink = p_record->insert(relationship) ;

ii) Retrieve

    RecordIterator relationshipIter =
        p_record->find_first("StorableAssociationSet") ;
    StorableAssociationSet relationship(relationshipIter) ;  // READ-ONLY
    StorableLink brotherLink = relationship.brotherLink() ;
    StorableLink sisterLink = relationship.sisterLink() ;

    RecordIterator brotherIter = p_record->find(brotherLink) ;
    RecordIterator sisterIter = p_record->find(sisterLink) ;
    StorableBrother aStorableBrother(brotherIter) ;  // READ-ONLY
    StorableSister aStorableSister(sisterIter) ;     // READ-ONLY

    Brother aBrother(aStorableBrother) ;  // NULL link to sister
    Sister aSister(aStorableSister) ;     // NULL link to brother
    aBrother.setPointer(&aSister) ;       // state now fully restored
    aSister.setPointer(&aBrother) ;       // state now fully restored

3a) Suppose a class StorableMuon is derived from StorableObject, and the user
is working with a homogeneous collection of these called MuonSet. To be
storable itself, MuonSet must derive from StorableObject. Let us assume that
MuonSet derives from StorableSet in order to re-use its functionality. For all
practical purposes, then, the MuonSet is treated as just another
StorableObject, except that there is data structure functionality added to its
interface.

IMPLEMENTATION NOTES
====================

The user code to store and retrieve a homogeneous collection would look
nominally just like the code to treat a simple object. The implementation
would be similar to storing associations.
A collection summary object would be created and stored after all the
collection elements were stored, and it would be regenerated before the
collection elements were retrieved.

There are some complications if the elements of the collection contain
associations themselves. Track is an element of the collection TrackSet. Each
Track contains an association with a lower-level object SegmentSet.

IMPLEMENTATION NOTES
====================

i) Create and Store... general case

    TrackSet aTrackSet(...) ;   // Assume it is already fully initialized
    StorableTrackSet aStorableTrackSet(aTrackSet) ;

    // This would lead to creating a complete new copy of the collection, with
    // each Track copied into a StorableTrack. This could be very
    // time-consuming.

    // Implementing the above constructor:
    _the_set.clear() ;
    TrackSetIter current = aTrackSet.begin() ;
    while (current != aTrackSet.end()) {
        Track aTrack = current.value() ;
        StorableTrack aStorableTrack(aTrack) ;
        // ===> Source implementing above line
        //      {
        //        // copy simple data members
        //        // leave links in "null" state
        //      }
        // ===>
        StorableSegmentSet tempSegmentSet(aTrack.p_SegmentSet) ;
        StorableLink aSegmentSetLink = p_record->insert(tempSegmentSet) ;
        aStorableTrack.setLink(aSegmentSetLink) ;
        _the_set.insert(aStorableTrack) ;
        ++current ;
    }

Thus, to fully capture the state of the TrackSet, the developer must follow
all pointers, transform and store the pointed-to object, then set the link in
the storable form of the pointing object. Clearly this may lead to significant
duplication and waste. A little cleverness can avoid much of this, relative to
this general case.

ii) Minimize storage among overlapping collections

The HitList class is a collection of Hits. Hits are fairly large objects. Each
Track is associated with a HitList collection. There is significant overlap in
the Hits contained in the HitLists. We want to minimize storage requirements
for a TrackSet.
We do not want to store multiple copies of the same Hit while we store each
Track's HitList. This leads to treating the HitList as a reference list rather
than a value list. All values are stored once in a "grand" HitList. HitLists
for individual Tracks are then stored as StorableCollectionSummary objects
referring to the Hits.

*) Store the "grand" HitList collection in the EventRecord, and create a
   mapping of StorableLinks of the Hits in the collection to/from Hit
   addresses or indices, depending on how individual Hits are referenced by
   the HitList.

*) While converting each Track to storable form, create and store a
   StorableCollectionSummary containing the StorableLinks of the Hits in this
   Track's HitList. Store the StorableLink of this summary in the
   StorableTrack.

*) Restoring the TrackSet, Tracks, HitLists, and Hits is roughly the reverse
   of this procedure.

3b) There is no direct support for heterogeneous collections. The user must
save and restore the collection elements individually and must save and
restore a collection summary object containing links to the members of the
collection. However, we might consider supplying a StorableCollectionSummary
class with data members: number of links, "vector" of links.

4) It is a matter of choice what "depth" of object collections to use,
depending on the algorithm in use and the interests of the module writer. We
usually work with objects in homogeneous collections (a TrackSet, a TowerSet),
and in some cases with single objects (RecordHeader). In some cases, treating
objects in tightly coupled heterogeneous collections is appropriate
(DecayTree).

One should avoid using the EventRecord as the ordered container of multiple
objects as was done in Run 1 (a loop over electron banks used a record
iterator over ELES banks instead of a specialized ElectronListIterator over an
ElectronList). Collections should be supported with a Storable collection
class, both for the sake of efficiency and to decouple the object collections
from the event record design.
At the same time, one should avoid excessive depth in collections. A collection such as SetOfSetsOfThings may be a convenience to someone, but there is likely to be some storage space and/or CPU time overhead to capture the relationship implied by each level of a collection. The question to ask is, "are the associations implied by this extra level of collection worth the storage space and CPU?".
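Returning to case 3a) ii), the reference-list treatment of overlapping HitLists can be sketched as follows. This is only an illustration under simplifying assumptions: Hit is reduced to two floats, HitStore stands in for the part of the EventRecord that stores Hits and assigns links, a HitList refers to Hits by address, and StorableCollectionSummary is reduced to a vector of links. The names HitStore, HitLinkMap, store_grand_hit_list and summarize are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical reductions of the classes named in the notes.
struct StorableLink { std::string name ; int number ; };
struct Hit { float x ; float y ; };
typedef std::vector<const Hit*>          HitList ;  // refers to shared Hits
typedef std::vector<StorableLink>        StorableCollectionSummary ;
typedef std::map<const Hit*, StorableLink> HitLinkMap ;

// Stand-in for the part of the EventRecord that stores Hits by value.
class HitStore {
public:
    StorableLink insert(const Hit& hit) {
        _hits.push_back(hit) ;
        StorableLink link ;
        link.name   = "Hit" ;
        link.number = static_cast<int>(_hits.size()) - 1 ;
        return link ;
    }
    std::size_t size() const { return _hits.size() ; }
private:
    std::vector<Hit> _hits ;
};

// Step 1: store every Hit of the "grand" HitList exactly once and remember
// which link each Hit address received.
HitLinkMap store_grand_hit_list(const std::vector<Hit>& grand, HitStore& store)
{
    HitLinkMap links ;
    for (std::size_t i = 0 ; i < grand.size() ; ++i) {
        links[&grand[i]] = store.insert(grand[i]) ;
    }
    return links ;
}

// Step 2: describe one Track's HitList by links only, with no Hit copies.
// Hits missing from the map are skipped in this sketch.
StorableCollectionSummary summarize(const HitList& hits, const HitLinkMap& links)
{
    StorableCollectionSummary summary ;
    for (std::size_t i = 0 ; i < hits.size() ; ++i) {
        HitLinkMap::const_iterator found = links.find(hits[i]) ;
        if (found != links.end()) { summary.push_back(found->second) ; }
    }
    return summary ;
}
```

Two Tracks that share a Hit then carry the same link in their summaries, while the Hit itself is stored only once. The address-to-link map is valid only as long as the grand HitList is not reallocated.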