New Reco Framework Wishlist with Comments

Link to Sergei's version

Item
Topic
Description
Priority
Comments
Supported?
1
Transparency
Minimize inheritance and overloading


This goes along very well with ABCs. Remember: no implementation means nothing to inherit and/or override (I think Jin meant overriding, not overloading, here).
2
Transparency Candidates to be infinitely modifiable until written to disk - No cloning


Yes - new reco objects are modifiable. Some decisions have to be made about what can be modified and when; otherwise it can become chaotic.
3
Transparency Unified approach to usage of navigation tools and other common utility functions

Absolutely! STL is the "new and unified" approach to navigation and common utility functions. Please learn and use STL.
4
Transparency Minimize need for boilerplate code


Handles and Lists are class templates in the new framework, so the large number of "CandXxx[List]Handle" classes (which usually require boilerplate code) no longer exist. A reco developer has to write only an Algorithm class and nothing else. (Developers could add their own implementations of the reco objects, but it is not necessary.)
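A minimal sketch of how class templates can stand in for per-type handle and list classes. This is not the framework's actual code: `Track`, `Shower`, `MakeHandle`, and the use of `std::shared_ptr` and `std::vector` underneath are all assumptions made for illustration.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical reco object types; any plain class would do.
struct Track  { double momentum = 0.0; };
struct Shower { double energy = 0.0; };

// One generic Handle instead of CandTrackHandle, CandShowerHandle, ...
template <typename T>
class Handle {
public:
    explicit Handle(std::shared_ptr<T> obj) : obj_(std::move(obj)) {}
    T* operator->() const { return obj_.get(); }
    T& operator*() const { return *obj_; }
private:
    std::shared_ptr<T> obj_;  // assumption: shared ownership of the reco object
};

// One generic List instead of CandTrackList, CandShowerList, ...
template <typename T>
using List = std::vector<Handle<T>>;

// Hypothetical helper that creates an object and wraps it in a Handle.
template <typename T, typename... Args>
Handle<T> MakeHandle(Args&&... args) {
    return Handle<T>(std::make_shared<T>(std::forward<Args>(args)...));
}
```

With this in place, `List<Track>` and `List<Shower>` need no hand-written classes; adding a new object type costs only the definition of the type itself.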
5
Transparency Rewritten code conforms strictly to common set of coding practices


There would be very strict (draconian) rules in the new reco fwk. For example, with Handle<Xxx> objects the rules are:
1) no pointers to handles
2) no "new"
3) no Handles should live between records
Documentation with howto's and examples is essential and would be provided.
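As an illustration only, the first two Handle rules could in principle be checked by the compiler rather than by convention. The sketch below is an assumption about one possible approach, not the framework's Handle; the `Track` type is hypothetical. Rule 3 (no Handles living between records) cannot be expressed this way and would still rely on keeping handles in local scope.

```cpp
#include <cstddef>

// Hypothetical reco object used only for this example.
struct Track { int id; };

template <typename T>
class Handle {
public:
    explicit Handle(T* obj) : obj_(obj) {}
    T* operator->() const { return obj_; }

    // Rule 2: no "new". Writing `new Handle<Track>(...)` fails to compile.
    static void* operator new(std::size_t) = delete;
    static void* operator new[](std::size_t) = delete;

    // Rule 1: no pointers to handles. Writing `&handle` fails to compile.
    // (std::addressof can still bypass this, so it is a guard, not a proof.)
    Handle* operator&() = delete;
    const Handle* operator&() const = delete;

private:
    T* obj_;  // non-owning in this sketch; ownership is out of scope here
};
```

Handles declared this way can still be created on the stack, copied, and passed by value or by reference.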
6
Transparency
Changes to one candidate interface should minimally impact other developers


This is true in the sense that changes to, say, TrackABC do not affect Shower objects in any way. But changes to the TrackABC interface absolutely have to affect all the track implementations: they all have to be modified to accommodate any interface change, and there is no way around it. (Of course we could provide a "default implementation" that would be changed along with the interface, but having a default implementation goes against item 1 of this wishlist.) The compiler will immediately point out what changes have to be made in the implementation classes when the interface is modified.
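A small sketch of why the compiler catches interface changes. Only the name `TrackABC` comes from the text above; the methods `Momentum` and `NumStrips` and the implementation class `FitTrack` are hypothetical.

```cpp
// Abstract base class: interface only, no implementation to inherit.
class TrackABC {
public:
    virtual ~TrackABC() = default;
    virtual double Momentum() const = 0;  // hypothetical interface method
    virtual int NumStrips() const = 0;    // hypothetical interface method
};

// One concrete track implementation.
class FitTrack : public TrackABC {
public:
    FitTrack(double momentum, int numStrips)
        : momentum_(momentum), numStrips_(numStrips) {}

    // `override` makes the compiler reject this class if the matching
    // signature in TrackABC is renamed or changed.
    double Momentum() const override { return momentum_; }
    int NumStrips() const override { return numStrips_; }

private:
    double momentum_;
    int numStrips_;
};
```

If a new pure virtual method is added to `TrackABC`, `FitTrack` becomes abstract and every place that constructs one stops compiling until the method is implemented. Shower classes, which do not derive from `TrackABC`, are untouched.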
7
Transparency No string lookups - or else all strings should be defined in one place in the code


We agree, but this is orthogonal to the proposed framework.
8
Transparency In production for each object type write out only 1 list from 1 algorithm. Object modules know about all possible algorithms for creating that object.


Do not understand what this means.
9
Data I/O Model
'Snarl'-centric, 'event'-centric, 'slice'-centric data models should be supported. Event structure should be modified to provide a more natural container for event components.

This is all up for discussion. Any-centric system can be implemented.
10
Data I/O Model Navigate through objects in a natural fashion in both directions in the hierarchy


Navigation from top to bottom is trivial (as in event -> tracks -> strips -> digits). Navigation from bottom to top, in the sense of every object having a "back-link" to its parent object, is not built into the new framework (it adds a lot of complexity without being really necessary). This functionality could be trivially implemented using STL search algorithms.
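A sketch of bottom-to-top navigation done with STL search algorithms instead of stored back-links. The `Event`, `Track`, and `Strip` structs and the function `FindParentTrack` are simplified stand-ins invented for this example, not the framework's classes.

```cpp
#include <algorithm>
#include <vector>

// Simplified, hypothetical top-down hierarchy: event -> tracks -> strips.
struct Strip { int id; };
struct Track { int id; std::vector<Strip> strips; };
struct Event { std::vector<Track> tracks; };

// Returns the first track that contains the strip with the given id,
// or nullptr if no track in the event contains it.
const Track* FindParentTrack(const Event& event, int stripId) {
    auto it = std::find_if(
        event.tracks.begin(), event.tracks.end(),
        [stripId](const Track& track) {
            return std::any_of(
                track.strips.begin(), track.strips.end(),
                [stripId](const Strip& strip) { return strip.id == stripId; });
        });
    return it != event.tracks.end() ? &*it : nullptr;
}
```

The search is linear in the number of strips in the event, which is the price paid for not maintaining a back-link in every object.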
11
Data I/O Model Ntuple-like interface/structure to the candidate to provide all the currently available functionality with additional capabilities provided by bi-directional hierarchy flow. Eliminate standard ntuples. Ability to reprocess candidate files no longer a requirement.

For ntuple interface to candidates: there must be a one-to-one correspondence between candidate Get/Set methods and ntuple data members.
Caius: (a) ability to reprocess is useful! (b) Candidate files don't need to be Draw()able - just need a 2 step process to get to analysis ntuple.
The main goal of the proposal is to replace the Candidate files. As far as we are concerned, the ntuples could be left as is or modified to reflect the object structure better. We think that making objects in the proposed framework behave like ntuples could potentially break or significantly alter the framework design.
12
Data I/O Model I/O performance of new system must not be a reason for not using the candidate files. Consider Partial I/O to speed up access to subsets of data.


This is the most important consideration. The new reco files should be "usable" in terms of I/O (where the usability condition could be set as "can read one month of ND spill data in X hours"). We have to achieve this by "shrinking" the reco info if necessary (e.g. dropping digits); otherwise the whole idea is pointless. We are not proposing to replace one type of file that nobody ever uses with another type that nobody would use.
13
Code Development Some place to put arbitrary user data into ntuples and candidates
Once making candidates is easy, can just make a custom CandUser to write user data.
Revive the idea to put a registry in reco objects where some arbitrary data can be saved? Wouldn't be a problem. This could be allowed only for development, but not for production jobs.
14
Code Development New candidate creation should not require an algorithm. Algorithms should act on created data objects, not create them themselves.


We do not quite agree with this.  Algorithms bring some structure into the system.
15
Custom Analysis
Candidates can be truncated or removed from the event structure (to enable re-reconstruction, etc)


No problems dropping List<Xxx> lists. Could add feature to drop individual reco objects if desired.
16
Code Development
Ditch MakeCandidate function and allow one Alg to create N cands.


Agree. Always felt awkward about having AlgXxx and AlgXxxList. In the proposed framework Algorithms return List<Xxx> - i.e. lists of objects.
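A sketch of a single algorithm class that returns a list of objects, so no separate AlgXxx/AlgXxxList pair is needed. Everything here is assumed for illustration: `List` is modeled as a plain `std::vector`, and `AlgTrack` with its toy grouping rule (a gap of more than one plane starts a new track) is not a real reconstruction algorithm.

```cpp
#include <vector>

// Hypothetical, simplified reco objects.
struct Strip { int plane; double charge; };
struct Track { std::vector<Strip> strips; };

// Assumption: List<T> is modeled here as a plain std::vector<T>.
template <typename T>
using List = std::vector<T>;

// One algorithm creates N tracks and returns them as a list.
class AlgTrack {
public:
    explicit AlgTrack(double minCharge) : minCharge_(minCharge) {}

    // Expects strips sorted by plane. Strips below the charge threshold
    // are ignored; a plane gap larger than 1 starts a new track.
    List<Track> Run(const List<Strip>& strips) const {
        List<Track> tracks;
        for (const Strip& strip : strips) {
            if (strip.charge < minCharge_) continue;
            const bool startNew =
                tracks.empty() ||
                strip.plane - tracks.back().strips.back().plane > 1;
            if (startNew) tracks.emplace_back();
            tracks.back().strips.push_back(strip);
        }
        return tracks;
    }

private:
    double minCharge_;
};
```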
17
Code Development
Able to keep multiple copies of an event, e.g. same event processed two different ways, in same mom. Should also be able to have a reliable selection method.


Also orthogonal to the proposal and could be easily implemented.
18

Documentation!




Other Comments:

Jim: The main benefit to my way of thinking of this exercise is not captured in the list above, though, and that is that it forces a rigorous re-examination or replacement of every line of reco code. The benefit of doing this is not easy to characterize; no new 'feature' or behavior results, but it is one of the few things I can think of that really justify this effort.
                                                                              
Sue: The use of an ntuple format for the new candidate records should be explored to do a better assessment of the benefits and costs of such a format relative to the two-tiered system (cand -> sr ntuple) we currently support. To this end, we should investigate the use of TRef to maintain ptrs across different branch data, in place of the current indices, since support for use of TRefs in TTree::Draw statements has improved in newer ROOT. Otherwise, if a two-tiered format is maintained, I agree with Nathaniel's comments that the interface between the two tiers should be standardized.