benchmarks:
-----------
simpler to harder (depends on completeness of implementation)
    ERRMSG
    ESUM
    ESUM, TSUM, HSTR, HEAD: toy microdst
    RCP? (access to data via names)
    effort to add a new object to an event data structure
        input of modified event
        output of modified event
        user code
    make a jet clustering algorithm
        reading MC data, writing clusters out+back
    fast MC

comment: "object persistence" is really only object data persistence
    subverts information hiding to some extent

some things we ought to look at
-------------------------------
Fortran 77 + D0flavor + ZEBRA    baseline
    maybe take a look at DZDOC etc from CERN?
Fortran 90 (+D0flavor for some VAX constructs, eg do while)
C? except as subset of C
C++
EIFFEL
    Sather (publicly available descendant)
(See Computers in Physics, Sept 1992, O.O. Programming issue)
IDL as a data definition language

packages for above: (see d0news WWW lists)
    these are my VERY superficial, perhaps WRONG, summaries;
    I'd suggest that you look for yourself!!!!!

    ADAMO        F77              runtime object creation, I/O
    CHEETAH      C (C++?)         object I/O; see FreeHep
    JAZELLE      F77 + Mortran    see FreeHep for FTP info
    DSPACK       C,F77            runtime object creation; returns id# ("handle")
    MOP          Moose projects Object Persistence Package
                 Possible contact: Irwin Gaines; else see WWW list in D0news
    OBJECTSTORE  http://asdwww.cern.ch/pl/cernlib/rd45/index.html
                 click "References"; down UK Contacts
                 same place area many others.
                 Someone should scope out what RD45 plans to look at, if possible.
    PTOOL        (son of, perhaps rewrite by CAP); original via D0news
                 WWW lists have refs to ptool
    FARFALLA     mainly C++       object creation/ I/O
    TYPES/VF     F77              runtime object creation, I/O

various data transport packages used outside HEP:
    CDF, netCDF, HDF, PDB/PACT

We have some relevant experience with object-oriented code already, but not
with inheritance. ERRMSG, RCP, ESUM are conceived as a mostly-hidden data
structure with associated routines for creation, manipulation, display, and
destruction.
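
To make "mostly-hidden data structure with associated routines" concrete, here is a minimal sketch in C. It is not the ERRMSG, RCP, or ESUM interface; the names Summary, summary_create, summary_add, summary_count, summary_print, summary_destroy and the limit SUM_MAX are all invented for illustration. In a real package the struct definition would sit in the implementation file, so callers could only go through the routines.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical names throughout; not the ERRMSG/RCP/ESUM interface. */
#define SUM_MAX 8

/* In a real package this definition would live only in the
   implementation file; callers would see just the typedef. */
struct summary {
    int   n;                /* number of stored elements */
    float value[SUM_MAX];   /* element values */
};
typedef struct summary Summary;

/* creation: returns a zeroed object, or NULL if out of memory */
Summary *summary_create(void) {
    return calloc(1, sizeof(Summary));
}

/* manipulation: returns 0 on success, -1 if the object is full */
int summary_add(Summary *s, float v) {
    assert(s != NULL);
    if (s->n >= SUM_MAX) return -1;
    s->value[s->n++] = v;
    return 0;
}

/* read-only access to the element count */
int summary_count(const Summary *s) {
    assert(s != NULL);
    return s->n;
}

/* display: one element per line */
void summary_print(const Summary *s, FILE *out) {
    assert(s != NULL);
    for (int i = 0; i < s->n; i++)
        fprintf(out, "%d: %g\n", i, s->value[i]);
}

/* destruction */
void summary_destroy(Summary *s) {
    free(s);
}
```

User code would call summary_create once, summary_add as needed, and summary_destroy at the end, never touching the fields directly. Unlike a bare array, overflow is reported by the routine rather than silently overwriting memory.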
This section is mainly to clarify some of my own thoughts about one viewpoint
on these packages. Feel free to ignore it if you find my way of looking at it
confusing.

Generic ZEBRA banks accessed only through GTxxxx routines would also be in
this category, except we seldom do it

From this we have learned some things

- PATH: makes most objects accessible by form path.name
    it is useful to have different versions of the same bank, and later to
      be able to compare them
    the depth-one stack is not too much of a nuisance to use
        easier, more general than forcing explicit reference by path.object

- banks
    problem areas are
        debugging
            no compiler help for overwrites, out of array bounds
        readability of code
            no names as we in practice wrote it
        incomplete self-description of data
            no names of elements, comments

- RCP: objects are named by bank.var.n where n is the array element
    only one instance at a time
    managing the bank name on a full stack is a rather expensive feature
      which is hard to debug and generates quite a bit of extra code
        one cure would be to ask for the full name in every access function,
          but that seems to give up too much flexibility (high-level
          redirection in a control routine)
        maybe going to a one-level stack would be a useful intermediate
        thus introducing a path notion in STP would be useful in several
          contexts
    some utilities were hard to write (dropping) because of complex data
      structure which didn't map directly to underlying memory manager

- ESUM: objects are referenced by source.type.element
    instances are recognized by system and either classified as new or old
        if new, a new instance is created, but no pointer is returned at
          that time;
        all instances of a given source.type.element may be returned on
          request
    adding a new source is very easy: just make a new identifier
        (registry not needed because few of them)
    adding a new object type is more work:
        add to include files
        compile several routines
            find which ones need it: a pain in a production release
        possibly change print routine
    adding a new element is the most work
        new code in many places, including user code, since no notion of
          "default" for missing arguments
    limited customization by allowing "flag word" element which different
      object types can use differently

- ERRMSG: identified by ASCII message_id
    id space is large enough to avoid clashes (no management)
    objects created by simply referencing them
    instances are counted but not saved
    added attributes: routine, severity, variable text
        not part of saved data structure originally
    later save instances in ZEBRA and add to event structure
    all error messages inherit from errmsg base type
        behavior of various severities
        event number, run number for display
        methods for dumping, counting etc
    limited customization: encoding into variable_text field
    "no" work to add instance (particular routine, variable text)
    characteristic bug: error in this encoding
        runtime-only error
        code blows up first time it would have actually been useful
        hard to force error to occur, since it IS error-handling code
    most work is adding new severity type
    implementation by fixed arrays
        small object, so could over-dimension with little cost
        errors are SUPPOSED to be rare
    adding persistence was relatively easy: map onto a ZEBRA bank in event
      structure

- TIMERS: numerical id returned by system; you have to keep track of it
    this is sometimes called a "handle" rather than a pointer, since id is
      known to system and used as a pointer to a pointer internally; you're
      fine as long as you don't ever alter the handle

- HBOOK: object instances identified by dir.id from user
    dir can be set, but not a stack
        need redefinition of "all": all in dir, or all everywhere
        dir adds new level of navigation
    needs id sent by user
        this is also a "handle"
        uniqueness a problem; dir helps some; try to enforce by registry or
          convention
        error usually detected at runtime (creation of objects)
    poor checking for access of nonexistent object
        no error messages generated
        mode sometimes convenient in that need only turn off creation of
          object

- GTUNIT/RLUNIT
    id sent by user; design error: no guarantee unique
    not very critical so didn't matter much in practice
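
The "handle" idea noted under TIMERS (an id issued by the system and used internally as a pointer to a pointer) can be sketched in C as follows. This is only an illustration under my own assumptions, not the TIMERS package: the names Timer, timer_table, timer_new, timer_add, timer_total, timer_free and the limit MAX_TIMERS are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sketch; not the TIMERS package interface. */
#define MAX_TIMERS 16

typedef struct {
    double total;   /* accumulated seconds */
    int    calls;   /* number of accumulations */
} Timer;

/* System-owned table: the handle is an index into it, so the caller
   never holds a real pointer to the object. */
static Timer *timer_table[MAX_TIMERS];

/* The system picks the id: returns a handle >= 0, or -1 on failure. */
int timer_new(void) {
    for (int h = 0; h < MAX_TIMERS; h++) {
        if (timer_table[h] == NULL) {
            timer_table[h] = calloc(1, sizeof(Timer));
            return timer_table[h] ? h : -1;
        }
    }
    return -1;
}

/* Out-of-range or unused handles are rejected. An altered handle that
   happens to name another live timer cannot be detected here. */
static Timer *timer_lookup(int h) {
    if (h < 0 || h >= MAX_TIMERS) return NULL;
    return timer_table[h];
}

/* Returns 0 on success, -1 for a bad handle. */
int timer_add(int h, double seconds) {
    Timer *t = timer_lookup(h);
    if (t == NULL) return -1;
    t->total += seconds;
    t->calls++;
    return 0;
}

/* Returns the accumulated time, or -1.0 for a bad handle. */
double timer_total(int h) {
    Timer *t = timer_lookup(h);
    return t ? t->total : -1.0;
}

/* Releases the slot so the handle can be reissued later. */
void timer_free(int h) {
    Timer *t = timer_lookup(h);
    if (t != NULL) {
        free(t);
        timer_table[h] = NULL;
    }
}
```

Because the system issues the id, uniqueness is guaranteed by construction, which is exactly what the user-supplied ids of HBOOK and GTUNIT/RLUNIT cannot promise. The cost is that the user must carry the handle around unaltered.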