When I sent out my proposed draft, the intent was to ferret out the requirements for what we are doing. Attached below is a draft requirements document for discussion at the telecon. I'm sure it will be much modified as a result. Once we've agreed on this we can go ahead and work out a common design.

First let me reiterate the requirements that were agreed at the ADEC telecon. There was a goal that I neglected to include in the first message I sent out to everyone. My apologies for that.

Goal: The top-level goal is to provide seamless (one stop shopping) access to all NASA astrophysics catalog and data services, irrespective of where the information is located.

Requirements:
1. All services will be accessible via standard protocols.
2. Access will be via the existing user interfaces.
3. Results will appear consistent with the interface that initiated the request.
4. Attribution of data to original site will be clearly stated.

From these requirements, discussions I've had with several of you, and the draft designs that several of us have sent out, I'm suggesting the following somewhat more detailed requirements. I've enclosed various comments of my own in square brackets []. I'd like to see if we can agree upon the requirements at tomorrow's telecon. There's currently a fair bit of disagreement, e.g., I read the IRSA design as suggesting that we do not need the first section at all. Feel free to comment right away.

Tom

ADEC ITWG Requirements V0.2

Data Registry.

1.1. All services shall be published in a common registry.

[ Alternative 1: have each site maintain its own registry and then have a register of sites. In that case the site registry basically just points to the registries at the sites. Here I'm assuming that there may be multiple services available at a given site even though we may wish to start with the primary service at each site.

Alternative 2: assume a constant set of links. I think this is what IRSA is proposing in their design.
My inclination is to go for a common registry since the whole intent of this effort is to integrate our systems. Personally I feel a dynamic registry system is very important. Although IRSA may well be right that the base URLs are relatively constant, the registry (or something like it) is the place where we describe the evolving capabilities and changing interfaces to our system. I.e., I don't want to chain the HEASARC to supporting a specific keyword/value sequence forever. ]

[ Implementation comment: Obvious candidates for such a registry are GLU and UDDI -- the latter especially if we use SOAP in other aspects a la Tim's design. Both of these support all of the requirements below. ]

1.2. Each site shall have the ability to add to or delete from the list of services it provides.

[Do we need a security requirement that a site cannot modify the service descriptions of other sites? We might get away with a static site registry.]

1.3. The registry description shall provide at least the following information regarding the service:
a. The hosting site
b. The name for the service
c. The parameters available for specifying the action of the service
d. The types (if any) of data products associated with the service
e. A description of the service
f. A description of the interface to the service in a TBD syntax

1.3.1 A standard list of parameters that may be queried shall be established. The list in 1.3.c shall be a subset of this list.

1.3.2 A standard list of types of data products shall be established. The list in 1.3.d shall be a subset of this list.

[Here the idea is that we need some standard vocabulary in the registry to describe what's available.]

1.4. A site shall be able to retrieve all registry descriptions using a standard TBD API.

1.5 The registry shall be maintained by a TBD party.

Data query.

2.1. All catalog services shall have the capability to respond to requests in an agreed machine-readable TBD format.
The response shall include complete metadata for the request, including descriptive information on the source table and each column as well as any associated data products. All options supported in the current human/HTML interfaces shall be supported in the TBD interface. This may be implemented using new options on an existing service or as a parallel service.

[ This may need to be split out somewhat. There may be a more generic term than 'catalog' that encompasses NED and ADS services better. Assuming there is some registry, there is some break point between what information is provided in the registry and what is included in a query result. I would suggest that all data needed to select and query a resource goes in the registry and that data is duplicated in the result metadata. Personally I'd strongly recommend that we use a new option on existing services. Otherwise keeping the machine and human interfaces in sync is going to be painful. ]

2.1.1 All catalog services shall provide a mechanism to return only metadata information.

[I think this will be a practical necessity, though it may not be absolutely required.]

[The metadata could be voluminous. Do we want to require an option to turn it off, e.g., if a user is making a series of hundreds of requests?]

2.1.2 The machine-readable format shall be chosen in coordination with NASA and non-NASA VO activities.

2.2. All sites shall provide direct URL access to data products. The metadata described in requirement 2.1 shall be sufficient to generate these URLs for the data products associated with a row.

[The IRSA design requires direct URL access and I'm only too happy to agree. We do need to link these URLs to user queries somehow. An alternative is simply to have the URLs returned in explicit data product columns. We sometimes have dozens of data products and hundreds of distinct files so I'm a little wary of doing this. Probably the data product information should be optional if we go that route.
]

[Do we need to have a requirement to support compression in our transfers? Many of the files our URLs point to are compressed, and in any case we'd want to promote use of compression.]

2.3 Access protocols shall not compromise the security of the host system.

[Brian Thomas pointed out that this may be an issue we need to explore if we use SOAP or other new technologies.]

User interface.

[Many of these requirements come directly from the telecon, so nominally they are already agreed upon.]

3.1 All sites will provide links within their primary data services to all NASA astrophysics catalog and data services.

[ I've limited this to 'primary' data services so that we don't feel that all software we provide needs to provide such links. This is a biggie in any case, and it's one of the reasons that I included the simple cross-links in my proposal. However it's really less stringent than the original goal up top, which requires 'seamless' integration. ]

3.2 All catalog information and data retrieved from a non-host site shall have its provenance clearly displayed.

3.3 Other than provenance, the display of foreign and local data shall be the same.

3.4 To the maximum extent practical, all facilities which manipulate data retrieved from the host site shall also be usable on foreign data, e.g., plotting, cross-correlation, etc.
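
[ Illustration for section 1 only, not a proposed design. To make the registry requirements concrete, here is a minimal Python sketch of what a service description (1.3.a through 1.3.f) and the vocabulary checks (1.3.1, 1.3.2) might look like. Everything named here is an assumption for discussion: the class names, the method names, and especially the two vocabularies, which are placeholders until the standard lists are agreed. The publish check reflects the open question under 1.2 about sites modifying each other's entries. ]

```python
from dataclasses import dataclass, field

# Placeholder vocabularies (assumptions); the real standard lists
# called for in 1.3.1 and 1.3.2 have not been agreed yet.
STANDARD_PARAMETERS = {"position", "radius", "object_name", "time"}
STANDARD_PRODUCT_TYPES = {"image", "spectrum", "lightcurve"}


@dataclass
class ServiceDescription:
    site: str                                        # 1.3.a hosting site
    name: str                                        # 1.3.b service name
    parameters: set = field(default_factory=set)     # 1.3.c query parameters
    product_types: set = field(default_factory=set)  # 1.3.d data product types
    description: str = ""                            # 1.3.e free-text description
    interface: str = ""                              # 1.3.f interface description, syntax TBD

    def vocabulary_errors(self):
        """List terms that fall outside the standard lists (1.3.1, 1.3.2)."""
        errors = []
        for p in sorted(self.parameters - STANDARD_PARAMETERS):
            errors.append(f"unknown parameter: {p}")
        for t in sorted(self.product_types - STANDARD_PRODUCT_TYPES):
            errors.append(f"unknown product type: {t}")
        return errors


class Registry:
    """In-memory stand-in for a common registry (1.1)."""

    def __init__(self):
        self._entries = {}  # keyed by (site, service name)

    def publish(self, acting_site, entry):
        # 1.2 plus the bracketed security question: a site may only
        # add or replace descriptions of its own services.
        if entry.site != acting_site:
            raise PermissionError("a site may only publish its own services")
        errors = entry.vocabulary_errors()
        if errors:
            raise ValueError("; ".join(errors))
        self._entries[(entry.site, entry.name)] = entry

    def withdraw(self, acting_site, name):
        # The key includes the acting site, so only its own entry can be removed.
        self._entries.pop((acting_site, name), None)

    def all_descriptions(self):
        # 1.4: retrieve every registry description.
        return list(self._entries.values())
```

[ For example, a site would call registry.publish("HEASARC", ServiceDescription(site="HEASARC", name="catalog-query", parameters={"position", "radius"})) and any site could then call registry.all_descriptions(). A real registry would of course sit behind GLU, UDDI, or whatever we choose, not an in-memory dictionary. ]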