Cooperation between XML Registries and Related Registries

A Collaborative Effort between the XML Working Group and

Federal and State Government Agencies

XML Working Group Task 2.2.3.2 Registry Standards Harmonization

 

I. Purpose

 

This is part of an interagency collaboration for development, demonstration, and harmonization of XML registries and related registries. The focus of the effort is to work with standardization groups and with prototype demonstration activities involving government agencies. The project includes organizing a major conference to draw together users and developers of these emerging technologies.

 

II. Background

 

This project is a part of a broad interagency collaboration involving the CIO Council’s XML Working Group, the General Services Administration, the National Institute of Standards and Technology (NIST), the Environmental Protection Agency (EPA), the Department of Energy (DOE), the Department of Defense (DOD), states, local governments, and other agencies. The Lawrence Berkeley National Laboratory (LBNL) is providing assistance in the collaborative activities.  The participating agencies recognize the great potential of XML technologies for improving the collection, maintenance, dissemination and sharing of data, but also recognize the possibility of exploding XML tag confusion, which could worsen rather than improve data management and interchange. Metadata registries and XML registries are two emerging technologies which can help data specifications to converge on standards, rather than diverge into complexity and incoherence. The agencies involved in this collaboration are taking an active role in assuring that the emerging metadata and XML registry technologies meet government needs. The work involves the identification of relevant standards; engaging the standards development efforts; prototyping, testing and demonstrating the technologies; and engaging developers and users through conferences, and through organizations such as the CIO Council’s XML Working Group. 

 

A. Organizational interests:

 

Marion Royal of GSA and Owen Ambur of the Fish and Wildlife Service (FWS) are co-chairing the XML Working Group, which is chartered under the Federal CIO Council. The purpose of the XML WG is to accelerate, facilitate and catalyze the effective and appropriate implementation of XML technology in the information systems and planning of the Federal Government. The Working Group seeks to achieve the highest impact from resources by building on initiatives and projects that are underway in the Federal Government, or elsewhere in the public or private sectors. Current highest-payoff opportunities for application of XML technologies include: XML as a non-proprietary and inexpensive way to achieve a high degree of interoperability among heterogeneous systems; XML in a networked environment where there is a requirement to work with a rapidly changing set of partner and customer systems with unknown and diverse architectures; and XML as a way to promote reuse of data by providing a way to locate it (semantic search) and a standard way to transform and move it between applications.

 

The XML Working Group will foster coordination and cooperation with existing Council and other committees and project teams to:

 

NIST has long experience in the area of metadata registries and also in XML-related projects. NIST initially chaired the Registry/Repository Technical Committee of the Organization for the Advancement of Structured Information Standards (OASIS), which has incorporated previous work on the ebXML registry specification. This work has produced Version 2 of an OASIS/ebXML Registry specification. In addition, NIST and others are interested in articulating the OASIS/ebXML registries with registries based on the ISO/IEC 11179 Metadata Registry standard. NIST is developing a proof-of-concept system that implements the Registry/Repository specification of the OASIS/ebXML model. This prototype is a tool that can be used to store XML Schemas, Document Type Definitions (DTDs) and other XML artifacts such as Trading Partner Agreements (TPAs) for comparison and interchange among agencies.

 

EPA, DOD and DOE are interested in metadata and XML registries for program management and scientific research. Each of these agencies has environmental responsibilities. Information is fundamental to the work of environmental protection. State environmental agencies and U.S. EPA depend upon the rational flow of quality information for every aspect of their work. Two significant trends highlight the need for a new approach to environmental information systems. First, environmental protection agencies collect, access and utilize increasing amounts of environmental data, as the scale and complexity of the problems addressed have grown. Second, a widening system of environmental information exchanges has already evolved with the use of new web-based technologies.

 

In response to these trends, and to the growing expectation that this information and government services themselves be available online, EPA, states and others are testing and demonstrating the NIST prototype Registry/Repository. The pace and intensity of information system changes have brought the limitations of the traditional system-to-system approach into clear view. As agencies make new investment decisions, they have asked for a framework that can coordinate their efforts and build on a common vision. State environmental agencies and the U.S. EPA have struggled with modernizing systems at different paces, making it difficult to maintain the traditional direct system-to-system exchanges. The rapid growth of the Internet and electronic commerce (e-commerce) is beginning to provide the basis for a solution: an Internet-based voluntary National Environmental Information Exchange Network (NEIEN) for state, federal and tribal environmental agencies. A Network based on standardized Internet language allows individual agencies to invest in internal data storage systems of their choice at a pace they can afford, while also supporting easy exchange of environmental data. The NIST prototype is being tested as a part of the NEIEN effort. DOD is working with XML registries and metadata registries to manage the semantics of data, to manage XML data structures and to associate the data semantics with the XML artifacts.

 

DOE’s LBNL has played an active role in the development of the critical standards in this area: the ISO/IEC 11179 family of standards as well as the W3C standards for Resource Description Framework and XML Schema. LBNL is now engaging the OASIS Reg/Rep effort. LBNL works to formulate government requirements into the technical input needed by the standards committees. LBNL is working to draw the relevant standards together, so that the technologies interoperate smoothly. LBNL participates in national and international prototyping and demonstration efforts for these technologies.

 

B. Types of metadata registries

 

There are several types of registries that have different purposes, but which are potentially related. All of the registries contain metadata, but they vary on the granularity and on the proportion of syntactic vs. semantic information they contain. A part of this project is to explore the potential relationships and to try to achieve a reasonable level of interoperability. The types of registries include:

 

 

Items registered are XML artifacts, such as Schemas and Document Type Definitions. In traditional EDI/bureaucratic terms each XML schema typically corresponds to a paper form, EDI document, or reporting requirement. There is a preponderance of syntactic (structure) information.

 

 

Items registered include individual data elements and related metadata items, such as data element concepts, value domains, taxonomies, and identification of responsible parties. The emphasis is on semantic information such as definitions of data elements, definition of each value meaning, and stewardship responsibilities. A focus of these metadata registries is data administration, semantics management, and data standardization. ISO/IEC 11179 metadata registries support data sharing, data reporting, system development, and dissemination of information that describes data products.

 

 

The Universal Description, Discovery and Integration project is a broad industry initiative to develop a platform-independent, open framework for describing services, discovering businesses, and integrating business services using the Internet. Items registered are web services and business activities of firms. The emphasis is on interface specifications.

 

 

These are sometimes called system catalogs. Items registered are elements of database schemas: data elements, relations, integrity constraints. The emphasis is on the information needed to make database management systems work for queries, etc. There is usually not a strong emphasis on semantic management.

 

 

These are sometimes called encyclopedias. These contain the information needed to create a database and potentially the program code for a system. They contain database schemas. There is not usually a strong emphasis on semantic management.

 

 

Items registered are concepts, relations among concepts (subsumption, inheritance, ...), and axioms for inference among concepts, e.g., temporal/spatial reasoning, etc. The emphasis is on semantics.

 

C. Standards efforts

 

 

OASIS and ebXML have collaborated on development of registry/repository specifications, now referred to as “repository” specifications. While the efforts were at one time rather independent, they are now merged in a single OASIS Technical Committee. The two specifications are being drawn together into a stronger specification. A major strength of the OASIS specification resides in its classification capabilities.  Associations among the objects in the registry make it easy to compare versions of similar objects, for instance TPAs, and DTDs. Registered objects tend to be coarse-grained XML artifacts, e.g., assemblages of data elements in an XML-Schema. The OASIS/ebXML registry specification should enable individual organizations to register their objects of choice, capture different associations and dependencies among the registered objects, and record additional metadata for registered objects as appropriate.  An XML interface provides easy access to this collection of information about the registered objects.  The members of organizations should be able to exchange information with one another.  The major advantage of this scheme is that the interface can remain stable for each interchange partner.
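
The registration, classification, and association capabilities described above can be illustrated with a minimal sketch. This is not the OASIS/ebXML information model or its XML interface; the class names, the "supersedes" association type, and the classification node are all hypothetical and chosen only to show how associations make versions of similar objects easy to compare.

```python
from dataclasses import dataclass, field


@dataclass
class RegistryObject:
    """A coarse-grained registered artifact (hypothetical model)."""
    object_id: str
    object_type: str  # e.g. "DTD", "XML Schema", "TPA"
    name: str
    classifications: set = field(default_factory=set)


class Registry:
    """Toy in-memory registry; not the OASIS/ebXML interface."""

    def __init__(self):
        self.objects = {}
        # Each association is (source_id, association_type, target_id).
        self.associations = []

    def register(self, obj):
        if obj.object_id in self.objects:
            raise ValueError(f"already registered: {obj.object_id}")
        self.objects[obj.object_id] = obj

    def associate(self, source_id, association_type, target_id):
        for object_id in (source_id, target_id):
            if object_id not in self.objects:
                raise KeyError(object_id)
        self.associations.append((source_id, association_type, target_id))

    def find_by_classification(self, node):
        return sorted(o.object_id for o in self.objects.values()
                      if node in o.classifications)

    def related(self, object_id, association_type):
        return [target for source, kind, target in self.associations
                if source == object_id and kind == association_type]


registry = Registry()
# Hypothetical artifacts and classification node, for illustration only.
registry.register(RegistryObject("dtd-v1", "DTD", "Permit report, version 1",
                                 {"environment/air-quality"}))
registry.register(RegistryObject("dtd-v2", "DTD", "Permit report, version 2",
                                 {"environment/air-quality"}))
registry.register(RegistryObject("tpa-1", "TPA", "State exchange agreement",
                                 {"environment/agreements"}))
registry.associate("dtd-v2", "supersedes", "dtd-v1")

same_topic = registry.find_by_classification("environment/air-quality")
older_versions = registry.related("dtd-v2", "supersedes")
```

In this sketch a partner looking for air-quality artifacts queries by classification and then follows the association to find the version that was replaced, without needing to know how either organization stores the artifacts internally.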

 

 

ISO/IEC 11179 – Metadata registries is developed by the International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC) Joint Technical Committee 1 (JTC1), Subcommittee 32 – Data Management and Interchange (SC 32), Working Group 2 – Metadata (WG2). The standard specifies basic aspects of the kind and quality of metadata necessary to describe data, and it specifies the management and administration of that metadata in a metadata registry. It applies to the formulation of data representations, meanings, and relationships between them to be shared among people and machines, independent of the organization that produces the data. In ISO/IEC 11179, metadata refers to descriptions of other data. Metadata in 11179 registries tends to be fine-grained, e.g., data elements and components of data elements such as definitions and value meanings. ISO/IEC 11179 registries are focused on semantics (the meaning of data) and management of semantic change, e.g., the creation, deprecation, and retirement of enumerated values (and value meanings) for, say, country code as geo-political changes occur.
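
The management of semantic change for an enumerated value domain can be sketched as follows. This is a simplified illustration, not the ISO/IEC 11179 metamodel: the class and method names are hypothetical, and the dates are illustrative. The point is that a retired value is end-dated rather than deleted, so that older data remains interpretable.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class PermissibleValue:
    """One enumerated value paired with its value meaning (hypothetical)."""
    value: str
    meaning: str
    begin_date: date
    end_date: Optional[date] = None  # None means the value is still in use

    def is_valid_on(self, day: date) -> bool:
        if day < self.begin_date:
            return False
        return self.end_date is None or day <= self.end_date


@dataclass
class ValueDomain:
    """An enumerated value domain that keeps its full history."""
    name: str
    values: list = field(default_factory=list)

    def add(self, value, meaning, begin_date):
        self.values.append(PermissibleValue(value, meaning, begin_date))

    def retire(self, value, end_date):
        # End-date the value instead of deleting it.
        for pv in self.values:
            if pv.value == value and pv.end_date is None:
                pv.end_date = end_date
                return
        raise KeyError(value)

    def valid_values(self, day):
        return sorted(pv.value for pv in self.values if pv.is_valid_on(day))


# Illustrative dates only; a real registry would record the authoritative ones.
country_codes = ValueDomain("country code")
country_codes.add("CS", "Czechoslovakia", date(1974, 1, 1))
country_codes.retire("CS", date(1992, 12, 31))
country_codes.add("CZ", "Czech Republic", date(1993, 1, 1))
country_codes.add("SK", "Slovakia", date(1993, 1, 1))

codes_in_1990 = country_codes.valid_values(date(1990, 1, 1))
codes_in_2000 = country_codes.valid_values(date(2000, 1, 1))
```

A record collected in 1990 that carries the code CS can still be resolved to its value meaning, while new data entry after the change is offered only the current values.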

 

Major aspects of the ISO/IEC 11179 family of international standards apply to activities including:

·       The design and specification of application-oriented data models, databases and message types for data interchange;

·       The actual use of data in communications and information processing systems;

·       Interchanging or referencing among various collections of data elements.

 

Major implementations of ISO/IEC 11179 exist for environment, healthcare, intelligent transportation systems, aviation, demographic and other programmatic areas. The goal is to make these metadata registries interoperate with XML registries, with the 11179 registries handling deeper levels of semantic management.
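
One possible point of articulation between the two kinds of registry can be sketched as a lookup from the element names in a registered XML Schema to the data elements held in a metadata registry. This is only an illustration under stated assumptions: the schema, the identifiers such as DE-0001, the definitions, and the name-based matching rule are all hypothetical, and neither standard prescribes this mechanism.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# A small schema of the kind an XML registry might hold (hypothetical).
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="FacilityName" type="xs:string"/>
  <xs:element name="CountryCode" type="xs:string"/>
  <xs:element name="LocalNote" type="xs:string"/>
</xs:schema>"""

# Hypothetical metadata registry content: element name -> (identifier, definition).
DATA_ELEMENTS = {
    "FacilityName": ("DE-0001", "The name by which a facility is commonly known."),
    "CountryCode": ("DE-0002", "A code that represents a country."),
}


def describe_schema(xsd_text, data_elements):
    """Map each element declared in the schema to its registered data element.

    Elements with no matching registry entry map to None.
    """
    root = ET.fromstring(xsd_text)
    report = {}
    for element in root.iter(f"{{{XSD_NS}}}element"):
        name = element.get("name")
        report[name] = data_elements.get(name)
    return report


report = describe_schema(SCHEMA, DATA_ELEMENTS)
unregistered = [name for name, entry in report.items() if entry is None]
```

Here the XML registry supplies the structure (which elements a schema declares) and the metadata registry supplies the semantics (what each element means). Elements that appear in `unregistered` are candidates for registration or for mapping to an existing data element.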

 

UDDI

The Universal Description, Discovery, and Integration (UDDI) Project is an effort to define a set of specifications that will make it easier for businesses to accelerate the use of B2B and commerce over the Internet. UDDI does this by defining how companies can expose their business applications -- like ecommerce, order management, inventory, marketing, and billing -- as Web Services that can be directly and securely defined, discovered and integrated with business applications at trading partners and customers. This direct application-to-application integration over the Internet is a core building block of the Digital Economy and holds great promise for how businesses will transform themselves over the next decade.

The UDDI Project is based on existing Internet standards and is platform and implementation neutral. UDDI involves the shared implementation of a Web Service based on the UDDI specifications. This Web Service -- the UDDI Business Registry -- is an Internet directory of businesses and the applications they expose as Web Services for trading partners and customers to use. Business programs will use the UDDI Business Registry to determine the specifications for programs at other companies in a manner similar to how people use Web search engines today to find websites. This automated application-to-application discovery and integration over the Internet is intended to help reduce many of the configuration and compatibility problems that are preventing businesses from more widely adopting B2B, despite B2B's potential for cost savings and improved efficiency.

Development of the initial UDDI specifications was driven by major industry firms. The effort is expected to transition into a formal standard. This specification may have relevance to XML and metadata registries. At a minimum, XML and metadata registries could be registered in UDDI registries.

 

III. Tasks

 

The following major tasks undertaken in this collaboration are specifically relevant to the CIO Council’s Strategic Plan and to XML Working Group Task 2.2.3.2 Registry Standards Harmonization.

 

A. Conference

 

Organize a major conference drawing together standards developers, software developers and practitioners. The major topic will be ISO/IEC 11179 metadata registries and XML registries/repositories. Related topics such as resource discovery (UDDI), terminology, etc. will also be covered. The conference organizers will work with standards developers, software developers and practitioners to encourage progress toward interoperability. The conference is intended to showcase progress made. Therefore, the conference will be scheduled at a future date that enables the participants to report on actual progress. The conference is expected to have separate tracks for various communities of interest. The conference is expected to have multiple sponsors. Tracks in the conference will cover tutorials about the relevant standards and use of the standards in practical work in a number of application areas, such as the environment, healthcare, statistics, transportation, and learning technologies.

 

Primary organizations (standards developers, software developers, and practitioners) will be contacted with continuing outreach efforts. Key organizations and standards committees will be asked to name representatives to an organizing committee. The actual date of the conference will be in accord with the organizing committee recommendation and with conference facility availability. The date is expected to be a year or so out, since the point of the conference is for reports of progress made. Thus, the preliminary activities are as important as the conference itself. It is expected that the impending conference will in itself generate many meetings between potential participants. This work will be measured for success by the attendance at the conference of the key groups and communities of interest and by reports of real progress toward interoperable registry technologies that can be utilized by government agencies.

 

 

B. Harmonize ISO/IEC 11179 metadata registries with OASIS/ebXML Registry/Repositories.

 

The following activities are intended to stimulate harmonization among the relevant standards by raising the issue, pushing for action, and by inviting participation in the above conference to demonstrate how the technologies can be utilized in a cooperative/interoperable manner.

 

1. Work with ISO/IEC committees, the OASIS XML Registry Technical Committee and other standards bodies to identify points of articulation between the emerging XML registry and ISO/IEC 11179 standards. Develop material to help and encourage the standards efforts to develop interoperable (or at least cooperating) specifications. Present material describing the potential for cooperation and interoperability.

 

2. Work with NIST prototype XML registry efforts (e.g., NIST – EPA National Environmental Information Exchange Network) to test and demonstrate interoperability between 11179 metadata registries and XML registries.

 

Test registry cooperation/interoperation to inform the efforts of standards developers and practitioners. Tests of the NIST prototype are expected to be accomplished over the coming year. Successful completion requires EPA and state environmental agency actions. Results will be reported at the conference, above.

 

3. Work with practitioners in environment, healthcare, energy and other subject areas to demonstrate the utility of the registry standards, software and operating procedures. This work will also be demonstrated at the conference.

 

The currently funded tasks are expected to be complete by March 30, 2003.