Semantic Plug and Play

Preface

This paper has been prepared as a contribution to the Joint Workshop on Standards for the Use of Models that Define the Data and Processes of Information Systems, September 1996 in Seattle, Washington.

The objective of the paper is to identify opportunities for collaboration among organizations whose charter is the development of standards for information systems based on models. In the paper I describe what is currently going on in the standards organizations I have worked in over the last few years, why it is important to business, and how the business goals can be better achieved through the development of a convergent process.

The paper is organized into the following sections:

  • Preface
  • Semantic Plug and Play -- Model-Driven, Interoperable Information Systems
  • 1. Interoperability through semantic plug and play
  • 2. Scenario for rapid interoperation
  • 3. Process of semantic plug and play
  • 4. Architecture for semantic plug and play
  • 5. Some conclusions about semantic plug and play
  • Appendices: Background for Semantic Plug and Play
  • Appendix 1: Concepts of Semantic Plug and Play
  • Appendix 2: Modeling and Modeling Techniques
  • Appendix 3: Current Standards Relevant to Semantic Plug and Play

    I have tried to show a number of relationships, similarities and differences among existing standards. To do so I appeal to a number of concepts that are relatively common to thinking about information management. Since the terminology used to express those concepts varies significantly across the global audience of the workshop, and since some members of the workshop may find some of the concepts novel, I have provided explanations of them, but moved them to Appendix 1, so as not to bog down your reading of the main message. Similarly, I have placed in Appendix 3 brief descriptions of the work of those standards in terms of those concepts.

    I have taken advantage of the HTML format in which this paper is published by incorporating "hot links" from the text to these appendices. This feature allowed me to minimize the amount of explanation required of a concept or standard in order to talk about it. I have also included URL links, where I am aware of them, to pages on the World Wide Web to provide access to further information about the various standards. I hope you find this organization useful.

    Your feedback is encouraged. Please send your comments to Jim Fulton at jfulton@atc.boeing.com.

    Semantic Plug and Play --
    Model-Driven, Interoperable Information Systems

    Boeing, along with most large corporations, uses many different computing tools to manage vast volumes of information. Competitive pressures require continuous improvement not only in the individual tools, but in their ability to share data, and by so doing to work together, to interoperate. The cost of this interoperation is high. Neither the vendor products available in the marketplace, nor Boeing's home-grown tools are designed to make agile changes in the set of tools they interact with. The task of integration is time-consuming and labor-intensive, and must be repeated with every change in the tool suite. Moreover, the unrelenting competition in the global marketplace means that changes are seemingly always upon us.

    Suppose, though, that interoperability were somehow built into the tools. How would things be different? One approach is suggested by the current marketing buzzword "Plug and Play".

    1. Interoperability through semantic plug and play

    We hear the words "Plug and Play" and they sound neat, but only real techno-weenies know what they really mean (until we're told that the system we have doesn't really support it). Taking the words at face value we can distinguish at least three kinds:

    1.1. Hardware plug and play

    The concept of "plug and play" has become a major factor in the marketing of personal computers. Customers are thrilled:

    Of course, marketeers are salivating. Especially when indications are that the concept might actually work (with the usual limitations and conditions and exceptions and exclusions from blame, all spelled out in legally unexceptionable language).

    1.2. Object plug and play

    On a larger scale, we witness the emergence of another kind of plug and play in the marketing of architectures that support the sharing of interoperable objects. SQL, CORBA, OLE, ODBC, HTML, JAVA or whatever is the latest bit of alphabet soup to leak from the ad agency's pen (now there's an anachronism) -- all promise us that we can mix and match our information systems on whatever platform we choose. Just install our operating system, buy our software development kits, and just sign here, if you please. Systems users and administrators can't wait:

    If only it worked!

    1.3. Semantic plug and play

    By semantic plug and play (SPP) I mean an architecture in which the relationships among the data manipulated by various applications are managed through the models that define that data and the operations performed upon it. Hardware plug and play promises that the physical components of an architecture can be changed and continue to function and exchange signals as necessary without excessive dependency on the user's knowledge of how to control their interrelationships. Object plug and play promises that different object managers and databases can be assembled to exchange objects as necessary without excessive dependency on the user's knowledge of how each manager controls the objects within its scope. Semantic plug and play promises that different applications can exchange specific types of objects, each with specific roles in each of several applications, without excessive dependency on the user's knowledge of how that specific data functions in those other applications. Again, these promises can be exciting both to users and to administrators:

    How would semantic plug and play work? We can imagine the following scenario:

    2. Scenario for rapid interoperation

    The new analysis application has just arrived from the vendor, and it is your job to install it. More importantly, you have to integrate it with the other applications your company uses. This new tool will analyze data from your CAD (computer-aided design) applications, comparing overall cost and quality factors, and feed the results of the analysis into your materials planning applications. This of course means that it has to be able to find the right data in your CAD tools and produce the right data for materials planning. That is, your job is to link the data needed by this tool to the data available from those other applications.

    Fortunately, according to the marketing blurbs this tool conforms to the industry standard for this kind of analysis, and will work readily with any CAD and materials planning tools that conform to their own industry standards. However, the tool also offers additional features that provide analytic capabilities that significantly improve its value to the company, and the decision has been made that those features must be implemented. But those features require data that are not included in the CAD standard, and the results are not part of the materials planning standard. So some tinkering is going to be needed.

    You begin the install process. The loading and configuring required for the tool to operate in standalone mode is straightforward. The installer detects that you have common brands both of an object-request broker (ORB) and a relational database (RDB), each with an interface that conforms to its respective standard, and asks which of these it is to use to access CAD data. You select the ORB. It asks the same question about where to put the results of the analysis, and you select the RDB. It asks whether each tool maintains its own object dictionary, and you point it instead to an integrated dictionary in the RDB. The installer examines the object definitions for the ORB and detects that the standard CAD data is available. It asks whether you want to accept the standard mappings for the CAD data, and you agree. It then asks whether you want to make use of the extended feature set. When you answer affirmatively, it informs you that additional CAD data is needed beyond what is needed in the standard. Here the tailoring begins.

    During the planning that led to the selection of this tool, the business view of the data manipulated by the tool was mapped to the business view of data used by other tools, and the engineering process was modified to assure that the right data was captured in the CAD tool. Furthermore, an approach to solving some of the implementation problems was selected: some of the data, which classifies parts in ways not defined by the standard, will go into minor text fields that the standard provides primarily for documentation purposes; the rest, numeric arrays that are a natural by-product of the engineering process, are already available in the CAD tool, but are not included in standard exchanges. These revisions to the CAD data model have already been documented in the dictionary. The installer examines the object definitions and produces a list of object types whose physical data representations are compatible with the data needed for the additional features, including the definitions you had already placed there for this purpose. Even with those definitions, however, the semantically correct mapping is not one that the installer can ferret out for itself with accuracy, but it finds and invokes a synonymy analyzer in your dictionary, and produces a set of alternative mappings. One by one you select the object types that have been allocated for these purposes.

    When you come to the minor standard fields that are being adapted to this purpose, the installer informs you that those fields can be used only if their contents are restricted more tightly than the standard permits. Since the CAD tool allows anything permitted by the standard, the installer asks if it can take over the management of those fields by replacing the modules the CAD tool uses to put data into them. You say no (under certain conditions, engineering needs to be able to put values in those fields other than what this analysis tool allows), and the installer informs you that it will not be able to perform its special analyses for a CAD design unless the values in these fields meet its requirements. It asks if it can put a wrapper around the modules that modify those fields to issue a warning when values are inserted that do not meet its requirements, and you agree. It also does a quick check of the data already collected in those fields by the CAD tool, and tells you that 83% of those fields contain data that prevents the use of the special features. It asks you if you want to install a conversion add-in to the CAD tool that will help the engineers to find those fields and reset the values where appropriate; you say yes.

    You go through a similar process for all the CAD data the tool requires, and again for all of the data produced by the tool to make it available to materials planning and other downstream applications. The effect of all of this is to map the definitions of the data manipulated by this tool to the definitions of the data already being manipulated by other applications. When you are finished with this process, the installer asks if you are ready to complete the installation. When you select "deferred installation" the installer generates the changes to the database schema needed to accommodate the storage of its results, and compiles all the linkage modules necessary to share the selected data with the other tools. It then gives you the name of the installation routine to be executed later that night when it won't conflict with users. The next day some engineers begin using the conversion add-in to enable the analysis routines; others start performing the analysis, and by week's end the materials planning people are seeing benefit from the improvements the analysis makes in their process.

    3. Process of semantic plug and play

    The insertion into the architecture of tools that are designed for semantic plug and play should follow a process very similar to the scenario described above. If we strip away the anecdotal character of the scenario and ask what such a process would really involve, we get something like the following:

    1. Explicit models of the services required and provided by each tool are mapped to the models of the enterprise:
    2. The mappings between the models are then compiled into the code that links the tool into the architecture:
    3. The tool then operates as though its view of the enterprise were the only view there is.

    As a result the tool then interoperates with all the other tools in the environment.
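
    To make the three steps above concrete, here is a minimal sketch in present-day Python; the model structures, the mapping form, and the generated linkage function are all hypothetical illustrations, not drawn from any of the standards discussed in this paper.

```python
# Hypothetical illustration of the three-step process above; the model and
# mapping structures are invented for this sketch, not taken from any standard.

# Step 1: explicit models of the services required by the tool and provided
# by the enterprise, each expressed as attribute definitions.
tool_model = {"AnalysisInput": ["part_id", "mass_kg", "material_code"]}
enterprise_model = {"Part": ["part_number", "mass", "material"]}

# The mapping relates each attribute in the tool's view to an attribute in the
# enterprise view (established by a human integrator, as in the scenario above).
mapping = {
    ("AnalysisInput", "part_id"): ("Part", "part_number"),
    ("AnalysisInput", "mass_kg"): ("Part", "mass"),
    ("AnalysisInput", "material_code"): ("Part", "material"),
}

# Step 2: "compile" the mapping into a linkage function that serves the
# tool's view from enterprise data.
def compile_linkage(mapping):
    def fetch(tool_type, enterprise_record):
        return {
            tool_attr: enterprise_record[ent_attr]
            for (t_type, tool_attr), (_, ent_attr) in mapping.items()
            if t_type == tool_type
        }
    return fetch

# Step 3: the tool operates as though its own view were the only one.
fetch = compile_linkage(mapping)
part = {"part_number": "P-100", "mass": 12.5, "material": "AL-7075"}
print(fetch("AnalysisInput", part))
# {'part_id': 'P-100', 'mass_kg': 12.5, 'material_code': 'AL-7075'}
```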

    In order for such a process to work,

    4. Architecture for semantic plug and play

    Semantic plug and play will require some significant improvements to existing technology, which will require cooperative interaction among tool users, tool vendors and industry standards organizations. It is unreasonable to expect that any but the largest of corporations (if even they) can afford to build and maintain a model integration facility of the kind described above on their own. It is also unreasonable to expect that any vendor can offer a commercially viable version of such a facility unless each of the components in the architecture conforms to an appropriate set of standards. In what follows I will describe the kinds of technologies and standards that seem to be required to achieve a semantic plug and play environment.

    4.1. Modeling languages

    The models that define the data, processing, object and knowledge services in the view of any application must be expressed in languages that conform to appropriate industry standards. For an MIF (model integration facility) to function as described above, it must be able to compare the models delivered with a tool with the models already available in its local enterprise repository. This means that tool models must be delivered in a form that the MIF can recognize and translate into the form used by local integrators.

    This does not mean that there will be a single standard modeling language. Modeling languages, as seen by the creator or user of a model, are presentations of views of information systems appropriate to different phases and tasks of the life cycles of those systems (i.e., modeling languages present views of information systems, which themselves present views of the business). It is not likely that any one model presentation language can supplant all of the modeling languages currently available. Even for a particular modeling domain, e.g., data modeling, competition among modeling tool vendors will create evolutionary pressure on the languages.

    What a tool must provide, however, is not the presentation of its models but their public interface. That is, the tool must provide a representation of the models that can be imported into the MIF and presented through whatever model presentation tools have been adopted. This requires that the tool vendor and the tool user adopt modeling languages that conform to a common standard for public interfaces.
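
    As an illustration of what such a public interface might look like, the following sketch uses a hypothetical neutral model structure (the field names and JSON serialization are my own invention, not any standard's) and shows the kind of import step an MIF might perform on it.

```python
# A minimal, hypothetical "public interface" form for a data model: a neutral
# structure that any modeling tool (EXPRESS-, CDIF- or IDEF-based) could emit
# and any model integration facility could import. The field names are
# illustrative only; no standard defines exactly this form.
import json

public_model = {
    "model": "VendorAnalysisTool",
    "entities": [
        {"name": "AnalysisInput",
         "attributes": [{"name": "part_id", "type": "STRING"},
                        {"name": "mass_kg", "type": "REAL"}]}
    ],
}

def import_model(serialized):
    """What an MIF might do on import: parse the neutral form and index
    entities by name, independent of the presentation language used to
    create the model."""
    model = json.loads(serialized)
    return {e["name"]: e for e in model["entities"]}

repository = import_model(json.dumps(public_model))
print(repository["AnalysisInput"]["attributes"][0])  # {'name': 'part_id', 'type': 'STRING'}
```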

    4.1.1. Modeling language standards

    Today if vendors wanted to provide models to their customers, they could do so through any of several standard forms:

    Unfortunately, these standards are relatively immature and are not fully implemented in the modeling tools available on the market. Moreover the standards are overlapping and non-interchangeable. A vendor who selected a CDIF-based tool could not supply models to an EXPRESS-based MIF, or vice-versa. But vendors should not have to know or care whether their customers model in CDIF or EXPRESS or IDEF, any more (now that STEP is in place) than CAD tool vendors need to know which product data managers are used by their customers.

    4.1.2. Requirements for modeling languages

    To achieve semantic plug and play in a form that allows vendors and customers relative freedom of choice over modeling tools, there needs to be a harmonization of standards for modeling languages:

    4.1.2.1. Standard public interface

    A standard public interface for modeling tools would allow tool vendors to use any standards-conformant modeling tools to define the services that their tools provide or require, and to make those models available to any customer regardless of the modeling tools used by their customers.

    4.1.2.2. Standard integrated semantics

    A standard integrated semantics for the public interface should allow the exchange of multiple interrelated views of the tool and its role in the information system. It may be convenient presently to exchange data models, processing models, object models and knowledge models as separate units of modeling information. Those are the modeling techniques currently in use, and many vendors may well be able to specify fully the services they require from and provide to an environment using only one of those models. However, complex tools require the ability to specify interconnections between the definitions in these models:

    4.1.2.3. Process-driven modeling techniques

    Standards organizations should strive to develop process-driven modeling techniques. Every STEP application protocol (AP) is driven by a consensus business process, as expressed in an application activity model (AAM). The process is defined generically so that it can be tailored to the details of any enterprise, but the process provides a context for deciding what kinds of data services must be exchanged through a STEP standard.

    Modeling techniques and languages, and the standards that codify them, seem to be driven more by religious fervor than by either science or consensus (although the recent trend toward using or incorporating formal logic into modeling languages is a hopeful sign). It is time that the work on measurable, repeatable software engineering processes, which is going on in part in ISO/IEC JTC1/SC7, be used to provide a similar foundation for modeling languages.

    It is perfectly reasonable that a vendor should promote a modeling tool as the best approach to defining the functionality of an application, because the promotion addresses the tool's visible trade-offs among presentation, functionality and performance. It is not reasonable for standards organizations to engage in competitive non-cooperation in formulating a standard for exchanging models.

    4.2. Model mapping

    A critical step in a semantic plug and play architecture is the semantic mapping of the services of one component to those of another (or to the conceptual schema that brokers the exchange of services between them). Unless an application specifically states that a service is defined in terms of an unrestricted industry standard, the mapping of the services of one application to those of another will invariably require the knowledgeable participation of experts in the applications, both in the tool and in the business processes that the tool supports.

    The objective of model mapping is to create rules by which instances of data, processing, object and knowledge services defined in one view can be linked to instances defined in another view. Such a linkage should allow those services to be physically provided by one application (the server) but accessed by another (the client). Model mapping facilities are needed whether the client handles its own interfaces to the various servers that support it, or negotiates the services it requires through a conceptual processor that hides the details of physical distribution of services among the various applications.

    Regardless of the tools or techniques used initially to define the views, the need for view mapping requires that those definitions conform to a modeling standard, which can be input to a model integration facility (MIF) in which the services required by each client application can be mapped to the services provided by the server applications. Once the mapping is accomplished, the view mapping facility should be able to generate the code that enables services requested by the client to be provided by the server.
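
    As a concrete, entirely hypothetical illustration of such instance-level linkage, the sketch below routes a client view's reads and writes through declared mapping rules to a server's representation; the class, the attribute names and the mapping form are invented for this example.

```python
# Hypothetical sketch: a client-view object whose attribute reads and writes
# are routed, through declared mapping rules, to the server's representation.
class MappedView:
    def __init__(self, server_record, attribute_map):
        # attribute_map: client attribute name -> server attribute name
        object.__setattr__(self, "_server", server_record)
        object.__setattr__(self, "_map", attribute_map)

    def __getattr__(self, name):
        return self._server[self._map[name]]

    def __setattr__(self, name, value):
        self._server[self._map[name]] = value

# The server application physically holds the data...
server_part = {"part_number": "P-100", "mass": 12.5}

# ...while the client accesses it through its own view, via the mapping.
client_view = MappedView(server_part, {"part_id": "part_number", "mass_kg": "mass"})
print(client_view.mass_kg)      # 12.5
client_view.mass_kg = 13.0      # the update lands in the server's record
print(server_part["mass"])      # 13.0
```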

    4.2.1. Model mapping standards

    At present the integration of a tool into an environment is not accomplished through model mapping, although reverse engineering a model from a tool might be used as a step in that process. Even if tools came out of the box with complete models, there are neither technology nor techniques nor standards available to use those models to support the integration process. Instead tool integration is achieved at the implementation level by writing translators or APIs that physically translate a client view into a server view. Modeling tools that might provide cost or impact analysis are not generally used to accomplish such a mapping.

    Some modeling languages, such as EXPRESS and CDIF, have capabilities that could be used to relate the objects in one model to those in another, but those capabilities were not designed for the application of modeling to semantic plug and play environments. However, ISO TC184/SC4 has done some work in this area:

    4.2.2. Requirements for model mapping

    The requirements for model-mapping languages are similar to requirements for modeling languages in general.

    4.2.2.1. Language-independent mapping languages

    In a semantic plug and play environment, semantic mapping is performed by an enterprise to map a tool into an enterprise model. The model integration facility (MIF) that supports that mapping needs to be able to import models in the format of the public modeling interface, whatever their source. Thus the MIF must be independent of the presentation language in which the models were built.

    4.2.2.2. Standard semantics for mapping languages

    Moreover, the mappings themselves are shareable information. An international enterprise that performed a mapping at one site would presumably like other sites to take advantage of that work, regardless of the commercial MIF selected at those other sites. Moreover, having built up a set of mappings from various tools over a number of years, the enterprise would like to be able to migrate those mappings to the new and improved MIF product that it purchases.

    The interchange and migration of mappings between models, as well as the models themselves, requires an industry standard for the public interface for the mappings. As with modeling languages themselves, the presentation language that an MIF provides to create and view mappings is a competitive feature of the MIF, as long as the presentation is equivalent to the public interface.

    4.2.2.3. Process-driven mapping language

    Just as I recommended above that standards for modeling languages should be driven by the requirements of the entire life cycle of software engineering, I recommend here that standards for model-mapping languages should be driven by the requirements of the software engineering processes that are sensitive to those mappings. Certainly one such process is the generation of the code to implement service exchange, a part of the semantic plug and play installation process. But this may be just one of many views of the use of those mappings. The software engineering process being examined by JTC1/SC7 should be reviewed to determine whether other processes might be sensitive to those mappings, and what other information should be associated with the mappings to serve those processes.

    4.3. Model implementation

    For semantic plug and play to work as described above, the process of mapping the models of a new tool into the enterprise models must result in the tool's interoperation with the other tools in that environment. To achieve this at minimal cost, the services required or supported by the tool, and the mechanisms by which the tool accesses or provides those services, must be implicit in the models. Furthermore, the infrastructure embedded in the MIF (model integration facility) needs to have the functionality to realize those implicit services with real ones.

    4.3.1. Model implementation standards

    Industry standards currently support three approaches that I know of for making explicit the services that are implicit in models:

    4.3.2. Requirements for model implementation

    In order to create a semantic plug and play environment, the modeling infrastructure needs to be improved in several ways supported by standards.

    4.3.2.1. Modeling language-independent compilers

    Any processes that operate on models should be implemented in terms of the standard public interface for the models, not in terms of the presentation provided to modelers. For example, it should be possible to generate the pre- and post-processors for a STEP Part 21 file format from any standard data model, whether that model be constructed using EXPRESS or any other presentation language. And the same principle holds for SQL and SDAI.

    Of course we can approximate that independence by extracting an EXPRESS file from the integrated models, and then passing that file to an EXPRESS compiler. But tools tend to gravitate toward more complex data structures that allow them to pass on to users more precise decision-making information.
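
    By way of a rough illustration of compiling from the public model form rather than from any particular presentation language, the sketch below derives a simple instance-file writer directly from a neutral model structure. The output merely imitates the flavor of a STEP Part 21 DATA section; it is not a conformant ISO 10303-21 implementation, and the model structure is the same hypothetical one used in the earlier sketches.

```python
# Hypothetical: generate a simple instance-file writer from a neutral model
# structure (regardless of whether the model was authored in EXPRESS or any
# other presentation language). The output imitates the flavor of a STEP
# Part 21 DATA section but is NOT a conformant ISO 10303-21 file.
def make_writer(model):
    order = {e["name"]: [a["name"] for a in e["attributes"]] for e in model["entities"]}

    def write(instances):
        lines = ["DATA;"]
        for i, (entity, values) in enumerate(instances, start=1):
            fields = ",".join(repr(values[a]) for a in order[entity])
            lines.append(f"#{i}={entity.upper()}({fields});")
        lines.append("ENDSEC;")
        return "\n".join(lines)

    return write

model = {"entities": [{"name": "AnalysisInput",
                       "attributes": [{"name": "part_id"}, {"name": "mass_kg"}]}]}
write = make_writer(model)
print(write([("AnalysisInput", {"part_id": "P-100", "mass_kg": 12.5})]))
# DATA;
# #1=ANALYSISINPUT('P-100',12.5);
# ENDSEC;
```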

    4.3.2.2. Integrity preserving exchange mechanisms

    There should be a standard set of integrity-preserving access functions derivable from any data model, where "integrity-preserving" here implies preservation of consistency, not necessarily preservation of completeness. The ability to restrict access to such functions is critical to protecting data in a shared environment. The primary work of the model integration facility (MIF) will be to link these access functions from the clients through the view mappings to the servers. Functions outside this standard set will normally require a controlled design process, and special protections for the data they access.
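
    A minimal sketch of what such derived access functions might look like, assuming a hypothetical constraint on a classification field (echoing the restricted text field in the scenario earlier in the paper); the field name and the allowed values are invented.

```python
# Hypothetical sketch of "integrity-preserving" access functions derived from
# a data model: the model declares a constraint, and the only write path the
# environment exposes is a generated setter that enforces it. The constraint
# and field names are invented for this example.
ALLOWED_CLASSIFICATIONS = {"STRUCTURAL", "FASTENER", "ELECTRICAL"}

def make_setter(field, allowed):
    def set_value(record, value):
        if value not in allowed:
            raise ValueError(f"{field}={value!r} violates the enterprise restriction")
        record[field] = value
    return set_value

set_classification = make_setter("user_class", ALLOWED_CLASSIFICATIONS)

part = {"part_number": "P-100", "user_class": None}
set_classification(part, "STRUCTURAL")      # accepted: consistency preserved
# set_classification(part, "misc. notes")   # would raise ValueError
```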

    One of the chief problems for a tool is the reconciliation of the integrity constraints built into the tool as purchased with those that have been established in the enterprise. An example might be a text-valued field intended to support user-classification of parts. The standard in which the field is defined might make no restriction on the values of this field, other than to specify its length, and a tool that conforms to the standard would in turn impose no restriction. But a given enterprise might well decide to impose a usage for this field that restricted it to values that could be used to pass data critical to down-stream processing. (The semantic plug and play scenario above suggested such a situation.) How does the tool learn of and implement this specialization of the field? Several possibilities need to be explored:

    It is premature at this point to speculate which of these approaches is best suited for standardization.

    4.3.2.3. Compilable model mappings

    The mapping between tool models and enterprise models specifies how services are to be exchanged among the tools used by the enterprise. For semantic plug and play to work economically, this mapping must be compilable, i.e., there need to be tools that compile the mapping into linkages between the services of a tool and services of the enterprise.

    One approach to this compilability would look something like this:

    Note that in this section I talk about compiling the mappings, not the mapping languages, although this amounts to compiling the standard public interface for the mappings.
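
    Purely as a sketch of what compiling the mappings could mean in practice, the fragment below emits the source text of a linkage function from a declared mapping and then installs it; both the mapping form and the generated function are hypothetical.

```python
# Hypothetical: emit the source of a linkage module from a declared mapping,
# rather than interpreting the mapping at run time. The mapping form is the
# same invented structure used in the earlier sketches.
mapping = {"part_id": "part_number", "mass_kg": "mass"}

def generate_linkage_source(view_name, mapping):
    body = ",\n        ".join(
        f"{client_attr!r}: record[{server_attr!r}]"
        for client_attr, server_attr in mapping.items()
    )
    return (
        f"def fetch_{view_name}(record):\n"
        f"    return {{\n        {body}\n    }}\n"
    )

source = generate_linkage_source("analysis_input", mapping)
print(source)            # the generated module text, ready to be compiled and installed
namespace = {}
exec(source, namespace)  # stand-in for the overnight "deferred installation" step
print(namespace["fetch_analysis_input"]({"part_number": "P-100", "mass": 12.5}))
# {'part_id': 'P-100', 'mass_kg': 12.5}
```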

    4.4. Application domain standards

    If a tool defines its view of the world from scratch, making no attempt to reuse publicly available definitions of the data, processing, object and knowledge services that it exchanges, even if those definitions are made available in a standard language, then the tool can hardly be said to support semantic plug and play. Even when tool developers are not extending the state of the art but only implementing concepts that everyone agrees are "implicit" in the domain, the task of resolving differences in terminology, description and details of representation for a large and highly interrelated collection of data is an extremely arduous process.

    On the other hand, when a tool implements an existing industry standard that has already been incorporated into an enterprise, then the focus of integration can be directed to the differences between the tool and the standard, i.e., to the extensions to the standard and to the unsupported features of the standard. This is a much more manageable task.

    4.4.1. Current application domain standards

    A computing tool almost invariably provides a proprietary exchange mechanism that enables some level of data sharing among its different installations. When different users select different tools, however, data sharing depends on the agreement of the vendors to adopt some sharing mechanism as a standard. For a while the exchange format used by the dominant vendor might suffice as a standard, but the cost to other vendors of keeping up with changes in this format invariably leads to a breakdown in such informal arrangements, and if market pressure is sufficient, a formal standard is sought.

    For example, the various application protocols (APs) defined in STEP (ISO 10303 -- Standard for the Exchange of Product model data) define the data requirements of a number of product data domains. STEP APs provide a working mechanism for domain-specific data sharing. Recently ISO TC184/SC4 has been exploring how the STEP APs, together with the various parts of the Parts Library standard (PLIB), can be made to support cross-domain data sharing (AP interoperability), but no long-term approach has yet been adopted.

    STEP Architecture versus a Three-Schema Architecture. On the surface the STEP architecture looks to be one in which sharing of services through views of a single source of services would be a natural consequence. Every application reference model (ARM) of every AP is mapped through an application interpreted model (AIM) to the STEP generic resource models (GRM). To the uninitiated this looks like a three-schema architecture, with an interface between a conceptual schema and several external schemas; but there are significant differences:

    STEP has developed a mechanism for data sharing among APs: an application interpreted construct (AIC) is a model that is shared among multiple AIMs in a way that allows sharing of instances among those APs. But the value of AICs, given the potential of the GRM as a true conceptual schema, is to my mind negligible!

    4.4.2. Requirements for application domain standards

    Certainly the process of developing application domain standards should proceed as rapidly as possible. Such a process has many benefits, including the following:

    To play an effective role in semantic plug and play in a business arena that allows vendors and customers relative freedom of choice over modeling tools, the standards for application domains should support, promote and exploit the convergence of standards for a modeling infrastructure.

    4.4.2.1. Convergence with modeling language standards

    A domain standard defines the format and interpretation of a public interface that enables the exchange of information between tools that implement that domain in different (or the same) enterprise. The definitions of the domain by the standard should themselves be made available through a standard public interface for modeling tools. However, this does not imply that the models that define the standard must be either created or used in a single standard model presentation language. When STEP first started participants used a variety of modeling languages - ER, IDEF, NIAM, etc. - which were basically similar but differed in details (and which tended to enflame religious disputes between modeling ideologies). More importantly, at that time there was no public interface that allowed models to be exchanged between implementations of these languages.

    I participated in a PDES (Product Data Exchange using STEP--the US activity contributing to STEP) project whose goal was to enable such an interchange of models. That project concluded that since models were in reality only sets of rules, the only adequate mechanism for exchanging models would have to be logic-based, for that was the most comprehensive formal foundation for rules. We expressed that conclusion in the Technical Report on the Semantic Unification Meta-Model, Volume 1: Semantic Unification of Static Models [ISO TC184/SC4/WG3 N175], and referred to the Semantic Unification Meta-Model as the "SUMM". I fear this was another case of the right answer and the wrong solution. We completed a technical report but no mechanism for actually converting models from one form to another. In the meantime STEP went through a shift in procedures and began modeling entirely in EXPRESS, and the market for model exchange within STEP disappeared.

    Thus was EXPRESS born of necessity, to serve both as a presentation language and as a public interface. In this dual role it serves a number of purposes:

    And this has spawned a market for EXPRESS tools, with EXPRESS-G presentations, model editing capabilities, and compilers that generate database schemas, parsers for STEP Part 21 exchange files, and (on the horizon) APIs that conform to SDAI (STEP Part 22--Standard Data Access Interface). So STEP is wedded to EXPRESS. (As with most marriages, this one is not without its marital spats and demands.)

    However, EXPRESS was not designed around the processes by which models are created, integrated, analyzed, shared, reused, adapted, and managed within a large enterprise. Much less was it designed to support an environment in which different processes require different views of models. In such a context, the use of a public interface as a presentation language begins to fail to meet user requirements. In those enterprises what are valuable in STEP (or any domain standard) are the following facts (roughly in order of importance):

    1. The standard models dictate a public interface for the domains within which enterprises need to exchange data (and it helps that there are tools that can compile the public interface form of the models into the code that will manage the public interface for the domain being modeled).
    2. Vendors of tools for a standard domain can be persuaded to
    3. The models can be obtained through a standard public interface for models.

    What is not important to an enterprise is that these models were created or are used within the enterprise in the form of EXPRESS as a presentation language. Any presentation will do, as long as it meets the needs of the processes that use and create models as the product of their activity. An integrated standard for modeling is much more likely to achieve this. As with most marriages, STEP's relationship to its modeling technique will be the stronger when it is attracted to the reality, not the image.

    4.4.2.2. Convergence with model-mapping language standards

    Similarly, no standard, including STEP, should develop its own parochial approach to mapping one model as a view of the other. The need for such mapping is generic: it applies to all tools, whether they support end-users or modelers or systems developers; it is certainly not specific to product data, or any other application domain. It should be possible to incorporate application domain standards into an enterprise with the same kind of model integration facility (MIF) that is used to incorporate new tools. That means that the model mapping language used in the development of standards for an application domain should not be specific to that application domain, but should be a generic standard.

    4.4.2.3. Convergence with standards for a three-schema architecture

    The most critical requirement for application domain standards is to provide a mechanism by which individual domains can interoperate with others. This kind of interoperation cannot be based on the assumption that the interoperating domains are all standardized within the same framework. Industry needs the ability to plug in new technology, regardless of whether it implements an application domain standard. The approach taken by STEP to render its own APs interoperable must be compatible with the standards we need to implement semantic plug and play for technology in general.

    The three-schema architecture is probably the firmest foundation that standards can give to semantic plug and play. STEP's mapping of APs to the GRM (generic resource models), especially with the introduction of AICs (application interpreted constructs), is, as we have seen, an incomplete step toward a three-schema architecture. Following this direction to its conclusion, in cooperation with the work being done on Conceptual Schema Modeling Facilities (CSMF) by JTC1/SC21/WG3, should lead to effective standards and technologies to support the three-schema architecture.

    5. Some conclusions about semantic plug and play

    Let me conclude by summarizing my suggestions for specific improvements to standards, and then tying it all together in a grand synthesis, just like they taught us to do in school.

    5.1. Summary of standards required for semantic plug and play

    To review let me briefly repeat the requirements described above for improvements in the standards for modeling and the use of models.

    1. We need the following improvements in standards for modeling languages:
    2. We need the following improvements in standards for model mapping:
    3. We need the following improvements to the infrastructure by which models are implemented:
    4. We need the following improvements to standards for application domains:

    5.2. What is really needed: cooperative progress toward standards for semantic plug and play

    You will have noticed from what I have said that I believe some existing standards don't stand up to all my theoretical biases. What you might not have noticed is that I don't fault any of the organizations for these "shortfalls". Theoretical purity prevents compromise when compromise is needed, and invariably overlooks practical needs and obstacles that an ecumenical approach can usually recognize and provide for.

    The question now is how to move toward an environment that allows sharing of services not only across STEP APs, but across any domains where an enterprise finds such sharing useful, whether those domains be standardized by STEP, by some other standard, or not at all. The primary thesis of the paper is that semantic plug and play is an achievable approach to such sharing, one that builds upon techniques pioneered by STEP, by CDIF, by SQL, by CSMF and by others.

    By way of a straw horse for consideration, let me suggest the following:

    Organizations such as TC184/SC4, which are developing application domain standards, should cooperate with organizations such as JTC1/SC7 and JTC1/SC21, which are building infrastructure standards, in the development of a joint standard for an implementable three-schema architecture, in which the relationship between application views and a single source of services (SSS) is well defined by models. Such a standard would allow individual application domain standards to be implemented on a stand-alone basis, thereby enabling semantically precise communication among specific classes of tools. But it would also allow those tools to be plugged into the semantically rich backbone of an enterprise and play a critical niche role.

    Such an architecture requires a much more extensive consensus among the cooperating organizations about what the architectural components are, how they are interrelated, how they are defined in models, what standards apply to what components, and who is responsible for what standards. This kind of consensus cannot be achieved through the casual, occasional and superficial monitoring that typifies most liaison activities. Although organizations readily agree to the establishment of liaisons, all too often the message being communicated is "We'll open our doors and let you see how we are doing things right, and then you can follow our lead and stop your inane folly." Rarely, I should say "never in my experience", has a liaison been able to trigger a profound synthesis of two standards organizations, when both had the inertia of existing standards.

    To achieve this consensus will require each organization to make a major commitment to a cooperative joint process, and to recognize that not only do they have something critical to contribute to the process, but something to gain. The process must provide a manageable, politically sensitive compromise between the needs of each organization to carry out its program to provide timely standards that meet immediate needs, while at the same time taking meaningful steps toward a convergent architecture. What can be done to initiate this process? Possibilities include

    Appendices: Background for Semantic Plug and Play

    The following sections provide background material for the above discussion of semantic plug and play:

    Most of you will be familiar enough with most of this material (more than likely enough to find some errors, or at least points of disagreement, with what is said here), but the audience for this paper is so diverse that I could not rely on an assumption that it was so familiar that it needn't be said. By way of compromise I have moved this background material here to an appendix and provided hot links to it from the text above.

    Appendix 1: Concepts of Semantic Plug and Play

    In order to describe the architectural requirements for semantic plug and play, it is useful to make a number of distinctions. Please accept my apologies if this section seems somewhat didactic. My goal is to describe how to make semantic plug and play work in a technologically heterogeneous environment. Hence I want to describe the various technological cultures that make up that environment with as little prejudice as I can muster for or against any of the technologies or paradigms.

    A-1.1. Information Services for Semantic Plug and Play

    Applications interact by providing services to one another. These services can take any of several forms: data, processing, objects or knowledge.

    A-1.1.1. Roles in the Exchange of Information Services

    When a computing architecture is built so that several concurrently operating hardware components are arranged so that one provides common services to the others, it is called a client/server architecture. It is useful to generalize this terminology somewhat to distinguish the role of applications in all the various exchanges of services that are required:

    An application can be a client for some services and a server for others. Moreover, an application might initially invoke a service from a server, and then repeatedly use the results of that service without recurrent requests to the server. It might even put those results into another application, e.g., a local database, so that it can retrieve them more efficiently.

    A-1.1.2. Mechanisms for the Exchange of Information Services

    If an application needs a service performed by some other service provider, whether the service involve data, processing, objects or knowledge, it can achieve that objective in any of several ways:

    A-1.1.3. Kinds of Information Services

    Although I have spoken of data, processing, objects or knowledge services as though they were just special cases of a general concept of service, this is really a case of generalization not specialization. The mechanisms for sharing data, processing, objects, and knowledge have for the most part evolved independently and to some extent in competition with one another. To provide an adequate view of what those services are, we need to examine each type of service individually and then look for potentially valuable generalizations.

    A-1.1.3.1. Data Services

    Data are any putative representations of facts, whether they be sentences in a natural language, labeled diagrams, or structured sequences of binary ones and zeros in a computer. The simplest and most common form of service between applications is the exchange of data. Successful exchange requires that both sender and receiver agree on the format of the data to be exchanged, and also to some criteria for determining that a particular packet of data is the "right" response to a particular request. It is not required that the sender know what the receiver will do with the data, nor that the receiver know how the sender got the data in the first place.

    Although much of the debate about data architectures today seems to be couched in terms of the relational versus the object-oriented paradigm, from the standpoint of the applications, the real question is one of deciding how best to access the data that the application shares with other applications. There seem to be three options: file exchange, database query, and database navigation.

    A-1.1.3.1.1. File Exchange

    File exchange is the oldest form of data sharing among applications, and is the simplest form to set up initially. Although applications might use the exchange format as their primary data structure, this normally works only with relatively simple applications. Normally an application will translate the exchange form into an internal data structure that allows it to provide competitively high performance.

    Reliance on file exchange as the mechanism for data sharing tends to create a one-schema architecture, with the consequent N-Squared Problem. Nonetheless, because of its simplicity it is commonly used. Part 21 of STEP defines the format of a file to exchange instances of data conforming to an EXPRESS model, and hence conforming to the specific EXPRESS models that define the STEP application protocols. It is currently the only form of data sharing within STEP that has been made an international standard, although a draft international standard is under review for another mechanism, which is mentioned below.
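
    For readers unfamiliar with the N-Squared Problem, the arithmetic is simple: with n tools exchanging data pairwise, the number of directed translators grows as n(n-1), while a shared neutral exchange format needs only about 2n (one importer and one exporter per tool). A trivial illustration:

```python
# The arithmetic behind the N-Squared Problem: pairwise translators versus
# translators to and from one neutral exchange format.
def pairwise_translators(n):
    return n * (n - 1)      # one directed translator per ordered pair of tools

def neutral_format_translators(n):
    return 2 * n            # one importer and one exporter per tool

for n in (3, 10, 30):
    print(n, pairwise_translators(n), neutral_format_translators(n))
# 3 6 6
# 10 90 20
# 30 870 60
```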

    The ripple effect of the N-Squared Problem seems largely nullified when the exchange format is standardized. Internal application data structures have to evolve in ways that allow them to continue to exchange the standard data format. Problems still remain, however, especially when a company needs to support cross-domain data sharing:

    A-1.1.3.1.2. Database Query

    Access through a query language is the hallmark of the relational database paradigm. Of course the word 'query' here is misleading, since query languages are used to add and alter data in a data base as well as to answer questions. A RDBMS (relational database management system) typically provides a query language that is a dialect of the standard for the Structured Query Language (SQL), where 'dialect' implies both additions ("enhancements") to and subtractions from ("unsupported minor features of") the standard.

    The database query approach to data access enables a single database to support data sharing among multiple applications. Databases can be designed to integrate the data from all applications, and each application can access just the data it is interested in by tailoring the queries it makes to its own purposes.

    The essential feature of the query approach to data access is that it is "set-oriented": a query retrieves (or updates) the data specified about each member of the set of objects selected in the query. Queries about individual objects are special cases of this set orientation, in which the selection criteria specify data values that are unique to the individual, i.e., key data: the classical search mechanism still examines all the rows of the table to find "all" the rows with the specified value. Of course, the only RDBMSs that survive in the marketplace are those that replace the classical search mechanism with an indexed search.

    The set-oriented approach works well when all the data required by the query can be "normalized" to a single two-dimensional table, or at least to a small number of such tables. When relationships are required between tables, the classical solution is to perform a "join" to form a virtual super-table that combines the data from all the joined tables in a highly redundant form. The classical form of the join process was extremely poor in performance, and commercial relational databases have competed in finding proprietary ways for using indexes and other techniques to support cross-table queries with acceptable performance.
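
    To ground the set-oriented point, here is a small self-contained example using an in-memory SQLite database; the tables, columns and data are invented for illustration. A single query answers for every member of the selected set, and a join relates the two tables.

```python
# Set-oriented access: one query answers for the whole selected set, and a
# join relates data across tables. Tables and data are invented for this sketch.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE part (part_number TEXT PRIMARY KEY, material TEXT);
    CREATE TABLE analysis (part_number TEXT, max_stress REAL);
    INSERT INTO part VALUES ('P-100', 'AL-7075'), ('P-101', 'TI-6AL4V');
    INSERT INTO analysis VALUES ('P-100', 310.0), ('P-101', 520.0);
""")

# A set-oriented query with a join: every aluminum part and its stress result.
rows = db.execute("""
    SELECT p.part_number, a.max_stress
    FROM part AS p JOIN analysis AS a ON a.part_number = p.part_number
    WHERE p.material = 'AL-7075'
""").fetchall()
print(rows)  # [('P-100', 310.0)]
```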

    A-1.1.3.1.3. Database Navigation

    Although query languages are commonly associated with the "old" relational database paradigm, they are not the original form of database access. When databases first emerged as mechanisms for sharing data among applications, they provided two essential components:

    The library of data access functions is known as an application programming interface (API). Such APIs are the typical form of data access in the "new" object-oriented database paradigm. Although a version of the SQL standard is being written to support access to objects in an ODBMS (object-oriented database management system), it is not yet clear that the use of a query language for ODBMSs will become as widespread as it did for RDBMSs.
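
    The following sketch contrasts that navigational, API-based style of access with the set-oriented query above; the classes and the tiny object store are hypothetical, standing in for the API an ODBMS or network database would provide.

```python
# Hypothetical sketch of navigational (API-based) access, as contrasted with
# set-oriented queries: the application fetches one object by identity and
# then follows its relationships one link at a time.
class Part:
    def __init__(self, part_number, material):
        self.part_number = part_number
        self.material = material
        self.analyses = []          # navigable relationship to Analysis objects

class Analysis:
    def __init__(self, part, max_stress):
        self.part = part
        self.max_stress = max_stress
        part.analyses.append(self)

# A minimal "API" over an object store keyed by identity.
store = {}
def get_part(part_number):
    return store[part_number]

p = Part("P-100", "AL-7075")
store[p.part_number] = p
Analysis(p, 310.0)

# Navigation: start from a known object and traverse links.
print(get_part("P-100").analyses[0].max_stress)   # 310.0
```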

    A-1.1.3.2. Processing Services

    Although theoretically each application could do all of its own processing, i.e., perform all the transformations required to meet the objectives of the business that are its responsibility, this is generally an expensive practice. Once a computing process has been designed, implemented and tested, it is likely to be less costly to reuse that process in other applications than for them to implement their own versions. Reuse can take place either at compile-time, in which case a copy of the code is inserted into the reusing application, or at run time, in which case the reusing application (the client) asks the original application (the server) to perform the process for it.

    A-1.1.3.3. Object Services

    An object, as defined by the adherents of object-oriented design and programming, is an encapsulated package of data and processing. Object services enable one application to invoke the data and processing services of an object without itself having to contain the code to perform those services. All that is required is a set of object interface services, for which there are several competing emerging standards: OLE, CORBA, ODBC.

    "Object" seems to be one of those words whose application to information processing has made them emotively non-neutral. Everyone has a strong opinion about what they mean, whether positive or negative. Advocates seem to be able to use the words in glowing ways demonstrating how obvious it is that they have a new and better way of doing things, that this is the way of the future, and that funds should be diverted from obsolete approaches to the new paradigm. Detractors object that the hyperbole for the new approach is just old wine in new bottles, i.e., new words for the same old thing, that they can do with their old techniques anything that the new approach can do, and that what little real innovation is to be found was something they were thinking about adding to their approach anyway.

    However this debate turns out, it does seem clear that object definition involves a "packaging" or "encapsulation" of data and processes that provides a kind of cohesion that is not normally found in traditional information systems. This packaging often involves implementations that presume a kind of inter-object messaging in order to be executed. Thus even if the details of object definitions are very like data and process definitions, at the very least the packaging will need to be defined in order for the semantic plug and play process to map the objects to sharable enterprise objects with any accuracy.

    A-1.1.3.4. Knowledge Services

    The work "knowledge", like the word "object" discussed above, seems to be emotively charged in its application to computing systems, as in knowledge-based engineering (KBE) or knowledge-exchange. However we react to the hype surrounding the word, there does seem to be a potential for changes in the kinds of interactions between computing systems.

    The key word that typifies a knowledge-based system and distinguishes knowledge-exchange from data exchange is "rule". Of course all software is constructed of rules; a program is always a set of rules. Ordinary programmers, not LISPers, just don't happen to use that word. But that misses the essential point. It is not that KBE systems are built with rules; it's that they exchange rules.

    Contrast a traditional exchange of geometric data between CAD (computer-aided design) tools, perhaps through STEP, and an exchange of "knowledge" between KBE tools. Where the CAD tools exchange the data needed to reconstruct in the receiving tool the geometric specification of a particular part, the KBE tools exchange rules that define a whole class of parts, from which the geometry of that particular part can be generated by specifying appropriate parameters.

    The rules to be exchanged can vary from parametric descriptions (functions that generate specific parts from particular parameters), to the programs that implement such functions, to sets of axioms that can be fed as input to an "inference engine". Usually exchanges presume a common "ontology", i.e., a predefined set of terms (data definitions), functions (executable processes) and background rules (axioms), so that what is exchanged is only the rules particular to a specific application of the ontology.

    Knowledge services therefore would include the ability to exchange, combine, verify, and execute sets of rules, to apply "inference engines" to those rules to determine consequences, and to apply them to local data to derive instances.
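
    As a hedged illustration of the difference between exchanging data and exchanging rules, the sketch below shows one part instance's explicit geometry next to a parametric rule that can regenerate the geometry of any member of a part family from its parameters; the part family and its design rules are invented for this example.

```python
# Hypothetical contrast between exchanging data and exchanging a rule.
# Data exchange: one particular bracket's geometry, as explicit values.
bracket_instance = {"length_mm": 120.0, "width_mm": 40.0, "hole_count": 4}

# Knowledge exchange: a parametric rule that generates the geometry of any
# bracket in the family from its parameters (the rule itself is what is shared).
def bracket_family(length_mm, hole_spacing_mm=30.0):
    width_mm = max(25.0, 0.3 * length_mm)           # design rule, not a stored value
    hole_count = int(length_mm // hole_spacing_mm)  # another rule
    return {"length_mm": length_mm, "width_mm": width_mm, "hole_count": hole_count}

# The receiver regenerates any member of the class by supplying parameters.
print(bracket_family(120.0))   # {'length_mm': 120.0, 'width_mm': 36.0, 'hole_count': 4}
print(bracket_family(200.0))   # {'length_mm': 200.0, 'width_mm': 60.0, 'hole_count': 6}
```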

    A-1.2. Approaches to Client/Server Architectures

    The complexity of a semantic plug and play architecture depends to a large extent on the diversity of the applications that must interoperate within it. It is inherently more difficult to support interoperability when the mix of applications is varied and changing. To bring this dimension into our discussion, it is useful to distinguish domain-specific data-sharing from cross-domain data sharing:

    Domain-specific data-sharing takes place among applications supporting users who do the same kind of thing. For example, structural engineers define the geometry of parts that make up airplanes, cars, ships, etc. The computing tools that support that activity are commonly called CAD (computer-aided design) tools. Different companies (sometimes different divisions of the same company) select different CAD tools to support this activity, and build their business processes around the tool they select. These companies (and divisions) have found that they need to exchange data between these tools because they define the parts that are exchanged. The information needed by the engineer in the company buying the part is much the same as that needed by the engineer in the selling company (i.e., of the same domain), but the different representations in the different tools have made that exchange difficult.

    Cross-domain data sharing takes place when different kinds of applications have to share a common subset of data, or to put it another way, when they work on different subsets ("views") of a logically common pool of information. For example a CAD tool might have to share data with an engineering analysis tool. The analysis tool you installed in our make-believe scenario above does not do structural design, but it uses some of the results of structural design to perform certain analyses, which the structural engineers might want to reflect in the drawings they produce to present their designs. So the CAD tool might send certain geometric attributes of parts identified by certain key data to the analysis tool, which will return, say, the values of a stress analysis for the CAD tool to display using its visualization capabilities. Although it is possible for a single vendor to support multiple domains with a tool suite integrated through a proprietary data format, it has often been found that such suites achieve integration at a significant cost in functionality, relative to comparable standalone products. Industry has tended to select tools from different vendors that do their respective jobs well, and to integrate them with some home-grown procedures and interfaces.

    A-1.2.1. Data Sharing through Views

    Cross-domain data sharing implies an environment in which most of the applications use only a portion of the data available. Indeed, for most objects, it is unlikely that any one application will use all of the data about objects of that kind. I will use the word "view" to refer to a perspective of a business that distinguishes its information requirements from those of other perspectives. For example, engineering, manufacturing, marketing, purchasing, etc., are all major perspectives, i.e., views, of a business. Each view

    It should be noted that this very approach means that one cannot expect a precise definition of what a view is: just as a view is one among many perspectives of a business, implemented to facilitate the operation of the business from that perspective, so views themselves are multi-dimensional objects that admit of multiple perspectives. Whether it is very like a snake, or very like a rope, or very like a wall, or very like a tree depends on what part of the elephant you grasp.

    A-1.2.2. Three-Schema Architecture

    The classical three-schema architecture (3SA) organized the definition of data into three kinds of schemas: external schemas, each defining the data as seen by a particular application or group of users; a single conceptual schema, defining the shared, technology-neutral meaning of all the data; and internal schemas, defining how the data is physically stored and accessed.

    The objective of these distinctions was an architecture that would enable every application, external or internal, to interact in its assigned role through a single two-way interface to the conceptual processor, which would effectively provide a virtual single source of data (SSD).

    A-1.2.2.1. One-, Two- and Three-Schema Architectures

    It was this single interface that distinguished the three-schema architecture from a one-schema architecture or a two-schema architecture:

    Assuming the technology can be made available, a semantic plug and play environment is easier to establish in a three-schema architecture. The model of a tool need be mapped only to the conceptual schema, and only one interface need be constructed.
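
    The economics are easy to see with a little arithmetic; the sketch below simply counts the interfaces that must be built and maintained under a point-to-point approach and under a shared conceptual schema.

        # Illustrative arithmetic only: how many mappings must be built and
        # maintained with and without a shared conceptual schema.

        def point_to_point_interfaces(n_tools):
            # every pair of tools needs its own two-way mapping
            return n_tools * (n_tools - 1) // 2

        def conceptual_schema_interfaces(n_tools):
            # each tool is mapped once, to the conceptual schema
            return n_tools

        for n in (3, 5, 10, 20):
            print(n, point_to_point_interfaces(n), conceptual_schema_interfaces(n))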

    A-1.2.2.2. Three Schema Architectures for All Information Services

    The distinctions among these architectures were originally defined in terms of data, but there is no reason in principle why they do not apply just as well to all information services. A conceptual schema can provide definitions not just of the data services but also of processing, object and knowledge services. With the proper directory a conceptual processor can serve as a broker for all of those services, thus creating a single virtual source of information services, or more compactly, a single source of services (SSS).

    Such a generalization tends to blur the distinction between internal and external schemas. The conceptual processor would route requests for processing, object and knowledge services to applications that in a classical three-schema architecture would be classified as external, since that is where data conversion is done. Of course that distinction has been fuzzy anyway: commercial applications, especially those implementing current computer-aided design (CAD) standards such as STEP Part 203 (Configuration Controlled Design), are increasingly not only supporting users directly in their design activities, but also providing access to the data from those activities through a STEP interface, thereby serving as both an external and an internal processor. This blurring of distinctions is no major loss to the objectives of the three-schema architecture. What is important is that a client application negotiates its service requirements through a single server (the conceptual processor) that in turn routes service requests to the original service providers.
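
    The brokering idea can be sketched as follows; the class and operation names are my own invention and do not correspond to any of the standards discussed here.

        # A minimal sketch of a conceptual processor acting as a broker: clients
        # negotiate all service requests through one interface, and the broker's
        # directory routes each request to the registered provider.

        class ConceptualProcessor:
            def __init__(self):
                self._directory = {}   # (service_kind, name) -> provider callable

            def register(self, service_kind, name, provider):
                self._directory[(service_kind, name)] = provider

            def request(self, service_kind, name, **arguments):
                provider = self._directory.get((service_kind, name))
                if provider is None:
                    raise LookupError(f"no provider registered for {service_kind}:{name}")
                return provider(**arguments)

        broker = ConceptualProcessor()
        broker.register("data", "part_weight", lambda part_id: {"PN-1001": 12.5}.get(part_id))
        broker.register("process", "convert_units", lambda value, factor: value * factor)

        print(broker.request("data", "part_weight", part_id="PN-1001"))
        print(broker.request("process", "convert_units", value=12.5, factor=2.2046))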

    A-1.2.3. Federated Applications

    Although there are a number of data management systems that claim that they can play the role of internal processor, and perhaps at least part of the role of conceptual processor, there seem to be few applications that are designed to play the role of external processor in the classical sense, i.e., few are designed to plug into a data management architecture and access the data they require through the facilities of the conceptual processor. Each, it seems, wants to have control of its physical data structures.

    There are some good reasons for this:

    Federated data architectures seek to achieve an adequate measure of data sharing without compromising the responsibility of each application in the network to meet its own users' performance and functionality requirements. In such an architecture, each application decides

    For any data that the application decides to acquire from elsewhere, the application is a data client; for any data to which the application decides to provide access for other applications, the application is a data server. Data managers might well be important data servers in this federated network, but unless they implement all of the services of a conceptual processor, they have no distinguished status as far as the network is concerned.

    The question still remains whether

    Since some data is invariably stored multiple times in a federated architecture, any federated conceptual schema will need to establish rules that define which physical copy is the master, i.e., the copy that is by default retrieved to fulfill requests for that logical data. Even more difficult is the negotiation of a process for resolving differences in changes that are made by users of different applications.
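
    A sketch of one such master-copy rule is shown below, with invented names; the harder problem of reconciling conflicting changes is deliberately not addressed.

        # For each logical data element, the federated conceptual schema records
        # which application holds the master copy; retrievals are answered from
        # that copy by default. (Conflict resolution is not shown.)

        master_copy = {
            "part.geometry": "cad_tool",
            "part.weight": "analysis_tool",
        }

        copies = {
            "cad_tool":      {"part.geometry": "outline-v7", "part.weight": 12.1},
            "analysis_tool": {"part.geometry": "outline-v6", "part.weight": 12.5},
        }

        def retrieve(logical_name):
            owner = master_copy[logical_name]      # the rule: which copy is master
            return copies[owner][logical_name]

        print(retrieve("part.geometry"))   # served from the CAD tool's copy
        print(retrieve("part.weight"))     # served from the analysis tool's copy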

    A-1.3. Representations of Views

    A schema defines the kinds of services required or provided by a tool, or (in the case of a conceptual schema) shared among a collection of tools. Schemas are commonly captured in data models, enhanced to some extent with processing, object or knowledge models. These models specify how the business objects of interest to the users of the tool are to be represented and manipulated. If we look more closely, however, we find that there are at least three kinds of representation that must be defined for a tool: the presentation through which its users see and manipulate information, the public interface through which other tools can access that information, and the internal representation in which the tool actually stores it.

    Standards are normally concerned with public interfaces. For the most part the effectiveness of a presentation is part of the competitive advantage that a tool will use in its marketing. However, there are some issues about the relationships among these representations that are of importance in defining standards for semantic plug and play.

    The symptom of these issues lies in the difference between the information in the presentation and the information in the public interface. Vendors are used to building tools that work in a standalone fashion; interfaces that allow exchange of data with other tools are usually developed (if at all) as afterthoughts, and often with the concern that to provide a full interface, i.e., a public interface that fully captures the presentation, is just an invitation for a user to convert to another tool. The result is that a tool often makes information visible in a presentation that cannot be extracted from the public interface. If that data is needed by another tool, it must be manually entered.

    This difference can actually be exacerbated by the existence of a domain standard to which the tool conforms. The vendor will provide a standard public interface that does not contain data reflecting vendor enhancements, i.e., additional data or functions that users find enticing but that are not yet part of the standard. The vendor can rightfully claim (depending on the exact wording of the standard) that to put the extensions into the public interface would render the tool non-conformant.

    Another difference occurs when information provided by the tool does not end up as expected in the presentations of the tool. For example some CASE tools have had public interfaces through which data could be imported into the tool, but the objects thus imported would not appear in the graphic presentations of the tools, unless the user went through some contortions to move those objects into the appropriate graphics. I might import an Entity definition into the tool, but it would not appear on the Entity-Relationship diagram unless I edited the diagram, added a new object and then grabbed the imported data from some other tool presentation. The tool did not have the capability of even rudimentary graphic layout from imported data.

    The ideal of semantic plug and play is that there be an equivalence between the set of presentations of a tool and its public interface. Any information a user can see in any presentation should be exportable from that presentation into the interface and importable from the interface into the presentation. Standards should be flexible enough to allow tools to supplement a standards-conformant core with data that has not yet been standardized (and might never be).
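
    The equivalence can be stated as a simple check, sketched below with invented structures: nothing visible in any presentation should be missing from the public interface, and nothing in the public interface should be impossible to present.

        # Illustrative check of the ideal stated above.

        presentations = {
            "er_diagram":  {"Entity.Part", "Entity.Assembly", "Relationship.Contains"},
            "report_view": {"Entity.Part", "Attribute.Part.Weight"},
        }

        public_interface = {"Entity.Part", "Entity.Assembly", "Relationship.Contains"}

        visible = set().union(*presentations.values())

        not_exportable = visible - public_interface    # seen, but cannot be extracted
        not_presentable = public_interface - visible   # importable, but never shown

        print("visible but not in the public interface:", not_exportable)
        print("in the public interface but never presented:", not_presentable)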

    Appendix 2: Modeling and Modeling Techniques

    In the wide world of business, the word "model" can be applied to anything from a look-alike toy to a full-scale physical mockup to an electronic abstraction used for simulation to a whole class of products. Those are all legitimate uses of the word. For our purposes here, however, I shall always (unless I specify otherwise) use the word to mean a definition of a view.

    A-2.1. What is a Model?

    The word "modeling technique" bears an ambiguity that has proved an obstacle in achieving convergence of modeling approaches. It can mean either of the following:

    By "modeling technique" I shall in this paper exclusively mean the latter of these two. More accurately, I shall mean the underlying semantics of a modeling language. In order to make effective use of a modeling technique, an enterprise must have a well-designed method or process that guides the collection of information, and that process may well be acquired from a company that specializes in systems development methodologies. But the kinds of standards being considered here are neutral with respect to modeling methods, i.e., how the data is collected, and address only the mechanisms for sharing models among different methods.

    A-2.2. Modeling Techniques

    A model captures the information needed about a component of an information system in order to make decisions about implementing it or integrating it in a way that allows it to operate and interoperate with other components of that system.

    Just as a view represents a certain perspective of the business, a model represents a certain perspective of the view. A model is a representation of the view for a particular purpose. Each model we take the time to build should assemble information about the view that we need in order to perform some task in implementing the view. Models should not be built for their own sake. No matter how well-entrenched a modeling technique is, if we do not know what we are going to do with that model, if there are no down-stream processes that need the definitions that are assembled in the model, then the model is a waste of time.

    What kinds of models do we need? That is like asking what kinds of data we need to do our business. The electronic marketplace is filled with "new" kinds of data--usually new ways of presenting data, but occasionally truly new kinds of data--that proclaim themselves to be essential to the competitive edge. A model is needed if it plays a valuable role in defining, implementing, operating or improving the aspect of the business comprehended by a view. So it is not possible to limit a priori the list of useful models.

    For the time being then we must leave aside any attempt to provide a complete list of the kinds of models that are needed, and ask only what kinds of models are currently finding significant use in defining views.

    A-2.2.1. Data Definition

    The most commonly used form of view definition is data modeling. Its objective is to define the kinds of data that can be manipulated (created, calculated, retrieved, altered or removed) by an application that supports a view. These definitions specify not only the types of data that are needed but the relationships among data of these types and the rules for manipulating the data in ways that assure that it continues to conform to the assumptions that are expected by the view. (These rules are commonly called "integrity constraints".)
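
    A toy example of what such a definition supplies is sketched below; the types, relationship and constraints are invented for the illustration.

        # A toy data definition: types of data, a relationship among them, and
        # integrity constraints that keep the data conforming to the view's rules.

        from dataclasses import dataclass

        @dataclass
        class Part:
            part_id: str
            weight_kg: float

        @dataclass
        class Assembly:
            assembly_id: str
            part_ids: list          # relationship: an assembly is made of parts

        def check_integrity(parts, assemblies):
            """Integrity constraints: weights are positive, and every part an
            assembly refers to must exist in the data collection."""
            errors = []
            known = {p.part_id for p in parts}
            errors += [f"{p.part_id}: non-positive weight" for p in parts if p.weight_kg <= 0]
            for a in assemblies:
                errors += [f"{a.assembly_id}: unknown part {pid}"
                           for pid in a.part_ids if pid not in known]
            return errors

        parts = [Part("PN-1", 2.0), Part("PN-2", -1.0)]
        assemblies = [Assembly("ASM-1", ["PN-1", "PN-3"])]
        print(check_integrity(parts, assemblies))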

    A-2.2.1.1. Data Definition: Role in Semantic Plug and Play

    In a semantic plug and play architecture, a data model for the view implemented by a tool would enable the following:

    A-2.2.1.2. Data Definition: State of the Art

    A competitive market for data modeling tools and techniques emerged in the early 80's, and led to a proliferation of alternatives, each promoting its competitive edge.

    A-2.2.1.2.1. Common Forms of Data Definition

    The most common technique of data modeling is entity-relationship attribute (ERA) modeling, of which there are many dialects:

    A-2.2.1.2.2. Distinguishing Features of Data Definition

    Other distinctions among data definition languages include their support for the following:

    The dominance of ERA modeling is easy to understand: not only is it relatively easy to teach (its graphical presentation put the essentials of the relational data model within reach of mere mortals), but vendors of CASE (computer-aided software engineering) tools were quickly able to market products that not only provided graphical editors for these models but also could generate database schemas from those models. For data bigots this was the millennium: our data-driven database environments were now themselves (meta-)data-driven. Look what you can do with them: Build a data model in your CASE tool and (depending on the tool) it will

    There were of course a few minor problems:

    For most of us who go into information systems, these are the kinds of problems that exist to be solved (for others of course they exist to be exploited). It seems quite clear that for semantic plug and play to become a reality, there must be communicable techniques for defining the semantics of the data a view needs to manipulate in order to compare that data with what is needed by other views.

    A-2.2.1.3. Data Definition: Current Standards

    The wide recognition of the importance of defining the meaning and form of data has led to a variety of standards activity on the subject. The standards described in this section are the international and US (ANSI) standards that I am familiar with.

    A-2.2.2. Process Definition

    Process modeling is another common form of view definition. Where data models define the rules imposed by a view for a permissible state of a data collection, process models define the transformations permitted by a view between those states.
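
    The complementary roles of the two kinds of model can be sketched as follows; the order states, transitions and rules are invented for the illustration.

        # "Data model": which states of the data are permissible.
        # "Process model": which transformations between those states are permitted.

        def state_is_valid(order):
            return order["quantity"] > 0 and order["status"] in {"open", "released", "closed"}

        permitted_transitions = {
            ("open", "released"),
            ("released", "closed"),
        }

        def apply_transition(order, new_status):
            if (order["status"], new_status) not in permitted_transitions:
                raise ValueError(f"transition {order['status']} -> {new_status} not permitted")
            changed = dict(order, status=new_status)
            assert state_is_valid(changed), "transition produced an invalid state"
            return changed

        order = {"quantity": 5, "status": "open"}
        order = apply_transition(order, "released")
        print(order)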

    A-2.2.2.1. Process Definition: Role in Semantic Plug and Play

    In a semantic plug and play architecture, a process model for the view implemented by a tool would enable the following:

    A-2.2.2.2. Process Definition: State of the Art

    Process modeling, as a formal technique within industry, got its start in the 1970's with the development of structured analysis and structured programming, of which there were several dialects.

    Computing and the development of computing systems were both immensely costly, and computing development was a chaotic process: chaotic in the theoretical sense that the usability of its product was highly sensitive to the accuracy of the specifications the process was given to implement, and chaotic in the more colloquial sense that no one really knew what they were doing or why. There was therefore a major need for techniques for specifying requirements for computing systems in ways that reduced the errors and consequent rework. Structured analysis was a technique for analyzing the processes to be implemented by a computing system into progressively smaller and precisely defined components with precisely specified interfaces.

    Programming theory identified some generic control structures that were common to all programming languages, and described the benefits in reusability and maintainability of restricting programs to the use of those structures; techniques such as structured programming, finite-state modeling and Petri nets were the result. So-called lower-CASE tools emerged that facilitated the construction of programs in various programming languages (usually COBOL or FORTRAN) based on those structures, sometimes through the use of a higher-order language, i.e., a process modeling language.

    Although there is a lot of hype about the desirability of "executable process models", the techniques of process modeling have not yet matured to the point where that goal is generally achievable; it has been achieved only on a limited basis in tightly constrained environments.

    A-2.2.2.2.1. Common Forms of Process Definition
    A-2.2.2.2.2. Distinguishing Features of Process Definition Techniques

    Process definition techniques vary in a number of ways, which tool vendors select and adapt to make their products more market-worthy. The dimensions of this variation include the following, and even greater variation exists among tools in the extent to which these distinctions form the basis for the code generated by a tool:

    A-2.2.2.3. Process Definition: Current Standards

    Tool vendors were quick to implement proprietary versions of structured analysis, i.e., upper-CASE process modeling. Notational schemes differed, but since these models were not used to generate code, the semantic variations were not noticed by the users.

    A-2.2.3. Object Definition

    Object-oriented systems are claimed to offer features that are not to be found in traditional systems:

    How successfully object-oriented programming and object-oriented databases will compete with other development paradigms remains to be seen. At this point it appears that OO implementations will find a substantial niche in the marketplace, but by no means will drive out all competitors.

    A-2.2.3.1. Object Definition: Role in Semantic Plug and Play

    In a semantic plug and play architecture, an object model for the view implemented by a tool would enable the following:

    A-2.2.3.2. Object Definition: State of the Art

    Is object definition really something new? Since we do not yet have a clear semantics for object models (i.e., one that is at least agreed to by the practitioners) that can be compared and contrasted with semantic definitions of data and process models, it is difficult to understand what, beyond the definition of data, processes and their relationships, the definition of objects requires in order to support the semantic plug and play of object-oriented systems with other object-oriented systems, let alone with other kinds of systems. My personal viewpoint is that the object-oriented movement did reveal some significant gaps in the modeling techniques theretofore in use and codified by being hard-coded into CASE tools, gaps that most serious practitioners had recognized and begun to find ways to correct. Any information about a business that is needed to build an OO system is also needed to build a traditional system; it just gets used in a different way. We should be able to model the business without committing ourselves to whether we are going to support its requirements by traditional systems, object-oriented systems, other paradigms or some mixture of them all. In other words, I don't believe in object-oriented analysis; the gaps can be rectified by incremental, evolutionary improvement in classical techniques. Of course I'm willing to be taught.

    A-2.2.3.3. Object Definition: Current Standards

    There are several overlapping approaches to object sharing currently being debated. The competition (and consequent hyperbole) is so severe that the standardization process is going on outside the traditional standards organizations, and suggests that the fate of certain major companies rests on their success in making their approach a de facto standard, to be blessed by some standards organization when it gets around to it. Among the competitors are

    However, all of the above standards seem to be concerned with the implementation and exchange of objects, not their definition. The following seem to be addressing techniques for modeling objects:

    A-2.2.4. Knowledge Definition

    Whatever the form of the knowledge application, in order for it to plug and play in our operating environment, it will still be necessary to compare its view of data, services and rules with those in its new environment, and to create mappings between them. This requires that we have interchangeable techniques for modeling knowledge systems.

    Knowledge definition therefore seems to be an extension beyond data and process modeling. Starting with a data model that defines the basic concepts to be expressed, a process model that defines the basic transformations of that data, and an object model that defines the packaging of data and processes into exchangeable units, knowledge definition adds the facilities that enable the exchange of rules between systems.
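
    The layering can be sketched as follows; the ontology, rules and data are invented, and the "rule evaluator" is deliberately trivial.

        # Sketch of the layering: a shared ontology supplies the terms and
        # functions, what is exchanged is only the application-specific rules,
        # and a simple evaluator applies them to local data to derive new facts.

        ontology_functions = {"greater": lambda a, b: a > b}   # shared, predefined

        # exchanged rules: (premise as (function, attribute, constant), conclusion)
        exchanged_rules = [
            (("greater", "span_m", 30.0), "requires_stress_review"),
            (("greater", "weight_kg", 500.0), "requires_lift_plan"),
        ]

        def derive(facts, rules):
            derived = set()
            for (fn_name, attribute, constant), conclusion in rules:
                if ontology_functions[fn_name](facts[attribute], constant):
                    derived.add(conclusion)
            return derived

        local_data = {"span_m": 42.0, "weight_kg": 180.0}
        print(derive(local_data, exchanged_rules))   # {'requires_stress_review'}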

    Here we are treading on relatively new ground. Aside from the interlingua project and the KIF language which sprang from it, and which is being explored by a number of standards organizations, little work has been done on what it means to exchange knowledge. Certainly no one has addressed the problem of how to achieve semantic plug and play of knowledge systems.

    A-2.2.4.1. Knowledge Definition: Role in Semantic Plug and Play

    In a semantic plug and play architecture, a definition of the knowledge managed by a knowledge-based tool would enable the following:

    A-2.2.4.2. Knowledge Definition: State of the Art

    KBE tools are being used for an ever-increasing percentage of the design process. Their ability to generate designs for whole classes of parts significantly reduces the effort required to produce those designs following the traditional one-by-one design process, and therefore significantly reduces the costs of design.

    However, KBE tools tend to be standalone products, whose interfaces to other tools, whether knowledge-based or otherwise, have to be built and maintained by using organizations.

    The approaches to knowledge definition that are implemented in these tools are proprietary, and are treated as a competitive advantage. The Knowledge Sharing Project at Stanford provides a general facility for defining ontologies in KIF for use by knowledge tools.

    A-2.2.4.3. Knowledge Definition: Current Standards

    There are no existing standards for knowledge definition. However, several organizations are exploring such standards.

    A-2.2.4.3.1. STEP Knowledge Exchange

    ISO TC184/SC4 has long identified four levels of product data sharing for STEP:

    1. File exchange, currently implemented by Part 21
    2. Shared memory, no longer being considered
    3. Data sharing, planned to be implemented by Part 22 (SDAI)
    4. Knowledge sharing

    No work has been done to date on knowledge sharing.

    A-2.2.4.3.2. STEP Parametrics

    Although no general work has been done on STEP knowledge sharing, a substantial investigation has been made of techniques for exchanging parametric part definitions, i.e., rules that enable the generation of classes of parts from sets of parameters. As yet no STEP Parts have resulted from this activity.

    A-2.2.4.3.3. KIF Knowledge Exchange

    KIF was developed with the deliberate intention of supporting the exchange of knowledge between different tools. Although it has demonstrated its ability to achieve this goal when tools are designed with KIF planned as an exchange mechanism, it is not yet clear that it will work for KBE tools currently in production. KIF is in the standardization process through ANSI X3T2.

    A-2.3. A Framework for Modeling

    John Zachman ["A Framework for Information Systems Architecture," IBM Los Angeles Scientific Center Report No. G320-27B5] proposed that models of an information system be organized into a two-dimensional matrix. One dimension showed various aspects of an information system that could be defined and managed with a high degree of independence of one another: data, processes and locations (and in some versions events, organization and motivation). The other dimension showed increasing degrees of technology-dependence, from the business view, whose specification "should" be totally independent of implementing technology, to the specific implementation of the system with specific software running on specific machines.

    Whatever we think of Zachman's framework, it does seem that there is an important difference between business-oriented, technology-independent models in the upper regions of the framework, and the computing-oriented, technology-specific models in the lower regions, a difference that is significant to the kinds of standards that are appropriate. The primary function of technology-specific models, whether they be text files containing the source code for a program, or data structures in a CASE repository, is to construct an implementation that does the job specified for it well in the selected environment. The primary "users" of these models are the compilers that convert their instructions into executable code. They are a medium of exchange between a person and a machine. Business models on the other hand serve to communicate the semantics of various components of the information system, the meaning of the data and the reasons for certain kinds of transformations. As such they are primarily devices to communicate among people. In a semantic plug and play environment, the communication of such semantics is critical to achieving interoperation.

    This difference means that it is much more important that standards for communication of semantics, i.e., for modeling languages, be interchangeable than it is for standards for programming languages. Standards are required for programming languages only to assure that a program written in a given language will be compilable on any machine that has a compiler for that language, or to enable a program in one language to invoke a program in another. There is no requirement that a compiler for one language be able to interpret a program in another language. Modeling languages are different. For semantic plug and play to work, the models that define the semantics of the view implemented in a tool must be interpretable by whatever view mapping tool a potential user might have. That means that there must be a semantic, language-independent core of the modeling tools that they all share and that they can use to communicate the specific semantic assumptions of a view.

    Zachman's framework was found by many to provide a useful starting point for discussion of the role of models in the management of information systems, but it never evolved into an implementable architecture, perhaps because its scope was too big for any one vendor to comprehend, and because for industry to implement it by assembling off-the-shelf components would require precisely the kind of semantic plug and play for modeling tools that we are talking about here for the information systems modeled by those tools.

    Appendix 3: Current Standards Relevant to Semantic Plug and Play

    The standards organizations included in this section are those that seem appropriate to participate in the Joint Workshop on Standards for the Use of Models that Define the Data and Processes of Information Systems. That is, they develop standards for the definition, management or use of models. Comments and descriptions are mine, based on my experience with these organizations, but are sometimes drawn from "official" sources. Where available, I have included the URL of a Web site to provide access to an organization's own perspective of itself.

    A-3.1. ANSI X3T2

    ANSI X3T2 is the organization chartered to develop US standards for communications and to represent the US in ISO/IEC JTC1/SC7.

    A-3.1.1. CSMF (Conceptual Schema Modeling Facilities)

    One of the primary focuses of attention of X3T2 over the past few years has been the US position on the CSMF standard being developed by JTC1/SC21/WG3. The US has promoted a logic-based approach to conceptual schema modeling as the only way to adequately enable the integration of databases and knowledge bases into an integrated conceptual schema. To that end X3T2 is in the process of developing US standards on two languages based on logic, preparing to champion them through the process of international standardization, and promoting their application to the needs of international standards.

    A-3.1.1.1. KIF

    KIF (Knowledge Interchange Format) was developed at Stanford University as part of the Knowledge Sharing Project under contract with ARPA (Advanced Research Projects Agency). Its objective was to provide a means to exchange "knowledge" among "knowledge-based engineering" tools. A working draft is currently under review by ANSI X3T2 for US standardization.

    KIF is a character-based form of formal logic, i.e., the first-order predicate calculus with identity and functions. As such it inherits all of the formal characteristics of the predicate calculus. It carries the classical text-book definitions of its semantics and the classical axiom sets as defaults, but users are free to tailor the axiom sets or the semantics to meet their needs.

    KIF was developed with the deliberate intention of supporting the exchange of knowledge between different tools. Although it has demonstrated its ability to achieve this goal when tools are designed with KIF planned as an exchange mechanism, it is not yet clear that it will work for KBE tools currently in production.

    KIF is in the standardization process through ANSI X3T2, and is currently being explored by several international standards organizations:

    A-3.1.1.2. Conceptual Graphs

    Conceptual Graphs (CG) is a modeling technique developed by John Sowa. It is intended to provide a graphic form of presentation of formal logic, based on work by Charles Sanders Peirce.

    CG is currently under review for standardization by ANSI X3T2, and is being proposed to ISO/IEC JTC1/SC21/WG3 as a standard graphic visualization language for CSMF. X3T2 is deliberately defining the initial core KIF and CG standards as formally equivalent alternative representations of the predicate calculus.

    A-3.2. EIA (Electronic Industries Association)

    EIA is an ANSI-certified standards organization, with principal focus on electronic data standards, such as EDIF (Electronic Design Interchange Format). Because of the apparent similarity between circuit diagrams and graphic CASE models, vendors and users of CASE products explored the possible application of EDIF to the exchange of data between CASE tools. The effort failed since the similarity in the graphic structure of the two domains was not matched by a corresponding similarity in the semantics of those graphs, and the functions that CASE tools needed to perform were based primarily on the semantics. However, the group of vendors and users rechanneled their efforts into a new standard within EIA, namely CDIF.

    A-3.2.1. CDIF (CASE Data Interchange Format)

    CDIF is an interim standard for the exchange of data between tools used in software engineering, i.e., in the specification, design and implementation of information systems with software. CDIF is being reviewed for international standardization by ISO/IEC JTC1/SC7.

    CASE tools commonly provide a family of modeling techniques, i.e., representations of different aspects of information systems (entity-relationship diagrams, data-flow diagrams, state-transition diagrams, etc.). Thus CDIF was built as a family of standards which includes several modeling techniques, each defined as a subject area. These subject areas currently include Foundation and Common (both of which are automatically included in every other subject area), Data Definition, Data Modeling and Data Flow. A State-Event subject area has been released as a working draft, and work is beginning on an Object Modeling subject area. Each of these subject areas is being defined as an instance of a meta-meta-model defined in the Framework document. There is a standard encoding based on this meta-meta-model for the exchange both of subject areas and of the models that instantiate the subject areas.

    A-3.2.1.1. CDIF Framework

    The family resemblance among the various CDIF subject areas is the result of a common framework that defines the structure and concepts to be used in defining a subject area. If each subject area is a meta-model, whose instances are models of a certain kind, then the framework defines a meta-meta-model, whose instances are the subject areas themselves. The CDIF exchange format is defined in terms of the meta-meta-model and is inherited by each of the subject areas.
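
    The levels can be illustrated as follows; the structures shown are my own simplification, not the actual CDIF meta-meta-model.

        # Illustration of the levels only: the framework's meta-meta-model defines
        # what a subject area may contain, a subject area (a meta-model) defines a
        # kind of model, and a model instantiates the subject area.

        meta_meta_model = {"MetaEntity", "MetaAttribute", "MetaRelationship"}

        data_modeling_subject_area = {        # a meta-model: instance of the level above
            "MetaEntity": ["Entity", "Attribute"],
            "MetaRelationship": ["Entity.Has.Attribute"],
        }

        a_model = {                           # a model: instance of the subject area
            "Entity": ["Part", "Assembly"],
            "Attribute": ["Part.Weight"],
            "Entity.Has.Attribute": [("Part", "Part.Weight")],
        }

        # every construct used in the subject area must come from the meta-meta-model
        assert set(data_modeling_subject_area) <= meta_meta_model
        print(a_model["Entity"])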

    A-3.2.1.2. CDIF Data Definition Subject Area

    The CDIF Data Definition (DDEF) subject area enables the definition of the types of data that a computer can distinguish as elementary values of the data defined in models of data and data flows. This subject area was developed separately from the Data Modeling subject area in order that it could be used separately by subject areas that needed to define data without using the whole of DM.

    A-3.2.1.3. CDIF Data Modeling Subject Areas

    The CDIF Data Modeling (DM) subject area supports traditional CASE techniques of data modeling. By the very nature of the process by which vendors of different tools came to agree on the standard, DM provides a rich and flexible meta-model for integrating data models. It is an ERA form of modeling which

    Although each of the features of the subject area is used by one tool or another, few tools will take advantage of all of those features. It remains to be seen whether a process can be put in place to exchange models through CDIF in a way that assures that models can be "round-tripped", i.e., sent from one tool through CDIF and back, without loss of information.
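
    One way to picture the round-trip problem is sketched below, with invented structures: whatever the interchange meta-model does not carry is lost on the way back.

        # Hypothetical round-trip check: export carries only the features the
        # interchange meta-model knows about; re-import reveals what was lost.

        interchange_features = {"entities", "attributes", "relationships"}

        def export(model):
            return {k: v for k, v in model.items() if k in interchange_features}

        def reimport(exchanged):
            return dict(exchanged)   # the receiving tool sees only what was carried

        tool_a_model = {
            "entities": ["Part"],
            "attributes": ["Part.Weight"],
            "relationships": [],
            "diagram_layout": {"Part": (120, 80)},   # tool-specific feature
        }

        round_tripped = reimport(export(tool_a_model))
        lost = set(tool_a_model) - set(round_tripped)
        print("lost in the round trip:", lost)       # {'diagram_layout'}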

    A-3.2.1.4. CDIF Data Flow Subject Area

    The CDIF Data Flow subject area is intended to enable the exchange of models which implement the various flavors of structured analysis, including Yourdon, SADT and IDEF0. It supports such features of data flow models as

    A-3.2.1.5. CDIF State-Event Subject Area

    CDIF also includes a preliminary State-Event subject area, which is intended to enable the exchange of models which implement the various flavors of finite-state process modeling, including state-transition diagrams and Petri nets.

    A-3.2.1.6. CDIF Object Modeling Subject Area

    EIA is beginning work on an Object Modeling subject area for CDIF. An approach to object definition, based on work by Rumbaugh and Booch, seems to be emerging from OMG, and is being explored by EIA for this purpose.

    A-3.3. IEEE (Institute of Electrical and Electronics Engineers) Computer Society

    The IEEE Computer Society (IEEE CS) is an ANSI-certified organization chartered to develop US computing standards.

    A-3.3.1. IEEE IDEF (Integrated Definition Language)

    IDEF (nee ICAM Definition Language) is a family of languages for modeling information systems. It was originally developed under the auspices of the US Air Force Integrated Computer-Aided Manufacturing (ICAM) project. It became a FIPS (Federal Information Processing Standard) and was managed by the IDEF Users Group. Recently projects have been initiated with IEEE CS to develop US standards for two languages in that family, IDEF0 and IDEF1X. It is anticipated that upon completion of the ANSI standardization process, these languages will be proposed to ISO/IEC JTC1/SC7 for international standardization.

    Part of the IEEE IDEF standards is an exchange language (public interface) called IDL, a somewhat unfortunate selection of a name since CORBA also uses the name IDL for its public interface.

    A-3.3.2. IEEE P1175

    P1175 is an IEEE standard that has many similarities to EIA CDIF. It is included here for completeness, but I am not sufficiently well-versed in it to provide any details.

    A-3.4. ISO TC184/SC4: Industrial Data

    SC4 is chartered with the development of international standards for industrial data. It is currently working on three such standards: STEP (STandard for the Exchange of Product model data), PLIB (Parts Library) and MANDATE (Manufacturing Data). Working documents of SC4 are available through SOLIS (STEP On-Line Information Service); International Standards (IS) and Draft International Standards (DIS) are copyrighted by and are available for sale through ISO.

    A-3.4.1. STEP (ISO 10303: STandard for the Exchange of Product model data)

    STEP is the most widely known of the SC4 standards, and so far the only one to have achieved publication as an International Standard.

    A-3.4.1.1. EXPRESS (ISO 10303-11)

    EXPRESS is the language used by ISO TC184/SC4 to define STEP. It is being used in an increasing number of applications outside of STEP.

    A-3.4.1.2. STEP File Exchange (ISO IS 10303-21)

    STEP Part 21 defines the mechanism for exchanging instances of the data defined in an EXPRESS model. (Unlike most other major Parts of STEP, Part 21 does not have a convenient, catchy acronym.) Part 21 has been the basis for tool vendors to write import and export functions that allow their tools to "speak STEP" in the form of exchangeable files. To my knowledge this is the first industry standard for a model-driven form of data exchange.

    A-3.4.1.3. SDAI (ISO DIS 10303-22: Standard Data Access Interface)

    SDAI is a standard for an application programming interface (API) that provides data manipulation functions for instances of data defined in an EXPRESS model. SDAI is intended as the basis for tool vendors to "speak STEP" to a shared database. Part 22 defines an abstract API which must be combined with a binding to a particular language. The language bindings currently under development are Part 23 (C++) and Part 24 (CORBA IDL).
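
    The architectural idea, an abstract access interface plus a concrete binding, can be sketched as follows; the operation names are invented and are not those defined in Part 22.

        # Illustration of the idea only: an abstract data-access interface and one
        # concrete binding of it. The names here are not the actual SDAI operations.

        from abc import ABC, abstractmethod

        class AbstractModelAccess(ABC):
            """The abstract interface: what operations exist, independent of
            language binding or storage."""
            @abstractmethod
            def create_instance(self, entity_type): ...
            @abstractmethod
            def put_attribute(self, instance, name, value): ...
            @abstractmethod
            def get_attribute(self, instance, name): ...

        class InMemoryBinding(AbstractModelAccess):
            """One concrete binding of the abstract interface."""
            def __init__(self):
                self._instances = []
            def create_instance(self, entity_type):
                self._instances.append({"type": entity_type})
                return self._instances[-1]
            def put_attribute(self, instance, name, value):
                instance[name] = value
            def get_attribute(self, instance, name):
                return instance[name]

        access = InMemoryBinding()
        part = access.create_instance("part")
        access.put_attribute(part, "weight_kg", 12.5)
        print(access.get_attribute(part, "weight_kg"))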

    A-3.4.1.4. STEP Generic Resources (ISO 10303-4x)

    The "40-Series" of STEP parts, i.e., ISO 10303-41 through -49, define the Generic Resource Models (GRM) of STEP. These are STEP schemas, defined in EXPRESS, that are not intended for direct implementation, but are to be reused in the development of implementable application protocols.

    A-3.4.1.5. STEP Application Protocols (ISO 10303-2xx)

    STEP is implemented as a collection of application protocols (APs), for example, STEP Part 203 (Configuration Controlled Design) and the other STEP Parts in the 200-series. Each AP defines a particular domain of sharable product data. An application protocol itself consists of several components:

    A-3.4.1.6. STEP Knowledge Exchange

    TC184/SC4 has long identified four levels of product data sharing for STEP:

    1. File exchange, currently implemented by Part 21
    2. Shared memory, no longer being considered
    3. Data sharing, planned to be implemented by Part 22 (SDAI)
    4. Knowledge exchange

    No work has been done to date on developing a standard for knowledge exchange.

    A-3.4.1.7. STEP Parametrics

    Although no general work has been done on STEP knowledge sharing, a substantial investigation has been made of techniques for exchanging parametric part definitions, i.e., rules that enable the generation of classes of parts from sets of parameters. As yet no STEP Parts have resulted from this activity.

    A-3.4.2. PLIB (ISO DIS 13584: Parts Library)

    PLIB (Parts Library) is another standard of ISO TC184/SC4, developed under WG2. Although it is intended to use as much of the STEP architecture and models as possible, it is developing facilities for including definitions of the concepts used to describe the parts in standard parts libraries, even when those descriptions do not conform to STEP. Thus PLIB will encapsulate semantic definitions in whatever form they might take, rather than develop a technique for specifying those semantics.

    A-3.5. ISO/IEC JTC1/SC7: Software Engineering

    JTC1/SC7 is chartered to develop international standards for software engineering. Among the subjects currently being addressed by its various working groups are the following:

    In practice it has been WG11 that has been concerned with standards for the exchange of CASE models. It is they who are reviewing CDIF as an international standard for this purpose, as a means of completing their Project 7.28 -- Software Engineering Data Description and Interchange (SEDDI).

    Because of its awareness that other organizations are also developing standards for CASE modeling, WG11 has established a number of liaisons:

    A-3.6. ISO/IEC JTC1/SC14: Data Elements

    ISO/IEC WD 11179, Specification and Standardization of Data Elements, is a working draft of a standard for the attribution, classification, definition, naming, identification and registration of data elements. This standard is being developed by ISO/IEC JTC1/SC14, with US input from ANSI X3L8.

    A-3.7. ISO/IEC JTC1/SC21: OSI

    A-3.7.1. ISO/IEC JTC1/SC21/WG3: Database

    A-3.7.1.1. SQL (Structured Query Language)

    SQL is a database query language based on the relational model of database structures. It is declarative in nature, in that a statement in SQL specifies what data is to be retrieved, added or updated, but leaves it up to the database implementation to determine both how to physically store data and how to execute SQL statements against that structure.
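
    The declarative character can be seen in a small runnable example; I use Python's built-in sqlite3 module here simply as a convenient stand-in for a relational DBMS, and the table and data are invented.

        # The SQL states what is wanted; the engine decides how to store and
        # retrieve it.

        import sqlite3

        db = sqlite3.connect(":memory:")
        db.execute("CREATE TABLE part (part_id TEXT PRIMARY KEY, weight_kg REAL)")
        db.executemany("INSERT INTO part VALUES (?, ?)",
                       [("PN-1", 12.5), ("PN-2", 3.2), ("PN-3", 40.0)])

        # 'What', not 'how': no access paths, storage structures or loops are specified.
        for row in db.execute("SELECT part_id, weight_kg FROM part WHERE weight_kg > 10"):
            print(row)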

    There is an international standard for SQL, the most recent version of which (SQL-92) is defined in ISO/IEC 9075:1992 and was developed by ISO/IEC JTC1/SC21/WG3. Most commercial relational database management systems (RDBMS) provide a dialect based on that standard, which usually offers some alluring extensions and also fails to implement some "non-essential" features.

    A version of the SQL standard is under development that is supposed to provide support for object-oriented applications.

    A-3.7.1.2. CSMF (Conceptual Schema Modeling Facilities)

    CSMF (Conceptual Schema Modeling Facilities) is a standard being developed by ISO/IEC JTC1/SC21/WG3 for defining conceptual schemas. Because WG3 is also responsible for the SQL data definition and manipulation language standard, the role of that language in the CSMF standard is a major consideration. Also under consideration are logic-based modeling languages, such as Conceptual Graphs and KIF, in order that the CSMF standard be adequate to emerging disciplines such as knowledge-based engineering.

    A-3.7.1.3. IRDS (Information Resource Dictionary System)

    IRDS is an industry standard for a repository to support the sharing of models of information systems.

    A-3.8. ISO/IEC JTC1/SC22

    A-3.8.1. ISO/IEC JTC1/SC22/WG2: PCTE (Portable Common Tool Environment)

    A-3.9. UN/EDIFACT

    A-3.9.1. Basic Semantic Repository (BSR)

    The Basic Semantic Repository (BSR) is being developed by the UN/EDIFACT committee to provide a library of terminology that can be used with minimum ambiguity in electronic commerce. An element of that library is referred to as a Basic Semantic Unit (BSU).

    A-3.10. Object Database Consortia and Standards

    There are several overlapping approaches to object sharing currently being debated. The competition (and consequent hyperbole) is so severe that the standardization process is going on outside the traditional standards organizations, and suggests that the fate of certain major companies rests on their success in making their approach a de facto standard, to be blessed by some standards organization when it gets around to it. Among the competitors are

    A-3.10.1. CORBA (Common Object Request Broker Architecture)

    CORBA is a commercial standard, developed by the OMG (Object Management Group) consortium, for an Object Request Broker (ORB) that will transmit messages between objects, regardless of the manner of their implementation, and thereby enable interoperability among any applications that conform to the standard, so long as those applications agree on the semantics of the objects being interchanged. The language defined by CORBA for expressing these messages is IDL, which is different from the language of the same name for the IDEF public interface.

    A-3.10.2. OLE (Object Linking and Embedding)

    OLE is a mechanism promoted and implemented by Microsoft to enable the assembly of complex documents whose components are created and maintained by different applications. OLE allows a document developer to collect data in a database application, link that data into a spreadsheet for analysis, and link the spreadsheet into a document in a word processor for editing of explanatory text, along with graphics linked from yet another application, which may itself be linked to the spreadsheet.

    Current implementations of OLE tend to be "coarse-grained" in that the objects that are embedded or linked are usually whole documents, tables, pictures, or large, named components of such objects. OLE does not currently facilitate linking individual data items or query results from databases.

    A-3.10.3. ODBC (Open Database Connectivity)

    ODBC is a standard for a set of interfaces that enables the exchange of data based on the SQL query language. ODBC is the result of collaboration between X/Open and the SQL Access Group. ODBC enables an application running on one machine to talk to data servers on other machines through drivers that handle the linkage between the SQL statements executed on the client and the facilities of the servers.

