221-TP-002-001

EOSDIS Core System (ECS) Design and Evolution (DRAFT)

Technical Paper: Not intended for formal review or government approval.

June 1995

Prepared Under Contract NAS5-60000

RESPONSIBLE ENGINEER
Ron Williamson, Eric Dodge, Carl Wheatley (signed 6/23/94)
EOSDIS Core System Project

SUBMITTED BY
Steve Fox (signed 6/23/94)
EOSDIS Core System Project

Hughes Applied Information Systems
Landover, Maryland

Contents

1.  Introduction
    1.1 Purpose
    1.2 Organization
    1.3 Acknowledgments
    1.4 Review and Approval
2.  EOSDIS Core System Context
    2.1 Introduction
    2.2 New, Comprehensive Earth Science Measurements
    2.3 Mission Context and Support Capabilities
3.  Key Features for Science Users and Data Providers
    3.1 Services-Based Architecture
    3.2 Object-Oriented Design
    3.3 Cohesive Information Model
    3.4 Focus on Data Quality
    3.5 Emphasis on Calibration & Validation
    3.6 Core and Value Added Provider Distribution of Science Products
    3.7 Robust Infrastructure Components
    3.8 Operational Resiliency
4.  EOSDIS Implementation
    4.1 Capabilities Developed as Needed for Mission Support
    4.2 Current Access to Selected NASA Data Sets
    4.3 Evolution of EOSDIS
    4.4 Multi-Track Development
    4.5 Managing Technology Insertion
5.  Science Community Involvement
    5.1 Users Involvement at Technical Levels
    5.2 Influence of User Feedback on EOSDIS
    5.3 Influence of Community Feedback on GCDIS and UserDIS Vision

Figures

    2-1. Examples of New Measurements Available from EOS Instruments
    2-2. ECS and EOSDIS / GCDIS / UserDIS Contexts
    3-1. High-level Services Provided by ECS Infrastructure
    3-2. ECS Services-Based Architecture
    3-3. Examples of Subsystem Design Features
    3-4. EOSDIS Data Pyramid
    3.6-1. The Value-Added Provider as a GCDIS Member
    3-5. Key ECS Automated Resource Components
    4-1. ECS Development Schedule with Example EOS Missions
    4-2a. Technology Projection of Cost Per MFLOP (Processing) as of April
    4-2b. Technology Projection of Cost Per Gbyte (Disk Storage) as of April

Tables

    2-1. The EOS Rebaselined Satellite Series
    2-2. Locations and Scientific Expertise of the EOSDIS DAACs
    3-1. ECS Subsystem Key Capabilities and Technologies
    4-1. Highlights of ECS Architectural Features that Facilitate Evolution
    4-2. Likely Near and Long Term Evolutionary Enhancements For ECS

1.  Introduction

1.1 Purpose

The purpose of this technical paper is to provide an overview of the EOSDIS Core System (ECS) design and highlight some of the key capabilities of the system. It focuses on key aspects of the science data processing architecture and design that will allow future support of a much broader user community. It describes how this architecture facilitates evolutionary change and discusses how such change is managed in a way that does not impact operational use of the system. It briefly describes the ECS development process, user involvement in this process, and the role these play in supporting evolution of the system.

1.2 Organization

This paper is organized into the following broad categories:

Chapter 2: ECS context within EOSDIS
Chapter 3: Key features of the ECS architecture and design
Chapter 4: ECS development strategy
Chapter 5: User involvement

1.3 Acknowledgments

This paper is based in large part on "A Science User's Guide to the EOSDIS Core System (ECS) Development Process" (160TP-003-001), written by Thomas Dopplick.

1.4 Review and Approval

This Technical Paper is an informal document approved at the Office Manager level. It does not require formal Government review or approval. Questions regarding technical information contained within this Paper should be addressed to Steve Fox, 301-925-0346, sfox@eos.hitc.com.
Questions concerning distribution or control of this document should be addressed to:

Data Management Office
The ECS Project Office
Hughes Applied Information Systems
1616 McCormick Dr.
Landover, MD 20785

2.  EOSDIS Core System Context

2.1 Introduction

NASA historically has built separate and centralized data and information systems for each science mission because of past focus on discipline-oriented research and the difficulties in building open, distributed systems. Modern technology now allows broad coupling of expanded services with advanced mission data collection to support interdisciplinary Earth science investigations, a necessary condition for resolving issues associated with global change. This revolution in science research is the fundamental reason for beginning the Earth Observing System (EOS), the centerpiece of Mission to Planet Earth, which is NASA's contribution to understanding global change.

The EOS Data and Information System (EOSDIS) provides end-to-end services from EOS instrument data collection to science data processing to full access to EOS and other Earth science data holdings. The EOSDIS infrastructure, the EOSDIS Core System (ECS), provides EOS and other U.S. and international scientists a broad range of desktop services from nine science data centers, the Distributed Active Archive Centers (DAACs, see Table 2-2 for locations), operated by NASA and other agencies. The ECS infrastructure also supports exchange of data and research results within the science community, across multiple agencies, and internationally. ECS is the evolutionary base for accelerating the pace of Earth science research. It has been designed to evolve cost-effectively to support the broader Global Change research community (GCDIS) as well as educational and commercial users (UserDIS).

2.2 New, Comprehensive Earth Science Measurements

Because of the complexity and spatial extent of the global Earth system, Earth science research needs to broaden its use of remote sensing observations to complement in situ and theoretical studies. However, most geophysical variables cannot be measured directly from orbit; rather, science algorithms applied to space measurements progressively convert raw instrument data to calibrated data to geophysical variables. A major strength of the EOS program is the breadth of geophysical variables for the global Earth system that can be derived from the numerous EOS instruments, such as variables for the land surface, oceans, and atmosphere, as well as related biogeochemical variables. Table 2-1 shows examples of this breadth as viewed through the science mission objectives of the rebaselined EOS satellite series. The EOS satellite series will provide continuous, long-term observations, whereas other EOS instruments will fly on available non-EOS satellites, such as the Tropical Rainfall Measuring Mission (TRMM) and international partner satellites. The number of derived variables and resultant new science products is large, with over 200 new science products from the EOS instruments on TRMM and the first EOS AM satellite.

Table 2-1. The EOS Rebaselined Satellite Series

Satellite: Mission Objectives
EOS AM: Clouds, aerosols, and radiation balance; characterization of the terrestrial ecosystem; land use, soils, terrestrial energy/moisture; tropospheric chemical composition; contribution of volcanoes to climate; and ocean primary productivity
EOS PM: Cloud formation, precipitation, and radiative properties; atmospheric temperature and moisture profiles; air-sea fluxes of energy and moisture; sea-ice extent; and soil moisture and snow over land
EOS ALT: Ocean circulation and ice-sheet mass balance
EOS CHEM: Atmospheric chemical composition and dynamics; chemistry-climate interactions; air-sea exchange of chemicals and energy

EOS investigators will analyze integrated, long-term Earth observations from a variety of new remote sensing instruments in characterizing the components of global climate change and their interactions. Several small- and intermediate-sized platforms will be flown (see examples in Figure 2-1), with each platform carrying a suite of sensors selected to address key global change issues. The view of Earth through these new sensors will be different from previous views; advances in remote sensing technology enable much higher resolution, spatially and spectrally, than has previously been possible on a global basis. For example, analysis of the 1 km data of the Moderate-Resolution Imaging Spectroradiometer (MODIS, see AM-1 in Figure 2-1) will enhance knowledge of global dynamics and processes on the surface of the Earth and in the lower atmosphere. The 36 wavelength bands that MODIS will simultaneously measure have been selected based on prior experience to provide the necessary information to separate convolved surface and atmospheric effects.
Similarly, the high spectral resolution of the Atmospheric Infrared Sounder (AIRS, see PM-1 in Figure 2-1), with over 2000 spectral channels, will yield much improved vertical resolution of atmospheric temperature, water vapor, and ozone measurements. In addition, AIRS will allow much improved characterization of the spectral properties of clouds and land, and a more accurate determination of the height of cloud tops.

Each of the remaining instruments has much to offer individually, but together they offer additional, synergistic benefit. Simultaneous observations of the same phenomena by complementary sensors allow much improved corrections to be developed for atmospheric uncertainties. For example, many derived geophysical variables, such as surface-leaving radiances, require correcting for atmospheric uncertainties in order to produce an accurate science product. The synergism of these simultaneous measurements is very likely to lead to further improvement in the characterization of the land and ocean properties, clouds, aerosols, atmospheric processes, and radiative balance. The EOS Reference Handbook (Asrar and Dokken, 1993) contains details on all of the EOS instruments and is readily available via the World Wide Web using the URL provided in the reference at the end of this chapter.

Figure 2-1. Examples of New Measurements Available from EOS Instruments.

2.3 Mission Context and Support Capabilities

EOSDIS is an integrated system that supports multiple satellites and instruments, not a unique system for each. EOS includes instruments on satellites to be launched by NASA, the European Space Agency (ESA), and the National Space Development Agency of Japan (NASDA). Figure 2-2 provides a schematic view of the data and information system flows from the EOS instruments to the expanding user and service provider base in the ECS, EOSDIS, GCDIS, and UserDIS contexts.
The ECS consists of the shaded inner boxes of Figure 2-2, plus facilities for operation of the NASA EOS satellites and instruments and NASA EOS instruments on International Partner satellites. The expanded GCDIS and UserDIS contexts support the addition of value added provider sites and an expanded base of educational and commercial users.

The data from each EOS instrument will be sent to a Distributed Active Archive Center (DAAC) responsible for processing, archiving, and distributing EOS and related data. These data centers will house the ECS computing facilities and operational staff needed to produce EOS Standard Products and to manage, store, and distribute EOSDIS data, as well as the associated metadata and browse data, that allow effective use of the data holdings. The DAACs will exchange data via dedicated EOSDIS networks to support processing at one DAAC or Science Computing Facility (SCF) which requires data from another DAAC or SCF. NASA selected the DAACs based on their expertise in specific science disciplines (indicated in Table 2-2) and demonstrated long-term commitments to the corresponding user communities.

Figure 2-2. ECS and EOSDIS / GCDIS / UserDIS Contexts

Table 2-2. Locations and Scientific Expertise of the EOSDIS DAACs

DAAC; Location; Expertise
Alaska SAR Facility (ASF), University of Alaska; Fairbanks, AK; Sea Ice and Polar Processes Imagery
EROS Data Center (EDC), U.S. Geological Survey; Sioux Falls, SD; Land Processes Imagery
Goddard Space Flight Center (GSFC), NASA; Greenbelt, MD; Upper Atmosphere, Atmospheric Dynamics, Global Biosphere, and Geophysics
Jet Propulsion Laboratory (JPL), NASA; Pasadena, CA; Ocean Circulation and Air-Sea Interaction
Langley Research Center (LaRC), NASA; Hampton, VA; Radiation Budget, Aerosols and Tropospheric Chemistry
Marshall Space Flight Center (MSFC), NASA; Huntsville, AL; Hydrology
National Snow and Ice Data Center (NSIDC), University of Colorado; Boulder, CO; Cryosphere
Oak Ridge National Laboratory (ORNL), Department of Energy; Oak Ridge, TN; Ground-based data relating to Biogeochemical Dynamics
Socio-Economic Data and Applications Center (Consortium for International Earth Science Information Network - CIESIN); Saginaw, MI; Socio-Economic Applications

The DAACs also house systems for processing and/or storage of non-EOS Earth science data. For example, the Alaska SAR Facility currently provides systems for receiving, processing, and archiving data from Synthetic Aperture Radars on International Partner platforms. A broader provider base supporting value added services in the GCDIS and UserDIS timeframes will enhance the system services and reduce overall core system resource loading.

Open access to the data by all members of the science community (and a larger educational and commercial community in the GCDIS and UserDIS era) distinguishes EOS from previous research satellite projects, where selected investigators had proprietary data rights for a number of years after data acquisition. This open data policy will lead to greater utilization of EOS data products, for global change research and other value added applications. The EOS program is also distinguished by the large number of funded investigators (over 500), who provide expertise across the broad range of scientific disciplines in Earth system science.

Science Computing Facilities (SCFs), at EOS investigators' home institutions, are used to develop and maintain algorithms (for both Standard and Special Products), calibrate the EOS instruments, validate data and algorithms, generate Special Products, provide data and services to other investigators, and analyze EOS and other data in pursuit of the overall science objectives. The SCFs may range from single workstations to large supercomputer data centers. While the SCFs will be developed and acquired directly by the EOS investigators, ECS will provide software toolkits to the SCFs and other users to facilitate data access, transformation, and visualization, and for science algorithm development. Some SCFs will play an operational role in quality control of the EOS Standard Products.

Value-added service providers obtain data from the core system and integrate it with data from sources external to ECS. The results are made available through custom services directed at specialized user communities; for example, to environmental monitoring arms of large industrial companies to assist in local factory waste and toxic by-product control. The sizes of the organizations providing such services vary widely, from individual consultants to specialist data centers (e.g., local governments, large oil-company consortia). The education community is also supported by value-added service providers, who support services and offer data appropriate for the education and training community. Common uses of the data are:

• supporting resource exploitation (construction, mineral extraction, etc.)
• inventorying, mapping
• environmental impact studies
• supporting legal investigations
• information provided to operators influenced by environmental conditions (ocean routing, offshore operations, large scale civil engineering)

For more details on EOSDIS and ECS, a suggested reading list is provided at the end of each chapter.
Several of the suggested documents are readily available via the World Wide Web at the referenced URLs.

Suggested Reading

Asrar, G. and Dokken, D., 1993: 1993 EOS Reference Handbook. NASA Headquarters. URL: http://spso.gsfc.nasa.gov/eos_reference/TOC.html
Asrar, G. and Dozier, J., 1994: Science Strategy for the Earth Observing System. NASA Headquarters.
EOS Instrument Principal Investigators, 1994: Algorithm Theoretical Basis Documents. URL: http://spso.gsfc.nasa.gov/atbd/pg1.html
James, M., 1995: EOSDIS Data Set Reference Handbook. In preparation, to be published in Spring 1995 by NASA/GSFC.
Unninayar, S. and Bergman, K., 1993: Modeling of the Earth System in the Mission to Planet Earth Era. NASA Headquarters.

3.  Key Features for Science Users and Data Providers

3.1 Services-Based Architecture

The science community and the National Research Council reviewed earlier architectural concepts for ECS and strongly recommended changes to make ECS more extensible and open. Their recommendations were incorporated and presented at the ECS System Design Review in June 1994. After receiving approval to proceed, the revised system design was baselined for ECS.

From a science user's perspective, the ECS infrastructure appears as services, and interfaces to services, which appear on the scientist's desktop. This perspective is similar to the view of the World Wide Web (WWW) servers as seen through a local client such as Mosaic or Netscape. When scientists invoke ECS services, they are actually exercising the client environment running in their local workstation (Figure 3-1).

Figure 3-1. High-level Services Provided by ECS Infrastructure

The ECS services-based architecture (Figure 3-2) serves a wide range of user needs and allows scientists to focus on global change research rather than computer science details. The Client Services Layer allows users to easily explore and locate provider services that are advertised through the Interoperability Layer.
Advertisements provide complementary, coupled views of services, data sets, and providers. After identifying a service of interest, users can immediately invoke the provider service, or the reference to the provider service can be saved on the desktop for later use. For example, users can locate and invoke an EOSDIS-wide service, such as cross-DAAC searching, or choose local DAAC services, such as search of local data collections and DAAC-unique subsetting. A user can also query directly, using science-oriented forms or free text, for specific provider services. Providers are not limited to the nine DAACs; providers can be other data centers, other scientists, or value added service providers who adopt ECS interfaces and protocols.

Figure 3-2. ECS Services-Based Architecture

ECS has chosen an open architecture that allows provider autonomy and independent, evolutionary development of components to improve services offered to users. Because of logical distribution, users and providers can tailor their components to their environments and still operate together as an extended distributed system such as found on the World Wide Web.

3.2 Object-Oriented Design

The engineering process that transforms the architecture to design is defined and described in numerous ECS technical documents, and the results are presented in formal reviews throughout the development life cycle. Using an object-oriented design methodology, the ECS design is iteratively refined with progressively more definition of subsystems, lower-level objects, and associated interfaces. Figure 3-3 highlights some of the subsystem design features of importance to science users and data providers. For example, the client subsystem provides users a desktop workbench for accessing a broad range of services such as search services. Other subsystems, such as the planning subsystem, allow the DAACs to plan for instrument data processing in a data-driven mode as well as respond to requests for on-demand processing.
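
As an illustration of the advertising concept described in Section 3.1, the following minimal Python sketch shows providers registering services and a client locating them by free text. It is not the ECS design: the class names, field names, and the "Example VAP" provider are hypothetical, and the real Interoperability Layer is not specified at this level in this paper.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Advertisement:
    """One advertised service (all field names are illustrative assumptions)."""
    provider: str          # e.g. a DAAC or a value-added provider
    service: str           # e.g. "search", "subset"
    data_sets: tuple       # data collections the service applies to
    scope: str = "local"   # "local" to one provider, or "eosdis-wide"


@dataclass
class AdvertisingService:
    """Toy stand-in for the lookup role played by the Interoperability Layer."""
    _ads: list = field(default_factory=list)

    def advertise(self, ad: Advertisement) -> None:
        # Providers publish what they offer.
        self._ads.append(ad)

    def find(self, text: str) -> list:
        """Free-text match against service, provider, and data set names."""
        needle = text.lower()
        return [
            ad for ad in self._ads
            if needle in ad.service.lower()
            or needle in ad.provider.lower()
            or any(needle in d.lower() for d in ad.data_sets)
        ]


registry = AdvertisingService()
registry.advertise(Advertisement("GSFC DAAC", "search", ("ozone profiles",)))
registry.advertise(
    Advertisement("EOSDIS", "cross-DAAC search", ("all",), scope="eosdis-wide")
)
registry.advertise(Advertisement("Example VAP", "subset", ("sea ice maps",)))

hits = registry.find("search")
print([(ad.provider, ad.service) for ad in hits])
```

In this sketch the client never needs to know in advance which providers exist; it only queries the advertisements, which is the property that lets non-DAAC providers join by adopting the same interface.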
Table 3-1 provides a brief list of major capabilities provided by each of the subsystems introduced in Figure 3-3. Evolutionary upgrades will be implemented depending on technology and cost considerations. For example, content-based searching of EOS products is not practical with current technology and is a candidate for a future evolutionary upgrade.

Figure 3-3. Examples of Subsystem Design Features

The ECS Team adopts commercial-off-the-shelf (COTS) design solutions wherever possible in order to leverage the technological revolution underway in areas such as open systems and distributed computing. ECS draws heavily on the R&D technology of the entire commercial marketplace, resulting in a design that uses 100% COTS hardware and extensive COTS software for long term affordability and supportability. COTS hardware is selected based on projections of performance versus cost to maximize return on evolution of commercial technology. The use of "just-in-time" COTS hardware procurement is discussed in Section 4.1 below.

Table 3-1. ECS Subsystem Key Capabilities and Technologies

Client
  Provides: "Client" part of "Client" / "Server" access paradigm through graphical user interface and data/service access tools, as well as application program interface (API) libraries.
  Key Technologies:
  • Hypertext Markup Language (HTML) based electronic document access
  • World-Wide-Web (WWW) accessible
  • Desktop Management (for graphical interfaces)
  • Extensible Workbench provides tools for data access, search, document access, etc.

Interoperability
  Provides: application level "middle-ware" which facilitates dynamic client access to service providers holding data collections and associated services.
  Key Technologies:
  • Dynamic linking of client application to local/remote provider services (via ECS CSMS)
  • Advertising agents which link clients to new and changing provider collections and services
  • Internet and WWW compatible and visible

Data Management
  Provides: distributed search and access services with a science discipline view of data collections and "one stop shopping" location transparent access to those services and data.
  Key Technologies:
  • Distributed Data Base Management System (DBMS) paradigm
  • Data Dictionary provides data names and explanations of the data, and access operations
  • DBMS / service access gateways

Data Server
  Provides: individual data collection(s) management, search, access, long term storage and distribution facilities, and offers views of such information at the data type level.
  Key Technologies:
  • Multi-Terabyte near-line/on-line archives (robotics based and DAAC site tailored)
  • Electronic/network linked and media based distribution facilities
  • Multi-File Storage Management Systems (FSMS)
  • DBMS / OODBMS
  • High capacity network accessible staging
  • Data administration

Data Ingest
  Provides: the "clients" for the importation of data (science products, ancillary, correlative, documents, etc.) into ECS data repositories (Data Servers) on an ad hoc or scheduled basis, and deals with external system interface specifics. For EOS L0 data: a special Data Server.
  Key Technologies:
  • Metadata extraction and generation facilities
  • Data format and translation tools (e.g., EOS HDF) as well as pre-processing capabilities
  • For EOS L0 data: archive robotics, networked staging, Multi-FSMS, DBMS, etc.

Planning
  Provides: for pre-planning of routine/ad hoc/on-demand science data processing as well as management functions for handling deviations from the operations plan for individual DAAC sites. Inter-DAAC plan coordination facilities are also provided.
  Key Technologies:
  • Graphical User Interface for users and operations personnel (via Client GUI support)
  • Client / Server planning tools and DBMS environment

Data Processing
  Provides: the functions to host science algorithm software, perform data processing, process resource management (job dispatch, management and control), and includes facilities and toolkits which offer true software portability across advanced computing platforms as well as science software integration, test, and configuration management.
  Key Technologies:
  • Range of processors (from workstations, to supercomputers, to MPPs, etc. as needed)
  • High speed I/O and data staging hardware (applied as needed, site tailored)
  • Distributed and parallel processing environments as needed
  • Resource optimization and management (distributed processing job control and mgmt.)

3.3 Cohesive Information Model

To implement a services-based architecture, the ECS Team must identify, analyze, and capture the context and relationships of the science data and information holdings within EOSDIS, i.e., develop an information model. For example, a search service for sea surface temperature (SST) assumes an information model exists within ECS that models the relationship of the geophysical parameter SST with other SST-related variables such as instrument data attributes, space and time variability, and the science algorithm that converted the instrument data to SST. Thus, from product generation to product access, the context and relationships of science data and related variables must be modeled so users can request and apply services to the data.

The choice of an information model is not arbitrary; rather, it is based on careful examination of how data are used within EOSDIS by EOS scientists. Science data usage in EOSDIS is closely tied to data products and the uses of those products, which implies a need to generate, store, and access varying collections and levels of data across distributed providers.
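
The SST example above can be made concrete with a small Python sketch of how a product record might carry its context (instrument, algorithm, space and time coverage) so that a search service can use it. This is only an illustration under stated assumptions: the class names, field names, and the "ExampleRadiometer" and "example_sst" values are hypothetical and are not taken from the ECS information model.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Instrument:
    name: str
    platform: str


@dataclass(frozen=True)
class Algorithm:
    name: str
    version: str


@dataclass(frozen=True)
class Coverage:
    start: date
    end: date
    region: str


@dataclass(frozen=True)
class Product:
    """A science product together with the context needed to interpret it."""
    parameter: str          # geophysical parameter, e.g. "SST"
    instrument: Instrument  # source of the instrument data
    algorithm: Algorithm    # science algorithm that derived the parameter
    coverage: Coverage      # space and time extent


def search(products, parameter, on_day):
    """Return products for a parameter whose time coverage includes on_day."""
    return [
        p for p in products
        if p.parameter == parameter
        and p.coverage.start <= on_day <= p.coverage.end
    ]


# Hypothetical catalog entries, for illustration only.
radiometer = Instrument("ExampleRadiometer", "ExampleSat-1")
january = Coverage(date(1999, 1, 1), date(1999, 1, 31), "global")
catalog = [
    Product("SST", radiometer, Algorithm("example_sst", "1.0"), january),
    Product("cloud type", radiometer, Algorithm("example_cloud", "1.0"), january),
]

matches = search(catalog, "SST", date(1999, 1, 15))
print(matches[0].algorithm.name, matches[0].instrument.name)
```

The point of the sketch is that the search operates on relationships (parameter to algorithm to instrument to coverage) rather than on file names, which is what allows a science-oriented query.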

The starting point for developing the ECS information model was to characterize Earth science data into broad, multi-layered data categories as represented in the data pyramid (Figure 3-4). The data pyramid identifies data categories; it does not identify the context or relationships between the data categories. To understand relationships, extensive analysis is required to identify and relate the complex interactions between and among the different layers of the pyramid. Logical collections of data, based on their expected relationships, are developed to capture the variability in remote sensing instruments, science disciplines, and other characteristics of the Earth science community. For example, some EOS products have related properties (e.g., cloud type and cloud drop size) while other products are dissimilar (e.g., land vegetation indices and ocean productivity), which suggests certain logical groupings. Characteristics are often similar across a particular science discipline; often similar across products generated from a given instrument; but often different between provider sites because of differing science discipline focus, as well as organizational autonomy.

Figure 3-4. EOSDIS Data Pyramid

Defining logical data collections typically begins with a systematic study of possible logical groupings with the goal of identifying similar characteristics. The resulting model is then examined from an applications perspective, such as the expected access pattern to a logical collection. This process produces an information model which is capable of describing data in context. Logical collections are the basis for populating the ECS Data Servers at the distributed DAACs.

A significant benefit of the ECS logical information model is the ability to present the science user an Earth-science view instead of a computer-science view of data.
Also, by making the ECS information model available to researchers analyzing data, EOS scientists developing algorithm software, and scientists developing their own products, individual development can proceed within an overall information management framework that results in a consistent system view across widely distributed components of EOSDIS.

3.4 Focus on Data Quality

Interdisciplinary scientists will be using a wide range of EOS data, from direct instrument measurements to geophysical variables derived using complex science algorithms. These scientists need not have detailed knowledge of the intricacies and uncertainties associated with all the instruments and the science algorithms. Rather, access to data quality heritage will promote correct usage and application of the EOS products in interdisciplinary investigations.

Three Quality Assurance (QA) processes are applied to the production of standard products, two at the DAACs and the third at the Science Computing Facilities (SCFs). First, startup QA is performed when production of a specific product is halted in order to assess the quality of intermediate output(s). Startup QA will be used during initial algorithm assessment, rather than during routine processing operations, when operational algorithms are expected to perform internal in-line QA.

Second, off-line QA involves DAAC operations performing quality checks of output from algorithm processing after completion of an algorithm process. The checks might be visual checks requested by the instrument team as part of their operations concept for the algorithm, or processing consistency checks on product formats, metadata content, etc. DAAC operations perform QA using core services or with specialized software developed by the science user providing the algorithm. When QA is complete, QA results (inventory attributes, QA summary report, etc.) are automatically updated in the product processing history, which becomes available to any user of the product.
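
The way QA results accumulate in a product's processing history can be sketched as follows. This is a minimal Python illustration, not the ECS implementation: the class names, field names, and the example product identifier are hypothetical, and the three QA kinds simply mirror the three processes named in this section.

```python
from dataclasses import dataclass, field
from enum import Enum


class QAKind(Enum):
    STARTUP = "startup"    # at the DAAC, during initial algorithm assessment
    OFFLINE = "off-line"   # at the DAAC, after an algorithm process completes
    SCF = "scf"            # at the Science Computing Facility


@dataclass(frozen=True)
class QAResult:
    kind: QAKind
    performed_at: str      # site name, illustrative only
    passed: bool
    summary: str           # stands in for a QA summary report


@dataclass
class ProductRecord:
    """A product plus its processing history, visible to any user."""
    product_id: str
    history: list = field(default_factory=list)

    def record_qa(self, result: QAResult) -> None:
        # QA results are appended, never overwritten, so heritage is kept.
        self.history.append(result)

    def qa_summary(self) -> dict:
        """Pass/fail flags grouped by QA kind, in the order recorded."""
        return {
            kind.value: [r.passed for r in self.history if r.kind is kind]
            for kind in QAKind
        }


record = ProductRecord("example-product-001")
record.record_qa(
    QAResult(QAKind.OFFLINE, "DAAC", True, "format and metadata checks passed")
)
record.record_qa(
    QAResult(QAKind.SCF, "SCF", False, "visual check flagged an anomaly")
)
print(record.qa_summary())
```

Keeping the results as an append-only history, rather than a single quality flag, is one way to give interdisciplinary users the data quality heritage the section describes.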

Third, SCF QA is similar to the off-line QA except that it is performed at the appropriate SCF. The QA in this case is performed using whatever manual or automated procedures the SCF team deems appropriate. The results from the QA may include updated metadata, annotations to the data set, and QA products added to the processing history. Again, the SCF QA results are available to any user of the product.

3.5 Emphasis on Calibration & Validation

NASA has placed strong emphasis on pre-launch and post-launch calibration of EOS instruments, as well as validation of the EOS standard products, so that interdisciplinary scientists need not have detailed knowledge of the instruments and the science algorithms. To ensure delivery of accurate and reliable measurements, on-orbit and ground resources have been specifically allocated to conduct routine calibration, such as periodic measurements in orbit of on-board calibration sources, as well as celestial sources such as cold space, the moon, or the sun. Field calibration exercises will also provide ground-truth calibration of some instruments.

Maintaining accuracy and calibration over long time periods is crucial to studies of climate and global change. Lack of calibration has limited the use of operational satellite data in global change studies and has spawned a joint NASA and NOAA retrospective effort, the Pathfinder Project, to develop stable calibration of historical raw data and consistent intercalibration among instruments in a series. Consistent intercalibration allows data to be chained together, for instruments in a series, to create longer time series.

Validation of the EOS science algorithms begins with careful analysis of the theoretical basis for the algorithms. Each of the EOS instruments on TRMM and EOS-AM1 produced an Algorithm Theoretical Basis Document which was formally reviewed by NASA and the science community.
These documents provide a wealth of information about the science algorithms, including expected accuracy of the derived geophysical variables. Each EOS instrument team is planning post-launch validation of their science algorithms through techniques such as comparisons with in situ measurements, field measurements, and intercomparisons with other EOS and non-EOS derived products.

3.6 Core and Value Added Provider Distribution of Science Products

In addition to providing facilities and resources for generating standard and special products, ECS will provide facilities and resources for scientists to migrate their data to the DAACs for high quality archiving and distribution, and to other third party provider sites, such as value added providers (VAPs), in an EOSDIS and GCDIS context. Special Data Products (generated as part of a research investigation using EOS data for a limited region or time period, or products that are not accepted as standard by the EOS Investigator Working Group and NASA Headquarters) will normally be generated at investigator Science Computing Facilities (SCFs). Scientists who create their own local products have two options for distributing their results. They can: 1) return these Special Products to a DAAC where they will be ingested, archived, and made available to the general science community, or 2) establish a local provider site and advertise services available at their site via ECS.

Migration to a DAAC will involve a peer review process in which science issues, demand, and allocation of scarce resources are key factors. A limited amount of storage has been reserved at the DAACs for migration of Special Products from the interdisciplinary teams. Requests from other scientists will be resolved through the peer review process. By adhering to published ECS interfaces and protocols, scientists can use ECS core services, such as ingest, to transfer their product to a DAAC.
Another option for distributing locally produced products is to establish a local provider site and advertise the availability of a new data collection and related services, such as search and retrieval services, along with the associated protocols. In this context, a value-added provider (VAP) is a third party who, through the interoperability service of ECS, offers services to EOSDIS and the extended GCDIS/UserDIS community. The VAPs, as service providers, provide high leverage to other agencies in the form of reuse cost savings, potential for the addition of new science products, and potential for service additions to the general user community. Figure 3.6-1 illustrates the concept of a VAP as a GCDIS member from a user modeling perspective. From a system perspective, the VAP is similar in many ways to other ECS and EOSDIS providers, and can integrate with the EOSDIS infrastructure and application services.

Figure 3.6-1. The Value-Added Provider as a GCDIS Member

Commercial applications of earth science data are growing steadily. Value-added service providers obtain data and fuse it with other information. The results are made available through custom services directed at relatively few users; for example, to oil companies using earth science data to target exploration activities or to manage offshore operations. The sizes of the companies providing such services vary widely, from individual consultants to specialist data centers (e.g., ocean-routing, sea-ice mapping). Common activities include the support of resource exploitation (construction, mineral extraction, etc.), inventorying and mapping, environmental impact studies, legal investigations, and provision of information to operators affected by environmental conditions. The education community is also served by value-added service providers, who offer services and data appropriate for education and training.
ECS supports value-added service providers through an architectural framework within which they can advertise their services and allow users of the wider GCDIS/UserDIS network to access them.

3.7 Robust infrastructure components

End-to-end EOSDIS services depend on ECS providing a robust infrastructure, with some components having high reliability, high throughput, or large storage capacity. Certain mission critical components must be highly reliable to support launches and to ensure that data are not lost. Examples of mission critical components include: 1) command and control of EOS spacecraft and instruments, and 2) maintenance of reliable, long-term data archives for global change research. Loss of either space assets or long-term data would seriously impair the EOS mission. High reliability is designed into mission critical components only where necessary, since high reliability is a significant cost driver.

Other ECS components, i.e., the EOS data processing components, provide high throughput in order to ingest, process, and archive the high data rates from EOS instruments. Capturing the raw EOS instrument data (~200 gigabytes/day in mid-1999) and processing it to the level required to confirm data validity are mission critical functions. Downstream processing of higher level products, however, is important but not mission critical, since recovery from processing errors or loss of data products can be accomplished by reprocessing from lower level input data. Processing of standard products is being reevaluated by NASA and the instrument teams, and the reevaluation will likely result in phasing of products into EOSDIS after launch of EOS spacecraft. Nevertheless, processing demands will be high, with tens of GFLOPs expected to be needed to process the suite of instruments on each EOS spacecraft. Although processing power is important, other ECS components are equally important in building a robust infrastructure (see Figure 3-5).

Figure 3-5. Key ECS Automated Resource Components

As a result of the high processing throughput and resultant product generation, ECS also must provide archive (near-line) and on-line storage on a scale not seen before by NASA data centers. For example, projected cumulative archive storage of all DAAC archives for other NASA data is estimated to be ~29 terabytes by mid-1999, whereas cumulative storage for archiving EOS data is expected to be ~500 terabytes by mid-1999, growing to multiple petabytes [1] during the lifetime of EOS. Petabyte archive storage alone requires high network bandwidth (gigabits/sec) for data transfer to and from the archive. Further, terabytes of on-line storage will be needed for timely distribution of data to local user sites.

Processing power (GFLOPs), archive storage (petabytes), on-line storage (terabytes), and high bandwidth of local area networks (gigabits/sec) must be addressed collectively as a system to achieve a viable ECS infrastructure. Individual components at each DAAC are chosen based on detailed modeling that seeks to maximize service to end users while choosing the most cost effective components at each DAAC and across EOSDIS. In addition to the initial costs of ECS components, maintenance of a viable infrastructure throughout the EOS life cycle involves recurring costs for routine maintenance and operations, as well as periodic upgrades to avoid technology obsolescence.

3.8 Operational Resiliency

ECS has many design features that improve operational effectiveness of user-related services. For example, partitioning of major functions, such as separating flight operations processing from science data processing, is widely used to avoid resource competition. Also, extensive error prevention and recovery in data handling will ensure end-to-end data integrity. In addition, ECS is being designed to minimize the amount of human interaction required for routine operations.
For example, automated monitoring of performance and usage will allow problems to be anticipated, and solutions applied, before they adversely impact user services.

ECS operations will automatically track histories of search access to data collections. Low demand by science users may indicate that the collection does not adequately meet their needs and should be further analyzed for possible reorganization. Detailed analysis may indicate that only an isolated subset of the collection is accessed, and that creation of a new logical view or fragmenting into smaller collections may better serve the needs of that particular community. High demand may portend a performance bottleneck, in which case replication would be appropriate. Also, ECS operations will track user access patterns to determine whether users' ad hoc needs are becoming more routine and established, indicating the need to create another 'view' or context within the Data Dictionary for existing data and services, as well as for potential new collections.

Suggested Reading

Case, L., 1994: JU9405V1 SDPS Prototyping Plan White Paper. URL: http://edhs1.gsfc.nasa.gov
ECS, 1994: ECS Prototyping and Studies Plan. URL: http://edhs1.gsfc.nasa.gov
ECS, 1994: Summary of the ECS System Design Specification. URL: http://edhs1.gsfc.nasa.gov
Elkington, M., Meyer, R. and McConaughy, G., 1994: Defining the Architecture of EOSDIS to Facilitate Extension to a Wider Data Information System. Proceedings of the ISPRS'94, Ottawa.
Endal, A., 1994: FB9402V2 ECS Science Requirements Summary - Version 2. URL: http://edhs1.gsfc.nasa.gov
Moxon, B., 1994: FB9401V2 EOSDIS Core System Science Information Architecture. URL: http://edhs1.gsfc.nasa.gov
Moxon, B., 1993: 19300611 Science-based System Architecture Drivers for the ECS Project. URL: http://edhs1.gsfc.nasa.gov
National Research Council, 1994: Panel to Review EOSDIS Plans, Final Report. National Academy Press.
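The access-tracking policy described in Section 3.8 can be illustrated with a minimal sketch. The thresholds, field names, and the `recommend` function below are hypothetical and chosen only for illustration; the paper gives no numeric criteria and does not describe how ECS operations would implement this analysis.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the paper states no numeric values.
LOW_DEMAND = 10        # searches per tracking period counted as "low demand"
HIGH_DEMAND = 10_000   # searches per tracking period counted as "high demand"
SUBSET_SHARE = 0.9     # share of accesses concentrated in a single subset


@dataclass
class CollectionUsage:
    name: str
    searches: int             # search accesses recorded in the tracking period
    top_subset_share: float   # fraction of accesses that hit the most-used subset


def recommend(usage: CollectionUsage) -> str:
    """Map a tracked access history to one of the actions named in Section 3.8."""
    if usage.searches >= HIGH_DEMAND:
        # High demand may portend a performance bottleneck.
        return "replicate"
    if usage.top_subset_share >= SUBSET_SHARE:
        # Only an isolated subset of the collection is being accessed.
        return "create logical view or fragment"
    if usage.searches <= LOW_DEMAND:
        # The collection may not adequately meet user needs.
        return "analyze for reorganization"
    return "no action"


print(recommend(CollectionUsage("example-collection", 50_000, 0.2)))
```

In practice the recommendation would be an input to operator review rather than an automatic action, since the paper describes reorganization as following "detailed analysis".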
4. EOSDIS Implementation

4.1 Capabilities Developed as Needed for Mission Support

Step-wise implementation of ECS was chosen to accommodate a compressed development schedule and to leverage technology evolution. Initial capability for the Tropical Rainfall Measuring Mission (TRMM) will be delivered by December 1996, with EOS AM-1 launch-ready capability delivered by September 1997 (Figure 4-1).

Figure 4-1. ECS Development Schedule with Example EOS Missions

Release A will support data functions (processing, archiving, data search and access, and distribution) for TRMM, in addition to providing for flight operations interface testing and end-to-end data flow testing for EOS AM-1. Release A includes the algorithm development toolkit to support transition of algorithm software developed by the science community into the ECS DAACs. Release A will also include an Interim Release (IR-1) to support early interface testing and initial algorithm integration and test. Release B will provide flight operations for EOS AM-1 and data functions for EOS AM-1, Landsat 7, and ADEOS II. Releases C and D will support future EOS missions, such as EOS PM-1, and will incorporate evolutionary changes such as new processing and storage technologies. Successive releases will provide expanded and increasingly enhanced data search and access, based on feedback from the science community.

Computer technology projections indicate that significant benefit can be achieved by delaying hardware purchases until needed for mission support. For example, Figures 4-2a and 4-2b show ECS projections of cost per theoretical MFLOP (processing) and cost per gigabyte (on-line disk storage) over the next 5 years. These projections are based on information provided by major vendors of computer hardware and are updated by ECS as better information becomes available. Such technology projections suggest a "just-in-time" purchasing strategy to maximize the computer resources obtained for the available funding.
Also, some commercial software technologies are not expected to be mature enough for use in early EOS missions and will be incorporated as evolutionary improvements in later EOS missions.

4.2 Current Access to Selected NASA Data Sets

EOSDIS Version 0 is the initial implementation of EOSDIS as a working prototype system that allows users to search for, and order data from, several DAACs in a single session. Through interconnection of the existing DAAC information management systems, Version 0 serves as a functional prototype of selected EOSDIS services. As a prototype, it does not have all the capabilities, fault tolerance, or reliability of later versions; however, EOSDIS Version 0 supports use by the scientific community in day-to-day research activities. Such use tests existing services to determine what additional or alternative capabilities are required of the full EOSDIS. The current Version 0 functions are data search and order, display of the geographic coverage of selected data on an orthographic projection of the Earth, reduced resolution images for browsing the data, and detailed information on the data selected for order processing.

Although EOS data products are not yet available, data from the Pathfinder program are currently being used by researchers. The Pathfinder data are research-quality global change data sets available to Earth scientists through Version 0. Large remote-sensing data sets applicable to global change research have been developed from existing global and/or regional data sets. Higher level geophysical products are derived from peer-reviewed algorithms. The Pathfinder data sets have met stringent quality assurance and access requirements, such as stable calibration of the raw data, consistent intercalibration among different instruments, and any necessary archiving to a more accessible medium.
Version 0 will evolve toward the next generation EOSDIS, with ECS taking maximum advantage of existing experience and ensuring that no disruption occurs in services to current users.

4.3 Evolution of EOSDIS

Over the EOSDIS lifetime (at least two decades beyond the launch of the first EOS spacecraft), evolution will come from at least three separate sources: 1) scientific needs will change as Earth system science matures and new applications of the data emerge; 2) information system technologies must be refreshed as maintaining older technologies becomes more difficult and new technologies displace them; and 3) changes in the information infrastructure (e.g., high bandwidth networking) will lead to migration of functions to take full advantage of these capabilities.

A key ECS goal is to provide a highly adaptable infrastructure that is responsive to the evolving needs of the Earth science community. Table 4-1 highlights ECS architectural features and associated benefits that facilitate evolution of EOSDIS.

Table 4-1. Highlights of ECS Architectural Features that Facilitate Evolution.
ECS Architectural Features / Benefits that Facilitate Evolution

Feature: Distributed search and access
  - Inter- and intra-DAAC services
  - Integration of additional DAAC-unique and user supplied search services
Benefit: One-stop data search and access

Feature: Advertisements of new services
  - New and modified product sets
  - Submitted by producers
Benefit: Extensible product set

Feature: Logical collections
  - Data modeled on Earth Science data taxonomies
  - Context and relationships captured for items within a collection
Benefit: Information-rich logical data collections

Feature: Integration of investigator tools
  - Interoperability via standard data and control exchange protocols
Benefit: Integration of independent investigator tools

Feature: Transparent access to system-wide resources
  - ECS Client applications
  - Advertising and Subscriptions Services
Benefit: Transparency and location independence

Feature: Search and access across multiple heterogeneous servers
  - Multiple protocols, including WAIS and WWW
Benefit: State-of-the-art protocols

Feature: Users can combine ECS services in arbitrary ways
  - Defined interfaces and associated protocols
Benefit: User-tailorable services

Feature: Interoperability with heterogeneous systems
  - Standard or negotiated browse and data retrieval formats and protocols
Benefit: International interoperability

Feature: Layered services
  - Minimize impact of technology insertion
  - Graceful evolution
Benefit: Facilitate technology upgrades

Feature: User access to core system services
  - API toolkit
  - Customize user interface
Benefit: Build favorite user interface

Feature: Toolkits
  - Software libraries, documentation, configuration tools, and APIs
  - Develop, insert and test new methods at their local site
Benefit: Local development and testing of new methods

To facilitate evolution, a broad range of strategies was used in defining the ECS architecture, such as isolating changes to 'affordable' component replacements or modifications and separating fast changing components, such as user interface browsers, from slowly changing components, such as the archive storage.
Other architecture features that support evolution include location and access transparency, as well as avoidance of dependence on vendor computer hardware and system software. Use of layered, hierarchical services allows the application of state-of-the-art client/server topologies.

Application Programming Interfaces (APIs) are sets of library routines between ECS components that allow program-wide access to particular ECS services. APIs are used to accommodate development of DAAC-unique interfaces, such as special subsetting capabilities and new search techniques, which can be advertised as value added services. APIs allow scientists to apply user-supplied methods to manipulate ECS data products, either from within an ECS query or from a user's own computing facility as a machine-to-machine process.

The above architectural features, in addition to object-oriented design methodology, result in a design that is state-of-the-art in the use of technology and adaptable to evolutionary change. Table 4-2 provides a list of likely near term or long term evolutionary enhancements as they apply to the subsystems that make up the core of the ECS architecture.

Table 4-2. Likely Near and Long Term Evolutionary Enhancements For ECS

ECS Subsystem / Key Capabilities and Technologies

Client
  • Enhanced Multi-Media Based Interfaces
  • Hyper-Media Application Integration
  • Client Access Outside of ECS Data Providers (GCDIS, UserDIS access)

Interoperability
  • Support of Additional Data Providers (part of core design - anticipated)
  • Support of Additional Data Collections (part of core design - anticipated)
  • Advanced "Middle-Ware" Standards (e.g. CORBA, see communications)

Data Management
  • Object Oriented Information Models
  • Distributed Query Optimization Engines
  • Distributed DBMS COTS

Data Server
  • Advanced Query Languages / Protocols (e.g. SQL3 compatibility)
  • DBMS Technologies (e.g. OODBMS, ORDBMS, etc.)
  • Advanced Media Types and Storage Formats (TB range, denser, cheaper)
  • Use of Advanced Distributed File Systems (e.g. MR-AFS, etc., see communications)
  • Higher Data Rate I/O Channels
  • Advanced Data Staging Hardware (new multi-level RAID, denser, cheaper)
  • Multi-FSMS
  • DBMS and FSMS/HSM Integration
  • Advanced Document Data Standards and Access Methods
  • User Definable Data Collections (links)
  • New Data Types and Collections (part of core DS design - anticipated)

Data Ingest (many Data Server items apply equally to Ingest)
  • New Data Types and Formats
  • New Data Providers and External Interfaces

Planning
  • DBMS Technologies (e.g. OODBMS, ORDBMS, etc.)
  • Advanced Planning COTS
  • New Standards in Process Management, Dispatch and Control

Data Processing
  • Distributed Processing Protocols, Control and Management
  • DBMS Technologies (e.g. OODBMS, ORDBMS, etc.)
  • Processing Advances (faster/cheaper: workstations, clusters, SMPs, MPPs, etc.)
  • Higher Data Rate I/O Channels
  • Advanced Data Staging Hardware (new multi-level RAID, denser, cheaper)
  • Compilers & Languages

Communications Infrastructure
  • Migration to Advanced "Middle-Ware" and O/S Infrastructures (e.g. CORBA)
  • Extended O/S Support (Including Multi-Media Support, advanced dist. file systems, etc.)
  • Gbps/Multi-Gbps LANs, WANs, etc.
  • Switched Channel Technologies
  • Advanced Protocols

System Management
  • Increased Automation in Enterprise Management Software Suites
  • Simple Network Management Protocol (SNMP) to True Object Management Migration
  • Enhanced Billing and Accounting
  • Extended Management Domains and Management Information Bases (MIBs)

4.4 Multi-Track Development

The ECS Team is developing ECS using a multi-track development approach that includes 1) development of a portion of ECS on an incremental track and 2) parallel development of the remainder of ECS on a formal track using the traditional waterfall development methodology.
The two primary drivers for development on the incremental track are volatility of user-sensitive requirements and commercial-off-the-shelf (COTS) intensive integration. To accelerate and accommodate early user feedback, a delivery mechanism called an Evaluation Package (EP) was devised to put incremental developments and selected prototypes in the hands of distributed users for evaluation and design iteration significantly in advance of formal track releases. Another purpose of the Evaluation Package is early integration of COTS hardware and software in order to evaluate advertised capabilities of commercial vendors.

The key to successful development on the incremental track is to provide structure without creating an administrative overload that removes the freedom to react to changes in objectives and design dictated by emerging circumstances, such as programmatic changes or technology evolution. To meet this challenge, we have adopted an Evaluation Package life cycle that merges selected practices from more traditional engineering methods with rapid prototyping methodology. For example, an objectives review is held with NASA and science community representatives at the beginning of each Evaluation Package to establish a common understanding of design and evaluation goals. Other reviews with NASA and science community representatives include design, test readiness, consent to ship, and final readiness reviews. At each review, status and lessons learned are discussed and changes are incorporated based on feedback from review attendees. Mockups, early prototyping, and in-process demonstrations give reviewers progressively better insight into the planned Evaluation Package functionality. After final integration and testing, the Evaluation Package software is distributed to the DAACs and other evaluator sites for a multi-week evaluation.
Also, science community representatives are invited to participate in structured usability testing at the ECS development facility in Landover, MD. Usability testing is an efficient and low cost method of testing and quantifying ease of use. Experience to date indicates that the minimum time to produce meaningful content in an Evaluation Package is about 6 months, and that evaluation will require an additional 2 months, including time for data analysis and results sharing.

4.5 Managing Technology Insertion

Technology insertion must be carried out in a manner that ensures operational and functional compatibility, uninterrupted service to users, and smooth transitions between releases. The ECS team uses a variety of processes and facilities to manage technology insertion in a way that minimizes impact on production operations.

ECS utilizes an object oriented design methodology that focuses on encapsulation of functions and data. This approach helps to stabilize interfaces and enables component changes to be made with minimal impact on the remainder of the system. During the design phase for each release, new functionality is assessed for impact on existing operational capabilities, and a migration strategy is developed if necessary. A transition plan is created for each new release that defines all activities that must be performed to migrate from the previous release. This includes: integration and test and regression testing at the ECS development facility; onsite testing at each DAAC; training of operations and user support staffs on the new capabilities; and the steps involved in executing the cutover from the old to the new release.

A portion of the hardware and software configuration at each DAAC is dedicated to supporting nonproduction activities such as science algorithm integration and test and new ECS release integration and test. This enables testing of new technology and evolutionary enhancements in parallel with production operations.
This testing will include standalone site testing as well as system testing that involves multiple sites. Cutover to the new release is coordinated by the System Monitoring and Coordination (SMC) function of ECS. System-wide use of a common configuration management tool ensures that changes are introduced into the production environment in a controlled manner.

Suggested Reading

ECS, 1995: 301-CD-002-003, System Implementation Plan for the ECS Project (1/95). URL: http://edhs1.gsfc.nasa.gov
Elkington, M., 1994: 19400131 Defining the Architectural Development of EOSDIS. URL: http://edhs1.gsfc.nasa.gov
NASA, 1994: EOSDIS V0 IMS Home Page. URL: http://harp.gsfc.nasa.gov:1729/eosdis_documents/eosdis_home.html/
Schwaller, M., 1993: Science Data Plan for EOSDIS Covering EOSDIS Version 0 and Beyond. NASA/GSFC. Anonymous ftp host: eos.nasa.gov. Change directory to /EosDis/Daacs/Docs/ScienceDataPlan.

5. Science Community Involvement

Development of science-based services requires science user involvement, and users have directly influenced development of ECS through several processes. Interaction with users occurs across a broad range of EOS organizations including the Investigator Working Group, the EOSDIS Data Panel, ECS science advisors, technical working groups, DAAC User Working Groups, and the general science community. Users participate in major design reviews; critique design considerations; collaborate in working group sessions; and submit on-line suggestions. The following sections describe how users are involved in ECS development at various technical levels, and how user feedback and suggestions have influenced the direction of ECS.

5.1 Users Involvement at Technical Levels

A broad spectrum of the user community provides guidance at all technical levels from high level policy decisions to prototyping.
The highest level of guidance is provided by the Investigator Working Group and the associated panels, particularly the EOSDIS Data Panel, which is composed of prominent scientists with strong data management backgrounds. The Data Panel meets quarterly to review EOSDIS policy, plans, and design, and provides recommendations on where EOSDIS should focus in the near and long term. Recommendations are reviewed, analyzed, and incorporated into EOSDIS planning under NASA technical direction.

Guidance at the technical level is also provided by the ECS Science Advisors. These science community representatives, sometimes referred to as ECS "tire-kickers", participate in reviews, evaluate ECS prototypes and evaluation packages, and provide technical comments on various issues as requested by NASA. The Science Advisors produce recommendations, evaluations, and feedback about details of the technical design and implementation that are incorporated into future design changes.

Technical working groups are made up of members of the science community who work closely with NASA and ECS team members to address specific technical challenges. For example, the Ad Hoc Working Group on Production was formed to address issues associated with data processing, storage, and distribution capacities of EOSDIS. The Data Modeling Working Group was formed to address issues on information architecture and modeling. Technical working groups are formed as needed and include NASA, science community, and ECS representatives.

The DAAC User Working Groups consist of science users who have expertise in, or interact with, a specific DAAC. They provide guidance on DAAC plans, budgets, and operations. The DAAC Scientist is the primary interface between ECS and the DAAC User Working Groups.

A broad spectrum of the science community is represented at the ECS reviews, where guidance is provided at all technical levels.
Reviews focus on everything from EOS policy to specific implementation of design; Earth science community members are invited to review and comment on issues presented. Critiques are submitted by science community members, and ECS responds to each critique by stating how the issue will be handled in the future. This process helps to guide work at all technical levels, and requires that ECS be responsive to the science user.

5.2 Influence of User Feedback on EOSDIS

The direction and design of EOSDIS have been strongly influenced by the user community, beginning with Phase B design studies. Recommendations by the EOSDIS Data Panel during Phase B led to the adoption of a distributed architecture with multiple DAAC providers. Critique by the National Research Council and the EOSDIS Data Panel of the ECS architecture presented at the System Requirements Review in September 1993 resulted in a revised architecture that is now more open and evolutionary.

Specific technical characteristics of ECS have been influenced through ongoing technical interactions, disposition of review and documentation critiques, and the general user suggestion process. Working groups, for instance, have improved design aspects of ECS data processing, storage, and distribution. They also helped define metadata standards and browse packaging.

Over 2,000 critiques have been received from members of the science community and others who attended major design reviews. Because ECS developers answer critiques while they are developing technical plans for the next level of design, many of the suggestions have been incorporated into those plans. For instance, suggestions that ECS find an efficient means of serving the non-science community led to the development of the value added service provider concept.
ECS also provides scientists with an 'on-line' suggestion box for collecting, reviewing, and dispositioning user recommendations for the design of ECS. To date, over 700 recommendations have been entered into the system: 78% either match existing requirements or are being factored into ECS design considerations, 17% are in evaluation, and 5% were rejected because of technical or cost considerations. These entries have varied from questions and comments about the ECS design to feedback on prototyping to recommendations on new design requirements. An example of the way users have affected the ECS design is the addition of a new requirement to support coincident searching. After review, assessment, and approval, ECS now has a requirement to "search across multiple data sets for coincident occurrences of data in space and/or time and any other attribute(s) of metadata."

Currently, user recommendations are being evaluated for incorporation into the ECS design as part of the lower-level requirements definition process. Many of the recommendations are suggestions on how the user community would like to use the system. They illustrate typical operations the user will need, enhancements of Version 0, or special options which would ease the user's job of analyzing the science data. This feedback is proving to be an invaluable resource for optimizing the ECS design.

The many ways in which users can influence the design of ECS have resulted in a system design that has evolved from user feedback. As design and development proceed, these processes will continue to influence the development of a system that will meet user needs.

5.3 Influence of Community Feedback on GCDIS and UserDIS Vision

Community influence on the ECS Architecture and System design relating to a broader GCDIS and UserDIS vision began formally with the 1993 review of NASA's EOSDIS program by the National Research Council (NRC).
The NRC expressed several concerns regarding the proposed system architecture and its suitability to contribute towards a general multi-agency Global Change Data and Information System (GCDIS) and an expanded version open to general earth science data providers and users (UserDIS). In particular, the NRC was concerned about:
  • the separability of EOSDIS Core System (ECS) components for reuse in GCDIS,
  • the portability of the system to other environments, and
  • the evolvability of the system over the long term of the global change research program.

Among its recommendations, the NRC suggested that EOSDIS be designed such that all users (EOS, non-EOS investigators, DAACs, other data centers) can easily build selectively on top of EOSDIS components without constraining local implementation of diverse functions by users and DAACs, and such that interaction with EOSDIS leads to a minimum loss of autonomy. The primary influence of the extended user community on the ECS system design was summarized by the NRC expectation that: "by carefully choosing the appropriate architectural approach, there are components of GCDIS/UserDIS which ECS could contribute without leaving its mission envelope and without a lot of additional cost".

Toward this end, instead of simply proposing what changes should be made to EOSDIS to enable it to support GCDIS/UserDIS, we decided that a better approach would be to specify a generalized data and information system architecture concept. This architectural and system design concept provides guidance for the ECS development team such that ECS software components can play an important role in the development of GCDIS/UserDIS. The key tenets of the GCDIS and UserDIS vision as they influence the ECS design include:
  • Data providers must have complete freedom of choice as to how they wish to organize data into types.
  • The system design must support the inclusion of legacy systems into the network.
  • The architecture must provide an open ended approach to earth science data search and retrieval.
  • The architecture must make no principal distinction between various levels of earth science metadata (e.g., directory versus inventory), or between metadata and data.
  • The system design approach must encourage competitive and collaborative development.

Complete solutions to goals of this kind are outside the scope of an architecture. They depend on the cooperation of service providers which, in a network like UserDIS, is voluntary. However, the architecture can include measures to facilitate the solutions. The ECS architecture, as influenced by the broader user community, will provide mechanisms for characterizing situations where standards or conventions exist and are being followed.

Suggested Reading

Corbell, M., 1994: MR9407V3 User Environment Definition for the ECS Project. URL: http://edhs1.gsfc.nasa.gov
Tyahla, L., 1994: 19400313 ECS User Characterization Methodology and Results. URL: http://edhs1.gsfc.nasa.gov
Tyahla, L., 1994: 19400548 User Scenario Functional Analysis. URL: http://edhs1.gsfc.nasa.gov
West, S., 1994: 19400549 ECS Scientist User Survey (ESUS). URL: http://edhs1.gsfc.nasa.gov
Wingo, T., 1994: 19400312 User Characterization and Requirements Analysis. URL: http://edhs1.gsfc.nasa.gov

[1] 1 petabyte = 1,000 terabytes
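As a rough check on the scale figures quoted in Section 3.7, the short calculation below converts them into sustained rates and transfer times, using decimal units as in the footnote (1 petabyte = 1,000 terabytes). The function names and the assumption of a single 1 gigabit/sec link running continuously at full rate are illustrative only and do not come from the paper.

```python
# Decimal units, following the footnote (1 petabyte = 1,000 terabytes).
GB = 10**9    # bytes
TB = 10**12
PB = 10**15

SECONDS_PER_DAY = 86_400


def sustained_rate_mbps(bytes_per_day: float) -> float:
    """Average rate in megabits/sec needed to move a given daily volume."""
    return bytes_per_day * 8 / SECONDS_PER_DAY / 1e6


def transfer_days(volume_bytes: float, link_gbps: float) -> float:
    """Days needed to move a volume over a link running continuously at full rate."""
    return volume_bytes * 8 / (link_gbps * 1e9) / SECONDS_PER_DAY


raw_ingest = sustained_rate_mbps(200 * GB)   # raw EOS instrument data, mid-1999
eos_vs_other = (500 * TB) / (29 * TB)        # EOS archive vs. other NASA data, mid-1999
pb_at_1gbps = transfer_days(1 * PB, 1.0)     # assumed: one petabyte over a 1 gigabit/sec link

print(f"{raw_ingest:.1f} Mbit/s, {eos_vs_other:.1f}x, {pb_at_1gbps:.1f} days")
# prints "18.5 Mbit/s, 17.2x, 92.6 days"
```

Under these assumptions, moving a full petabyte over a single 1 gigabit/sec link would take roughly three months, which is consistent with the statement that petabyte-scale archives call for gigabit-class network bandwidth.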