STANDARDS CHANGE REQUEST
========================

Purpose: Provide the ability for data providers to "quickly" define keywords on a level more suited to specific data products.
Date: 2003-04-17
Submitted by: R.Joyner
Working Group: A.Raugh (lead), S.Hughes, J.Wilf, R.Alanis, T.King, L.Huber, R.Simpson, J.Zender

Background
==========

Adopting keywords into the PDS data dictionary typically requires a fair amount of negotiation between CN and the group (DN and/or the instrument team) proposing the new keyword(s). The negotiation centers on ensuring that the keyword is accurately defined, does not conflict with an existing keyword, complies with the formation and class-word rules, uses the proper set of units, etc. The group reviewing the proposed keyword is also responsible for providing a keyword definition that has a long-term scope of use across missions, instruments, and data products.

While this process can be tedious and time consuming, it does ensure that the set of approved keywords adheres to a level of "quality assurance".

The PDS has come to realize that the "quality assurance" aspect can be a deterrent to expediting the incorporation of new keywords into the CN catalog. In addition, there is a set of keywords whose scope of use does not warrant the rigors of ensuring complete compliance (i.e., the scope is limited / local to a single mission, a small set of data products, or is so specific that only a very few data providers would make use of the keyword).

Current Urgency
===============

Several missions (e.g., Cassini, MER, Rosetta, and Stardust) are currently proposing a fair number of mission-specific keywords that are so limited in scope that the PDS needs to adopt a new process / mechanism for expeditiously incorporating these keywords into the PDS data dictionary.
The above missions are expecting these keywords to be available for their use in the coming months and expect this to occur with minimal impact to their schedule for writing labels.

The Stardust mission has already defined mission-specific keywords that it is using in archived products and that have not been approved by the PDS. Stardust has designated these keywords as "local" to the mission with the understanding (from the DN) that they may be at risk.

This SCR presents a detailed description of the changes necessary for, and the impacts of, the PDS adopting Local Data Dictionaries.

Recommendations
===============

The working group recommends the following actions:

1. Modify the PDS Data Dictionary to allow for defining local keywords.

2. Modify the current version of the PDS Standards Reference to define:
   - the conditions under which a keyword can be considered for use within a local data dictionary.
   - the method for specifying a locally-defined keyword in a label.
   - verification / validation of locally-defined keywords.

3. Modify the following tools in the PDS tool suite to accommodate the validation of locally-defined keywords in a label:
   - LVTOOL
   - KWVTOOL
   - DDICT
   Note: The Perl tool suite will not have to be changed as it uses the most recent version of LVTOOL.exe.

4. The rewrite of the Data Preparation Workbook (when it occurs) will need to address the use of local keywords in an archive product. The set of changes to the DPW is not addressed within this SCR. It will be the responsibility of the DPW scrivener to accurately document the Local Data Dictionary processes / mechanisms.

5. The modifications outlined above and the impact statement that follows should be approved and scheduled for implementation as outlined.
Changes to the PDS Data Dictionary
==================================

The following are the changes to the PDS Data Dictionary (PSDD) required to support the use of locally-defined keywords in an archive product:

(a) Redefine the set of standard values for STATUS_TYPE to include a status for locally-defined keywords.
(b) Define the NAMESPACE_ID data element and the appropriate set of standard values for the keyword.

Background
----------

The principal enhancement to the PSDD is the ability to designate a keyword as being "local" to a unique "namespace". Typically the "namespace" to which the keyword is designated will be the mission that proposed the keyword.

Definition of namespace: A namespace uniquely identifies a set of elements such that there is no ambiguity between objects having identical names but different origins.

A new attribute, namespace, will be added to the PSDD. This attribute delineates the unique namespace to which a keyword is designated (i.e., it identifies the organization which proposed and defined the keyword).

Currently, the PSDD requires that each element (keyword) name be unique. The PSDD will be changed to require only that each element (keyword) be unique with respect to the combination of element name and namespace.

For example, TARGET_NAME is currently defined within the PSDD as being a "common / global" keyword. A mission could decide that it wants to use a different definition of TARGET_NAME. The two definitions would co-exist within the PSDD, each identified under a different namespace.

Example:

  TARGET_NAME = "EARTH"          (namespace = PSDD)
  CASSINI:TARGET_NAME = "EARTH"  (namespace = CASSINI)
  VOYAGER:TARGET_NAME = "MARS"   (namespace = VOYAGER)

In the above example, the PSDD would contain three separate instances of the TARGET_NAME keyword:

(a) the common (PSDD) instance which the PDS defined and which the PDS community at large agreed upon.
(b) the CASSINI instance which the Cassini project defined.
(c) the VOYAGER instance which the Voyager project defined.

Note: Cassini is currently proposing some 200+ keywords, of which approximately 100+ have been identified as being appropriate for inclusion in the PSDD as local to Cassini. Each locally-defined keyword would be identified as CASSINI:keyword_name.

OBJECT = ELEMENT_DEFINITION
  NAME = NAME
  DESCRIPTION = "
    The name data element indicates a literal value representing the common term used to identify an element or object. See also: 'id'.

    Note: In the PDS data dictionary, if the name identifier is prepended with a namespace identifier (e.g., CASSINI:TARGET_NAME), then the name identifier is restricted to 61 characters where the name identifier and the namespace identifiers are each restricted to 30 characters and are separated by a colon (for a total maximum length of 61 characters). The name identifier and its component parts must conform to PDS nomenclature standards.

    If the name identifier is used without a namespace identifier (e.g., TARGET_NAME), then the name identifier is restricted to 60 characters, and must conform to PDS nomenclature standards."
  GENERAL_DATA_TYPE = "CHARACTER"
  MAXIMUM = "N/A"
  MINIMUM = "N/A"
  MAXIMUM_LENGTH = "61"
  MINIMUM_LENGTH = "N/A"
  STANDARD_VALUE_TYPE = "DYNAMIC"
  STANDARD_VALUE_SET_DESC = ""
  KEYWORD_DEFAULT_VALUE = ""
  UNIT_ID = NONE
  FORMATION_RULE_DESC = ""
  SOURCE_NAME = "PDS CN/J.S. Hughes"
  SYSTEM_CLASSIFICATION_ID = "COMMON"
  GENERAL_CLASSIFICATION_TYPE = "DATASET STRUCTURE"
  CHANGE_DATE = "..."
  STATUS_TYPE = "PENDING"
  STANDARD_VALUE_OUTPUT_FLAG = "Y"
  TEXT_FLAG = "N"
  BL_NAME = "name"
  TERSE_NAME = "name"
  SQL_FORMAT = "CHAR(61)"
  BL_SQL_FORMAT = "char(61)"
  DISPLAY_FORMAT = "JUSTLEFT"
  AVAILABLE_VALUE_TYPE = ""
END_OBJECT = ELEMENT_DEFINITION
END

OBJECT = ELEMENT_DEFINITION
  NAME = STATUS_TYPE
  DESCRIPTION = "
    The STATUS_TYPE element indicates one of a fixed number of statuses that can describe a particular data element or object. Examples: PENDING, APPROVED."
  GENERAL_DATA_TYPE = "CHARACTER"
  MAXIMUM =
  MINIMUM =
  MAXIMUM_LENGTH =
  MINIMUM_LENGTH =
  STANDARD_VALUE_TYPE = STATIC
  STANDARD_VALUE_SET = { APPROVED, LOCALLY-APPROVED, OBSOLETE, PENDING }
  STANDARD_VALUE_SET_DESC = "
    APPROVED -> Has been reviewed and accepted by the appropriate committees and entered into the Planetary Science Data Dictionary.
    LOCALLY_APPROVED -> Has been reviewed and accepted by the lead-node governing the use of local data elements and has been entered into the Planetary Science Data Dictionary as a locally-defined data element.
    OBSOLETE -> Has been designated obsolete by the appropriate review committee and should not be used in future archive products.
    PENDING -> Is currently in the review process. "
  KEYWORD_DEFAULT_VALUE = 1
  UNIT_ID = NONE
  FORMATION_RULE_DESC =
  SOURCE_NAME = "PDS CN/J.S. Hughes"
  SYSTEM_CLASSIFICATION_ID = PDS/CN
  GENERAL_CLASSIFICATION_TYPE = SYSTEM
  CHANGE_DATE = "..."
  STATUS_TYPE = "PENDING"
  STANDARD_VALUE_OUTPUT_FLAG = "Y"
  TEXT_FLAG = "N"
  BL_NAME = "statustype"
  TERSE_NAME = ""
  SQL_FORMAT = "CHAR(13)"
  BL_SQL_FORMAT = "char(13)"
  DISPLAY_FORMAT = "JUSTLEFT"
  AVAILABLE_VALUE_TYPE = ""
END_OBJECT = ELEMENT_DEFINITION
END

OBJECT = ELEMENT_DEFINITION
  NAME = NAMESPACE_ID
  DESCRIPTION = "
    The NAMESPACE_ID element uniquely identifies a set of elements such that there is no ambiguity between elements having identical names but different origins. The NAMESPACE_ID is limited in scope to identifying observing campaigns / missions."
  GENERAL_DATA_TYPE = "CHARACTER"
  MAXIMUM = "N/A"
  MINIMUM = "N/A"
  MAXIMUM_LENGTH = "30"
  MINIMUM_LENGTH = "N/A"
  STANDARD_VALUE_TYPE = STATIC
  STANDARD_VALUE_SET = { PDSDD, CASSINI, MARS-OBSERVER }
  STANDARD_VALUE_SET_DESC = ""
  KEYWORD_DEFAULT_VALUE = 1
  UNIT_ID = NONE
  FORMATION_RULE_DESC =
  SOURCE_NAME = "PDS CN/R. Joyner"
  SYSTEM_CLASSIFICATION_ID = PDS/CN
  GENERAL_CLASSIFICATION_TYPE = SYSTEM
  CHANGE_DATE = "..."
  STATUS_TYPE = "PENDING"
  STANDARD_VALUE_OUTPUT_FLAG = "Y"
  TEXT_FLAG = "N"
  BL_NAME = "namespace"
  TERSE_NAME = ""
  SQL_FORMAT = "CHAR(20)"
  BL_SQL_FORMAT = "char(20)"
  DISPLAY_FORMAT = "JUSTLEFT"
  AVAILABLE_VALUE_TYPE = ""
END_OBJECT = ELEMENT_DEFINITION
END

Note: The standard values for the NAMESPACE_ID keyword are static in the sense that they exist in a well-defined and fixed set. Additional standard values will be added when a mission proposes new locally-defined keywords.

Changes to the Standards Reference
==================================

The following are the changes to the PDS Standards Reference required to support the use of locally-defined keywords in an archive product.

Section 5.5 - Locally-defined Data Elements
--------------------------------------------

The PSDD contains a large set of common (global) data elements (keywords) and small sets of locally-defined data elements. The common data elements are available for use in any label. Locally-defined data elements may only be used in data product labels.

Section 5.5.1 - Justification for Locally-defined Data Elements
----------------------------------------------------------------

A mission may create a locally-defined keyword in either of two cases:

(a) the scope of use is limited / local to a single mission, a small set of data products, or is so specific that only a very few data providers would make use of the data element (keyword).

Examples of data elements in the PSDD having limited scope:

MAXIMUM_B1950_RING_LONGITUDE [PDS-RINGS]
----------------------------------------
The maximum_B1950_ring_longitude element specifies the maximum inertial longitude within a ring area relative to the B1950 prime meridian, rather than to the J2000 prime meridian. The prime meridian is the ascending node of the planet's invariable plane on the Earth's mean equator of B1950.
Longitudes are measured in the direction of orbital motion along the planet's invariable plane to the ring's ascending node, and thence along the ring plane. Note: For areas that cross the prime meridian, the maximum ring longitude will have a value less than the minimum ring longitude.

INSTRUMENT_FORMATTED_DESC [PDS-CN]
----------------------------------
The instrument_formatted_desc element contains the formatted instrument descriptions. These descriptions represent the information collected for the PDS Version 1.0 instrument model and were created by extracting instrument information from several tables in the catalog data base. These descriptions represent an archive since the tables have been eliminated as part of the catalog streamlining task.

DATA_SET_LOCAL_ID [PDS-SBN]
---------------------------
The DATA_SET_LOCAL_ID element provides a short (of order 3 characters) acronym used as the local ID of a data set (Example value: IGLC). It may also appear as the first element of file names from a particular DATA_SET (Example value: IGLCINDX.LBL).

(b) the common instance, and any other local instances, currently defined in the PSDD are inadequate in some descriptive capacity:
    - the data element definition is too restrictive or inappropriate
    - the length of the keyword-value is too short
    - the units are of a different type

A possible scenario for the above could be that the Cassini mission wants to use the DATA_QUALITY_ID keyword.

DATA_QUALITY_ID [PSDD] - CHAR(3)
--------------------------------------
The data_quality_id element provides a numeric key which identifies the quality of data available for a particular time period. The data_quality_id scheme is unique to a given instrument and is described by the associated data_quality_desc element.

However, the Cassini mission wants to re-use the data element in a way that is different from the instance(s) currently defined in the PSDD.
DATA_QUALITY_ID [CASSINI] - CHAR(50)
--------------------------------------
The data_quality_id element provides a short acronym or identifier of the qualitative state in which the data resided when the data was generated by the instrument team. The data_quality_id is unique to the Cassini mission and is described by the associated data_quality_desc element.

Section 5.5.2 - Identification of Locally-defined Data Elements
----------------------------------------------------------------

Locally-defined instances of data elements (keywords) are identified in data product labels as:

  namespace_id:keyword_name

where

  namespace_id is the unique namespace to which the keyword is designated.
  keyword_name is the name of the keyword being included in the data product label.

If there are multiple instances of a keyword, then the specific instance in use is identified as follows:

Example:

  TARGET_NAME = "EARTH"          (namespace = PSDD)
  CASSINI:TARGET_NAME = "EARTH"  (namespace = CASSINI)
  VOYAGER:TARGET_NAME = "MARS"   (namespace = VOYAGER)

In the above example, the PSDD contains three separate instances of the TARGET_NAME keyword:

(a) the common (PSDD) instance which the PDS defined and which the PDS community at large agreed upon.
(b) the CASSINI instance which the Cassini project defined.
(c) the VOYAGER instance which the Voyager project defined.

Note: Once incorporated into the PSDD as a local instance of a keyword, the keyword may be used by any data provider associated with any mission or organization (e.g., PDS-GEO-MGN, PDS-RINGS, MARS OBSERVER, ISIS, etc).

Section 5.5.3 - Review of Locally-defined Data Elements
--------------------------------------------------------

Each locally-defined data element (keyword) must be reviewed and approved by the PDS prior to use in a data product label. This ensures that each keyword conforms to the syntactic and semantic PDS specifications. Locally-defined keywords are approved by the PDS lead-node for the mission that is proposing the keyword.
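
The identification rule in Section 5.5.2, combined with the length limits stated in the NAME element note (30 characters for each part, separated by a colon), can be illustrated with a short sketch. This is only an illustration in Python and is not part of any PDS tool; the function name split_keyword and the constant MAX_PART_LENGTH are hypothetical, and the sketch does not check PDS nomenclature rules or the length of identifiers that have no namespace.

```python
# Illustrative sketch only; names here are hypothetical, not PDS software.
MAX_PART_LENGTH = 30  # assumed from the NAME note: each part is at most 30 characters


def split_keyword(identifier):
    """Split 'CASSINI:TARGET_NAME' into ('CASSINI', 'TARGET_NAME').

    An identifier without a colon is returned with namespace None,
    which stands for the common (global) instance.
    """
    namespace, sep, name = identifier.partition(":")
    if not sep:
        # No colon present: this is a common keyword.
        return None, identifier
    if not namespace or not name or ":" in name:
        raise ValueError(f"malformed identifier: {identifier!r}")
    if len(namespace) > MAX_PART_LENGTH or len(name) > MAX_PART_LENGTH:
        raise ValueError(
            f"identifier part exceeds {MAX_PART_LENGTH} characters: {identifier!r}"
        )
    return namespace, name


if __name__ == "__main__":
    for keyword in ("TARGET_NAME", "CASSINI:TARGET_NAME", "VOYAGER:TARGET_NAME"):
        print(keyword, "->", split_keyword(keyword))
```

Returning None for the common namespace is a design choice of the sketch; a real tool might instead map unqualified keywords to an explicit namespace value.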
Section 19.3.3.2 - DOCUMENT Subdirectory
----------------------------------------

Add:

Data Dictionary Files [OPTIONAL]

The data dictionary consists of two files, PDSDD.FUL and PDSDD.IDX. The PDSDD.FUL file identifies and describes the data object and data element definitions contained in the Planetary Science Data Dictionary (PSDD). The PDSDD.IDX file is an index of PDSDD.FUL and is currently used by the PDS validation tools to quickly locate individual elements in the PSDD.

These files are human-readable ASCII text and allow (future) users to ascertain the data object and data element definitions used within the PDS at the time that the archive product was produced.

The above files are required if locally-defined data elements are used in the archive product, and are recommended if the archive product does not use locally-defined data elements.

The PDSDD.FUL and PDSDD.IDX files can be labeled using either the TEXT or ASCII_DOCUMENT objects.

Example: PDSDD.LBL

  PDS_VERSION_ID = PDS3
  RECORD_TYPE = STREAM
  ^FUL_TEXT = "PDSDD.FUL"
  ^IDX_TEXT = "PDSDD.IDX"
  OBJECT = FUL_TEXT
    PUBLICATION_DATE = 2003-12-31
  END_OBJECT = FUL_TEXT
  OBJECT = IDX_TEXT
    PUBLICATION_DATE = 2003-12-31
  END_OBJECT = IDX_TEXT
  END

Section 12.2.3 - ODL Character Set - Special Characters
-------------------------------------------------------

: Colon

The colon is used in attribute assignment statements to separate a namespace_identifier from an attribute_identifier (see Section 12.4.2). The colon is also used to separate hours, minutes, and seconds within a time value.

Section 12.3.4.2 - Namespace Identifier
---------------------------------------

When a namespace_identifier is prepended to the element_identifier, it indicates that the element_identifier has a local definition within the context indicated by the namespace_identifier.
Examples - assignment statements

  RECORD_BYTES = 800
  TARGET_NAME = JUPITER
  SOLAR_LATITUDE = (0.25 , 3.00 )
  FILTER_NAME = { RED, GREEN, BLUE }

Examples - assignment statements that use namespace_identifier

  CASSINI:TARGET_NAME = JUPITER
  MRO:SOLAR_LATITUDE = (0.25 , 3.00 )
  VOYAGER:FILTER_NAME = { RED, GREEN, BLUE }

Section 12.4.2 - Attribute Assignment Statement
-----------------------------------------------

The attribute assignment statement is the most common type of statement in ODL and is used to specify the value for an attribute of an object. The value may be a single scalar value, an ordered sequence of values, or an unordered set of values. The attribute assignment statement may optionally contain a namespace_identifier.

  assignment_statement ::= attribute_identifier = value

where

  attribute_identifier ::= element_identifier | namespace_identifier:element_identifier

Examples - assignment statements

  RECORD_BYTES = 800
  TARGET_NAME = JUPITER
  SOLAR_LATITUDE = (0.25 , 3.00 )
  FILTER_NAME = { RED, GREEN, BLUE }

Examples - assignment statements that use namespace_identifier

  CASSINI:TARGET_NAME = JUPITER
  MRO:SOLAR_LATITUDE = (0.25 , 3.00 )
  VOYAGER:FILTER_NAME = { RED, GREEN, BLUE }

Changes to the PDS Tool Suite
=============================

The following are the enhancements to the individual validation software tools required to support the use of a Local Data Dictionary in validating an archive product. CN estimates that the development will require approximately 1.5 man-months of effort.

- LVTOOL
  - recognize and differentiate between PDS common / global data elements and locally-defined data elements. This will affect how keywords are validated in that the PSDD will have to be searched by two attributes (instead of one):
    - does the keyword (e.g., DATA_QUALITY_ID) exist in the PSDD.
    - does an instance of the keyword (e.g., CASSINI:DATA_QUALITY_ID) exist within the designated mission-specific namespace.
  - report a warning if the keyword_status is obsolete.
- KWVTOOL
  - recognize and differentiate between PDS common / global data elements and locally-defined data elements. This will affect how keywords specified in the data dictionary are used to generate the report file which lists the unique keywords and keyword-values used in an archive product.

- DDICT
  - use the set of keywords specified in the local data dictionary file to generate the report file which lists the set of unique keywords and keyword-definitions used in the archive product.

- New tool
  - a web-based application that will allow a PDS-approved user to (in real time):
    (a) define a locally-defined data element (keyword).
    (b) ingest the keyword into the PSDD.
    (c) download the current PSDD (i.e., the PDSDD.FUL and PDSDD.IDX containing the set of keywords that the user just ingested).

Impact Statement
================

The following describes the impacts on the PDS of incorporating Local Data Dictionaries:

The positive impacts of this SCR greatly outweigh the negative. Adopting the use of locally-defined keywords will allow the PDS to be more flexible and responsive in adopting new keywords proposed by the instrument teams. The overall effect should be that the instrument teams produce more comprehensive documentation / labelled data while spending minimal time and effort on negotiating the addition of new keywords.

Operational Impacts
-------------------

The operational impacts are identified in the Changes to the Standards Reference section of this SCR.

Software Tools
--------------

The impacts to the software tools are identified in the Changes to the PDS Tool Suite section of this SCR.
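
As a rough illustration of the two-attribute search described for LVTOOL in the Changes to the PDS Tool Suite section, the sketch below looks up a label keyword first by element name and then by namespace, and reports a warning for obsolete entries. It is written in Python purely for illustration and does not reflect how LVTOOL is implemented. The in-memory DICTIONARY, the function check_keyword, the OLD_KEYWORD entry, and the choice of the underscore spelling LOCALLY_APPROVED are all assumptions made for the example.

```python
# Illustrative sketch only; this is not LVTOOL code.
# Hypothetical dictionary keyed by (namespace, element name); None stands for
# the common (global) namespace. The entries and statuses are made up.
DICTIONARY = {
    (None, "DATA_QUALITY_ID"): "APPROVED",
    ("CASSINI", "DATA_QUALITY_ID"): "LOCALLY_APPROVED",
    (None, "OLD_KEYWORD"): "OBSOLETE",  # hypothetical obsolete element
}


def check_keyword(identifier, dictionary=DICTIONARY):
    """Return a (level, message) pair for one label keyword."""
    namespace, sep, name = identifier.partition(":")
    if not sep:
        namespace, name = None, identifier
    # First attribute: does the keyword name exist anywhere in the dictionary?
    if not any(key_name == name for _, key_name in dictionary):
        return "ERROR", f"{name} is not defined in the dictionary"
    # Second attribute: does an instance exist in the requested namespace?
    status = dictionary.get((namespace, name))
    if status is None:
        return "ERROR", f"{name} has no instance in namespace {namespace or 'common'}"
    if status == "OBSOLETE":
        return "WARNING", f"{identifier} is obsolete"
    return "OK", f"{identifier} found with status {status}"


if __name__ == "__main__":
    for keyword in ("DATA_QUALITY_ID", "CASSINI:DATA_QUALITY_ID",
                    "VOYAGER:DATA_QUALITY_ID", "OLD_KEYWORD"):
        print(check_keyword(keyword))
```

Keying the dictionary on the (namespace, name) pair mirrors the uniqueness rule proposed for the PSDD; a real implementation would read PDSDD.FUL and PDSDD.IDX rather than a hard-coded table.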