Over 25 years in the business of organized information
Founder & Principal, Taxonomy Strategies
Director, Solutions Architecture, Interwoven
VP, Infoware, Metacode Technologies (acquired by Interwoven, November 2000)
Program Manager, Getty Foundation
Manager, Pricewaterhouse
Metadata and taxonomies community leadership
President, American Society for Information Science & Technology
Director, Dublin Core Metadata Initiative
Adviser, National Research Council Computer Science and Telecommunications Board
Reviewer, National Science Foundation Division of Information and Intelligent Systems
Founder, Networked Knowledge Organization Systems/Services
Defense Intelligence Agency; Federal Aviation Administration; FirstGov; Forest Service; HeadStart; NASA; Small Business Administration; Social Security Administration; USDA Economic Research Service; USDA OCIO e-Government Program
Dublin Core Metadata Initiative; IDEAlliance
Blue Shield of California; Halliburton; HP; Motorola; PeopleSoft; Sprint; Time, Inc.
for Critical Mass – Fortune 50 retail; for Deloitte Consulting – top credit card issuer
European Standards Organisation; Government of Singapore
Terms | Definitions |
Metadata | Metadata is structured information to describe content. Typical metadata fields are Title, Author, Subject, Publication Date, etc. |
Values | Values for metadata fields may be free text (e.g. Title), a specified data type such as a number or date, or come from a predefined list (e.g. predefined codes for Subjects). |
Controlled Vocabulary | A managed set of terms that have been explicitly defined and agreed upon. All terms in a controlled vocabulary have an unambiguous, non-redundant definition. Additions and deletions are “controlled”, meaning a process must be followed to change the list. |
Taxonomy | Taxonomy is defined as a system for naming and organising things into groups that share similar characteristics. It is a set of terms, organized into a structure. The terms might be the names of people, places, organizations, things, and concepts. The organization may be hierarchical and/or a set of mutually exclusive categories called facets. |
Facet | Facets enable the classification of content from multiple dimensions. A facet is a discrete branch of a taxonomy, with a separately maintained controlled vocabulary. Facet values are given in separate metadata fields. |
Introductions
What is Taxonomy?
What is Dublin Core?
What is the NASA Taxonomy?
Using the NASA Taxonomy
Case Study: JPL Unified Search for Project Information
Linnaeus …
Kingdom | Animalia |
Phylum | Chordata |
Class | Mammalia |
Order | Carnivora |
Family | Canidae |
Genus | Canis |
Species | C. familiaris |
Pets Mammals Farm
UNSPSC …
Segment | 44-Office Equipment and Accessories and Supplies |
Family | .12-Office Supplies |
Class | .17-Writing Instruments |
Commodity | .05-Mechanical pencils; .06-Wooden pencils; .07-Colored pencils |
School Supplies Office Supplies Art Supplies
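A minimal sketch of how the four UNSPSC levels nest inside one code. It assumes the levels are concatenated into a single 8-digit string (44 + 12 + 17 + 05) and that higher levels are written zero-padded; the function names are illustrative only.

```python
from typing import NamedTuple


class UnspscCode(NamedTuple):
    segment: str    # e.g. "44" Office Equipment and Accessories and Supplies
    family: str     # e.g. "12" Office Supplies
    class_: str     # e.g. "17" Writing Instruments
    commodity: str  # e.g. "05" Mechanical pencils


def parse_unspsc(code: str) -> UnspscCode:
    """Split an 8-digit code into its four 2-digit levels."""
    if len(code) != 8 or not code.isdigit():
        raise ValueError(f"expected 8 digits, got {code!r}")
    return UnspscCode(code[0:2], code[2:4], code[4:6], code[6:8])


def ancestors(code: str) -> list[str]:
    """Return the zero-padded codes from Segment down to Commodity."""
    return [code[:n].ljust(8, "0") for n in (2, 4, 6, 8)]


mechanical_pencils = parse_unspsc("44121705")
print(mechanical_pencils)
print(ancestors("44121705"))
```

Because each level is a fixed-width prefix, rolling a commodity up to its Class, Family, or Segment is a string operation rather than a lookup.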
… find the right information at the right time to solve the problem at hand
[Figure: assorted category labels, including Risk Mgmt, Procurement, Banking services, Investors, Public Relations, Regulators, Research, Statistics]
The power of taxonomy facets
4 independent categories of 10 nodes each have the same discriminatory power as one hierarchy of 10,000 nodes (10^4)
Easier to maintain
Can be easier to navigate
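The arithmetic behind this claim can be checked with a short sketch. The facet and value names below are placeholders, not part of any real taxonomy.

```python
from itertools import product

# Four hypothetical facets, each with ten values.
facets = {
    f"facet_{i}": [f"f{i}_value_{j}" for j in range(10)]
    for i in range(1, 5)
}

# Terms that must be maintained: the sum of the facet sizes.
terms_to_maintain = sum(len(values) for values in facets.values())

# Distinct classifications: one value chosen from each facet.
combinations = sum(1 for _ in product(*facets.values()))

print(terms_to_maintain, combinations)
```

Maintenance cost grows with the sum of the facet sizes (40 terms), while the number of distinguishable combinations grows with their product (10,000).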
Dublin Core (DC) is the metadata standard for describing Internet resources so they are easy to find.
DC is being used as the starting point for many metadata specifications.
DC is a set of 15 basic elements. All are optional and repeatable.
Original Dublin Core workshop held in Dublin, Ohio (1995).
Approved as ISO 15836 (2003).
Shanghai meeting (2004).
Identifier | Date |
Title | Source |
Creator | Relation |
Contributor | Rights |
Publisher | Format |
Subject | Type |
Description | Language |
Coverage | |
For more information:
http://www.dublincore.org
A Small Metadata and Taxonomy Example
Metadata Standard
Field | Example Value |
Identifier | www.iras.gov.sg/taxation/income_tax.html |
Creator | MOF > IRAS |
Title | Taxation in Singapore |
Date | 2002-01-11 |
Subject | Business > Taxation > Income Tax |
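The example record above can be held as a simple data structure. This is a sketch only: the lower-case element names, the list-per-element layout (to reflect "optional and repeatable"), and the helper names are assumptions, not a prescribed encoding.

```python
from datetime import date

# The 15 Dublin Core elements listed earlier.
DC_ELEMENTS = {
    "identifier", "title", "creator", "contributor", "publisher",
    "subject", "description", "coverage", "date", "source",
    "relation", "rights", "format", "type", "language",
}

# Every element is optional and repeatable, so each maps to a list.
record = {
    "identifier": ["www.iras.gov.sg/taxation/income_tax.html"],
    "creator": ["MOF > IRAS"],
    "title": ["Taxation in Singapore"],
    "date": ["2002-01-11"],
    "subject": ["Business > Taxation > Income Tax"],
}


def unknown_elements(rec: dict) -> list[str]:
    """Return field names that are not Dublin Core elements."""
    return sorted(set(rec) - DC_ELEMENTS)


def taxonomy_path(value: str) -> list[str]:
    """Split a 'Broader > Narrower' value into its hierarchy levels."""
    return [part.strip() for part in value.split(">")]


print(unknown_elements(record))
print(taxonomy_path(record["subject"][0]))
print(date.fromisoformat(record["date"][0]).year)
```

Here Title is free text, Date is a typed value, and Subject and Creator come from hierarchical controlled lists.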
PROJECT 1 Fall 2002
-Identify & survey stakeholders
PROJECT 3? Follow-on Work
-Integrate with applications
-Long term maintenance
NASA Taxonomy Web Site and Resource
http://nasataxonomy.jpl.nasa.gov/
http://ringmaster.arc.nasa.gov/jupiter/jupiter.html#index
Reference Resource
Attribute | Values |
Information | Web Sites; Animations; Images; Reference Sources |
Audiences | Educators; Students |
Organizations | Ames Research Center |
Missions and Projects | Voyager; Galileo; Cassini; Hubble Space Telescope |
Industries | N/A |
Locations | Jupiter |
Functions | Scientific and Technical Information |
Disciplines | Planetary and Lunar Science |
Chronology | 1979-1999 |
http://wufs.wustl.edu/missions/odyssey/#Odyssey%20Data%20Sets
Data Archive
Attribute | Values |
Information | Data Files; Web Sites |
Audiences | Researchers; Scientists |
Organizations | Jet Propulsion Laboratory |
Missions and Projects | Mars Odyssey |
Industries | N/A |
Locations | Mars |
Functions | Scientific and Technical Information |
Disciplines | Planetary and Lunar Science |
Chronology | 2002-present |
http://www.cmf.nrl.navy.mil/clementine/
Web Site
Attribute | Values |
Information | Web Sites; Data Files; Images |
Audiences | Researchers; Scientists; Educators; Students |
Organizations | Naval Research Laboratory |
Missions and Projects | Clementine |
Industries | N/A |
Locations | The Moon |
Functions | Scientific and Technical Information |
Disciplines | Planetary and Lunar Science |
Chronology | 1994 |
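A sketch of how facet values like those in the three tables above could drive browsing. The values are taken from the tables; the `browse` helper and the set-per-facet record layout are illustrative assumptions, not how any NASA system is implemented.

```python
# One entry per tagged resource; each facet holds a set of values.
resources = [
    {
        "name": "Reference Resource",
        "Information": {"Web Sites", "Animations", "Images", "Reference Sources"},
        "Audiences": {"Educators", "Students"},
        "Missions and Projects": {"Voyager", "Galileo", "Cassini", "Hubble Space Telescope"},
        "Locations": {"Jupiter"},
    },
    {
        "name": "Data Archive",
        "Information": {"Data Files", "Web Sites"},
        "Audiences": {"Researchers", "Scientists"},
        "Missions and Projects": {"Mars Odyssey"},
        "Locations": {"Mars"},
    },
    {
        "name": "Web Site",
        "Information": {"Web Sites", "Data Files", "Images"},
        "Audiences": {"Researchers", "Scientists", "Educators", "Students"},
        "Missions and Projects": {"Clementine"},
        "Locations": {"The Moon"},
    },
]


def browse(items: list[dict], selection: dict[str, str]) -> list[str]:
    """Return names of items that carry every selected facet value."""
    return [
        item["name"]
        for item in items
        if all(value in item.get(facet, set()) for facet, value in selection.items())
    ]


print(browse(resources, {"Audiences": "Students"}))
print(browse(resources, {"Information": "Data Files", "Locations": "Mars"}))
```

Each added facet selection narrows the result set independently of the others, which is what makes separate metadata fields per facet useful.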
Do You Have to Use All Fields in Every Case?
The NASA Taxonomy is designed to be used in many different scenarios.
5 Use Case Scenarios
Publishing to the NASA public portal.
Could use these NASA Taxonomy elements: Audience, Content Type, Coverage (might be regional), Mission/Project, Subject
Publishing to a NASA engineering portal.
Could use these NASA Taxonomy elements: Audience, Competency, Content Type, Mission/Project, Subject, Instrument
Integrate information across multiple Centers for management reporting.
Could use these NASA Taxonomy elements: Competency, Content Type, Mission/Project, Subject, Business Purpose
Records Retention and Archiving.
Could use these NASA Taxonomy elements
Building our knowledge base
Assist in browse and navigation through large collections of disparate information objects
Query multiple repositories with a single, unified interface
Could use these NASA Taxonomy elements
Spacecraft anomalies may require research across Problem Failure Reporting systems, PDMS systems, risk management databases, and many others
Must be confident that all of the relevant material has been found
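The idea that each scenario uses only a subset of the fields can be sketched as a set of profiles. The element lists come from the first three scenarios above; the scenario keys, the sample record, and the `project` helper are hypothetical.

```python
# Elements each scenario could use, as listed for the first three use cases.
PROFILES = {
    "public portal": [
        "Audience", "Content Type", "Coverage", "Mission/Project", "Subject",
    ],
    "engineering portal": [
        "Audience", "Competency", "Content Type", "Mission/Project",
        "Subject", "Instrument",
    ],
    "management reporting": [
        "Competency", "Content Type", "Mission/Project", "Subject",
        "Business Purpose",
    ],
}


def project(record: dict, scenario: str) -> dict:
    """Keep only the fields that the scenario's profile uses."""
    return {f: record[f] for f in PROFILES[scenario] if f in record}


# Elements shared by all three profiles.
shared = set.intersection(*(set(fields) for fields in PROFILES.values()))

# Hypothetical record, for illustration only.
sample = {
    "Audience": "Students",
    "Content Type": "Web Sites",
    "Mission/Project": "Cassini",
    "Subject": "Planetary and Lunar Science",
    "Instrument": "camera",
}

print(sorted(shared))
print(project(sample, "public portal"))
```

The shared elements (Content Type, Mission/Project, Subject) are the ones a cross-scenario search could rely on.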
Fragmented and non-interoperable repositories
Inefficient and broken processes and applications
Parallel and redundant efforts both in building information systems and managing data
Limited tools and services that cut across program and line organizations
Engineering | 86 |
Science | 8 |
Business/Admin. | 28 |
Infrastructure | 28 |
Outreach | 1 |
Total | 157 |
JPL data dictionaries are too narrow to interoperate
EAs seeking “data harmonization”
Semantic frameworks allow for mappings of data elements to larger vocabularies
Semantic relationships require more than simple controlled vocabularies
RDF statements allow specification of relationships for rules based inferencing
An integrated semantic architecture that mirrors and extends the enterprise architecture
JPL Taxonomy: controlled vocabularies for JPL engineering communities
-By discipline, product, and process, etc.
Centrally managed authority files for significant JPL asset attributes (i.e., project names, etc.)
JPL Technical Thesaurus – equivalencies documented in RDF files
Use semantic tools to present a unified navigation and search capability through JPL repositories
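A rough sketch of what rules based inferencing over equivalence statements could look like. It uses plain (subject, predicate, object) tuples rather than real RDF files, and the `equivalentTo` predicate and the example terms are illustrative assumptions, not entries from the JPL Technical Thesaurus.

```python
Triple = tuple[str, str, str]

# Illustrative statements only.
triples: set[Triple] = {
    ("PFR", "equivalentTo", "Problem Failure Report"),
    ("Problem Failure Report", "equivalentTo", "Problem/Failure Report"),
}


def equivalents(term: str, statements: set[Triple]) -> set[str]:
    """Infer all terms equivalent to `term`.

    Rule applied: equivalentTo is symmetric and transitive.
    """
    neighbours: dict[str, set[str]] = {}
    for subject, predicate, obj in statements:
        if predicate == "equivalentTo":
            neighbours.setdefault(subject, set()).add(obj)
            neighbours.setdefault(obj, set()).add(subject)

    seen, stack = {term}, [term]
    while stack:
        for nxt in neighbours.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen - {term}


print(equivalents("PFR", triples))
```

A search for any one form of the term can then be expanded to the others, even though no single statement links the first and last directly.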
Case Study Goal: Allow Cassini flight project operations teams to match anomalous behavior from spacecraft to engineering design specifications for problem resolution.
Collections: PFR System and the Cassini Electronic Library – Not a common metadata schema
PFR System | Cassini Electronic Library |
Project Name | Project Name |
Anomaly Type | Content Type |
Subsystem | System |
Report Status | Project level |
Date | Responsible Team/WBS |
 | Date |
NASA Taxonomy
Content Types
-Designs and Specifications
JPL Taxonomy
-Incident Surprise Anomaly
-Corrective Action Notice
Collections: PFR System and the Cassini Electronic Library
Mapping fields to each other using semantic hierarchies
Project Name
Content Type
System
Subsystem
Responsible Team/WBS
Date
Collection
A system whereby the user can browse all documents relating to the Cassini camera and its subsystem independent of any particular repository’s search engine.
Harmonization achieved by mapping terms to a common vocabulary (the Taxonomy)
Could browse by:
-System, Sub-system
-Instrument
-Content Type – PFRs, ECRs, Design Specs, etc.
-WBS or Responsible Team
-Date
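A minimal sketch of the harmonization step, assuming the left-hand fields in the comparison belong to the PFR System and the right-hand fields to the Cassini Electronic Library. The field-to-field mapping, the sample records, and the helper names are hypothetical and only illustrate the technique.

```python
# Local field name -> common field name, per collection (assumed mapping).
FIELD_MAP = {
    "PFR System": {
        "Project Name": "Project Name",
        "Anomaly Type": "Content Type",
        "Subsystem": "Subsystem",
        "Date": "Date",
    },
    "Cassini Electronic Library": {
        "Project Name": "Project Name",
        "Content Type": "Content Type",
        "System": "System",
        "Responsible Team/WBS": "Responsible Team/WBS",
        "Date": "Date",
    },
}


def harmonize(collection: str, record: dict) -> dict:
    """Rename mapped fields to the common vocabulary; drop unmapped ones."""
    mapping = FIELD_MAP[collection]
    common = {mapping[k]: v for k, v in record.items() if k in mapping}
    common["Collection"] = collection
    return common


# Hypothetical records, for illustration only.
pfr = harmonize("PFR System", {
    "Project Name": "Cassini",
    "Anomaly Type": "Incident Surprise Anomaly",
    "Subsystem": "camera",
    "Report Status": "open",
})
spec = harmonize("Cassini Electronic Library", {
    "Project Name": "Cassini",
    "Content Type": "Designs and Specifications",
    "System": "camera",
})

unified = [pfr, spec]
hits = [r["Collection"] for r in unified if r["Project Name"] == "Cassini"]
print(hits)
```

Once both collections share field names, one query over the unified list replaces a separate search in each repository.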
Leverage what projects produce in the normal course of their business
WBS lists; Document trees; Document matrices; DDCR, Flight Project Practices
There are many un-mined sources for semantic processing!
Contact Information:
Jayne Dutra, JPL
Jayne.E.Dutra@jpl.nasa.gov
(818) 354-6948
Joseph Busch, Taxonomy Strategies
jbusch@taxonomystrategies.com
(415) 377-7912