Taxonomy Use Examples

GSFC Project Librarians' Meeting October 28, 2004

Over 25 years in the business of organized information

- Founder & Principal, Taxonomy Strategies
- Director, Solutions Architecture, Interwoven
- VP, Infoware, Metacode Technologies (acquired by Interwoven, November 2000)
- Program Manager, Getty Foundation
- Manager, Pricewaterhouse

Metadata and taxonomies community leadership

- President, American Society for Information Science & Technology
- Director, Dublin Core Metadata Initiative
- Adviser, National Research Council Computer Science and Telecommunications Board
- Reviewer, National Science Foundation Division of Information and Intelligent Systems
- Founder, Networked Knowledge Organization Systems/Services

Government

- Defense Intelligence Agency
- Federal Aviation Administration
- FirstGov
- Forest Service
- HeadStart
- NASA
- Small Business Administration
- Social Security Administration
- USDA Economic Research Service
- USDA OCIO e-Government Program

Non-Profit

- Dublin Core Metadata Initiative
- IDEAlliance

Commercial

- Blue Shield of California
- Halliburton
- HP
- Motorola
- PeopleSoft
- Sprint
- Time, Inc.
- for Critical Mass: Fortune 50 retail
- for Deloitte Consulting: top credit card issuer

International

- European Standards Organisation
- Government of Singapore

Terms and Definitions

Metadata: Metadata is structured information that describes content. Typical metadata fields are Title, Author, Subject, Publication Date, etc.

Values: Values for metadata fields may be free text (e.g. Title), a specified data type such as a number or date, or come from a predefined list (e.g. predefined codes for Subjects).

Controlled Vocabulary: A managed set of terms that have been explicitly defined and agreed upon. All terms in a controlled vocabulary have an unambiguous, non-redundant definition. Additions and deletions are “controlled”, meaning a process must be followed to change the list.

Taxonomy: Taxonomy is defined as a system for naming and organizing things into groups that share similar characteristics. It is a set of terms, organized into a structure. The terms might be the names of people, places, organizations, things, and concepts. The organization may be hierarchical and/or a set of mutually exclusive categories called facets.

Facet: Facets enable the classification of content from multiple dimensions. A facet is a discrete branch of a taxonomy, with a separately maintained controlled vocabulary. Facet values are given in separate metadata fields.

Agenda

- Introductions
- What is Taxonomy?
- What is Dublin Core?
- What is the NASA Taxonomy?
- Using the NASA Taxonomy
- Case Study: JPL Unified Search for Project Information

Rank     Example
Kingdom  Animalia
Phylum   Chordata
Class    Mammalia
Order    Carnivora
Family   Canidae
Genus    Canis
Species  C. familiaris

Linnaeus …

Pets Mammals Farm

UNSPSC

Level      Code and title
Segment    44-Office Equipment and Accessories and Supplies
Family     .12-Office Supplies
Class      .17-Writing Instruments
Commodity  .05-Mechanical pencils
           .06-Wooden pencils
           .07-Colored pencils

School Supplies Office Supplies Art Supplies
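The UNSPSC levels above (segment, family, class, commodity) each contribute two digits to an eight-digit code. The sketch below splits such a code into its level prefixes. It assumes the wooden pencil entry assembles to 44121706 from the 44 / .12 / .17 / .06 labels on the slide; the function and type names are hypothetical.

```python
from typing import NamedTuple


class UnspscCode(NamedTuple):
    segment: str    # first 2 digits
    family: str     # first 4 digits
    class_: str     # first 6 digits
    commodity: str  # all 8 digits


def split_unspsc(code: str) -> UnspscCode:
    """Split an eight-digit code into cumulative level prefixes."""
    if len(code) != 8 or not code.isdigit():
        raise ValueError(f"expected 8 digits, got {code!r}")
    return UnspscCode(code[:2], code[:4], code[:6], code[:8])


# Assumed code for "Wooden pencils", built from the slide's labels.
wooden_pencils = split_unspsc("44121706")
print(wooden_pencils.segment)  # 44
print(wooden_pencils.class_)   # 441217
```

Because each level is a prefix of the next, rolling a commodity up to its class or family is a simple string truncation.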

… find the right information at the right time to solve the problem at hand

[Figure: sample category labels from a business taxonomy, including Business Units, Procurement, Risk Mgmt, Investors, Marketing, Public Relations, Employees, Partners, Vendors and Regulators]

The power of taxonomy facets

4 independent categories of 10 nodes each have the same discriminatory power as one hierarchy of 10,000 nodes (10^4)

- Easier to maintain
- Can be easier to navigate
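The arithmetic behind this claim can be checked with a short sketch. The four facet names and their placeholder values are hypothetical; only the counts (4 facets, 10 values each) come from the slide.

```python
from itertools import product

# Four hypothetical facets with ten placeholder values each.
facets = {
    name: [f"{name}-{i}" for i in range(10)]
    for name in ("audience", "content_type", "location", "discipline")
}

# Each choice of one value per facet is a distinct classification.
combinations = list(product(*facets.values()))

# Terms an editor has to maintain versus distinctions they can express.
terms_to_maintain = sum(len(values) for values in facets.values())
print(terms_to_maintain)  # 40
print(len(combinations))  # 10000
```

Forty maintained terms yield the same 10,000 distinctions that a single hierarchy would need 10,000 leaf nodes to express.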


- Dublin Core (DC) is the Metadata Standard for describing Internet resources so they are easy to find.
- DC is being used as the starting point for many metadata specifications.
- DC is a set of 15 basic elements. All are optional and repeatable.

1995: Original Dublin Core workshop held in Dublin, Ohio.
2003: Approved as ISO 15836.
2004: Shanghai meeting.

Identifier    Date
Title         Source
Creator       Relation
Contributor   Rights
Publisher     Format
Subject       Type
Description   Language
Coverage

For more information:

http://www.dublincore.org

A Small Metadata and Taxonomy Example

Metadata Standard

Field Example Value
Identifier www.iras.gov.sg/taxation/income_tax.html
Creator MOF > IRAS
Title Taxation in Singapore
Date 2002-01-11
Subject Business > Taxation > Income Tax
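As a minimal sketch, the example record above can be held as a plain mapping from Dublin Core element names to values. The helper names are hypothetical, and treating ">" as the hierarchy separator is an assumption based on how the Creator and Subject values are written.

```python
from typing import Dict, List

# The example record from the table, keyed by Dublin Core element.
record: Dict[str, str] = {
    "identifier": "www.iras.gov.sg/taxation/income_tax.html",
    "creator": "MOF > IRAS",
    "title": "Taxation in Singapore",
    "date": "2002-01-11",
    "subject": "Business > Taxation > Income Tax",
}


def split_path(value: str, separator: str = ">") -> List[str]:
    """Split a hierarchical value into its individual terms."""
    return [part.strip() for part in value.split(separator)]


def ancestors(value: str) -> List[str]:
    """Return every level of the path, broadest first."""
    parts = split_path(value)
    return [" > ".join(parts[: i + 1]) for i in range(len(parts))]


print(split_path(record["creator"]))
print(ancestors(record["subject"]))
```

Indexing a record under every level returned by `ancestors` lets a browse on "Business > Taxation" find content tagged with the narrower "Income Tax" term.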


PROJECT 3? Follow-on Work

- Integrate with applications
- Long term maintenance

PROJECT 1 Fall 2002

- Identify & survey stakeholders

NASA Taxonomy Web Site and Resource

http://nasataxonomy.jpl.nasa.gov/

http://ringmaster.arc.nasa.gov/jupiter/jupiter.html#index

Reference Resource

Attribute              Values
Information            Web Sites; Animations; Images; Reference Sources
Audiences              Educators; Students
Organizations          Ames Research Center
Missions and Projects  Voyager; Galileo; Cassini; Hubble Space Telescope
Industries             N/A
Locations              Jupiter
Functions              Scientific and Technical Information
Disciplines            Planetary and Lunar Science
Chronology             1979-1999

http://wufs.wustl.edu/missions/odyssey/#Odyssey%20Data%20Sets

Data Archive

Attribute              Values
Information            Data Files; Web Sites
Audiences              Researchers; Scientists
Organizations          Jet Propulsion Laboratory
Missions and Projects  Mars Odyssey
Industries             N/A
Locations              Mars
Functions              Scientific and Technical Information
Disciplines            Planetary and Lunar Science
Chronology             2002-present

http://www.cmf.nrl.navy.mil/clementine/

Web Site

Attribute              Values
Information            Web Sites; Data Files; Images
Audiences              Researchers; Scientists; Educators; Students
Organizations          Naval Research Laboratory
Missions and Projects  Clementine
Industries             N/A
Locations              The Moon
Functions              Scientific and Technical Information
Disciplines            Planetary and Lunar Science
Chronology             1994
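The three tagged resources above can illustrate how facet values support filtering. This is a minimal sketch, not the NASA Taxonomy implementation: it copies four of the facets from the tables into Python sets, and the `search` function is a hypothetical helper.

```python
from typing import Dict, List, Set

Record = Dict[str, Set[str]]

# Facet values copied from the three example tables (subset of facets).
records: Dict[str, Record] = {
    "http://ringmaster.arc.nasa.gov/jupiter/jupiter.html#index": {
        "Audiences": {"Educators", "Students"},
        "Organizations": {"Ames Research Center"},
        "Locations": {"Jupiter"},
        "Disciplines": {"Planetary and Lunar Science"},
    },
    "http://wufs.wustl.edu/missions/odyssey/#Odyssey%20Data%20Sets": {
        "Audiences": {"Researchers", "Scientists"},
        "Organizations": {"Jet Propulsion Laboratory"},
        "Locations": {"Mars"},
        "Disciplines": {"Planetary and Lunar Science"},
    },
    "http://www.cmf.nrl.navy.mil/clementine/": {
        "Audiences": {"Researchers", "Scientists", "Educators", "Students"},
        "Organizations": {"Naval Research Laboratory"},
        "Locations": {"The Moon"},
        "Disciplines": {"Planetary and Lunar Science"},
    },
}


def search(criteria: Dict[str, str]) -> List[str]:
    """Return URLs whose facets contain every requested value."""
    return [
        url
        for url, facets in records.items()
        if all(value in facets.get(name, set()) for name, value in criteria.items())
    ]


print(search({"Audiences": "Educators"}))
print(search({"Audiences": "Researchers", "Locations": "Mars"}))
```

All three resources share the same Disciplines value, so it is the Audiences and Locations facets that separate them here.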

Do You Have to Use All Fields in Every Case?

No, use what is appropriate to the case at hand.

The NASA Taxonomy is designed to be used in many different scenarios.

5 Use Case Scenarios

Publishing to the NASA public portal.

Could use these NASA Taxonomy elements:

- Audience
- Content Type
- Coverage (might be regional)
- Mission/Project
- Subject

Publishing to a NASA engineering portal.

Could use these NASA Taxonomy elements:

- Audience
- Competency
- Content Type
- Mission/Project
- Subject
- Instrument

Integrate information across multiple Centers for management reporting.

Could use these NASA Taxonomy elements:

- Competency
- Content Type
- Mission/Project
- Subject
- Business Purpose

Records Retention and Archiving.

Could use these NASA Taxonomy elements

Building our knowledge base

- Assist in browse and navigation through large collections of disparate information objects

Query multiple repositories with a single, unified interface

Could use these NASA Taxonomy elements:

- Spacecraft anomalies may require research across Problem Failure Reporting systems, PDMS systems, risk management databases, and many others
- Must be confident that all of the relevant material has been found
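The element lists given for the first three scenarios above amount to per-scenario profiles over one shared set of fields. A minimal sketch of that idea follows; the scenario keys, the `project` helper and the sample record values are all hypothetical.

```python
from typing import Dict, List

# Element lists copied from the first three use case scenarios.
SCENARIO_ELEMENTS: Dict[str, List[str]] = {
    "public portal": [
        "Audience", "Content Type", "Coverage", "Mission/Project", "Subject",
    ],
    "engineering portal": [
        "Audience", "Competency", "Content Type", "Mission/Project",
        "Subject", "Instrument",
    ],
    "management reporting": [
        "Competency", "Content Type", "Mission/Project", "Subject",
        "Business Purpose",
    ],
}


def project(record: Dict[str, str], scenario: str) -> Dict[str, str]:
    """Keep only the elements the given scenario uses."""
    wanted = SCENARIO_ELEMENTS[scenario]
    return {name: record[name] for name in wanted if name in record}


# Hypothetical record; the values are for illustration only.
sample = {
    "Audience": "Researchers",
    "Content Type": "Designs and Specifications",
    "Mission/Project": "Cassini",
    "Subject": "Planetary and Lunar Science",
    "Instrument": "Camera",
}

print(list(project(sample, "public portal")))
print(list(project(sample, "management reporting")))
```

A record tagged once can then be published to several destinations, each seeing only the fields appropriate to it.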


- Fragmented and non-interoperable repositories
- Inefficient and broken processes and applications
- Parallel and redundant efforts both in building information systems and managing data
- Limited tools and services that cut across program and line organizations

Data Repositories Identified

Engineering      86
Science          8
Business/Admin.  28
Infrastructure   28
Outreach         1
Total            157

- JPL data dictionaries are too narrow to interoperate
- EAs seeking “data harmonization”
- Semantic frameworks allow for mappings of data elements to larger vocabularies
  - Semantic relationships require more than simple controlled vocabularies
  - RDF statements allow specification of relationships for rules-based inferencing
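To illustrate what rules-based inferencing over relationship statements can look like, here is a minimal sketch using plain Python tuples in place of real RDF. The "broader" predicate, the intermediate term "Problem Reports" and the way the terms are linked are assumptions for illustration, not the JPL vocabulary's actual structure.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical statements; "Problem Reports" is an invented intermediate term.
triples: Set[Triple] = {
    ("Incident Surprise Anomaly", "broader", "Problem Reports"),
    ("Corrective Action Notice", "broader", "Problem Reports"),
    ("Problem Reports", "broader", "Content Types"),
}


def infer_transitive(statements: Set[Triple], predicate: str) -> Set[Triple]:
    """Apply the rule: (a p b) and (b p c) implies (a p c), until stable."""
    inferred = set(statements)
    changed = True
    while changed:
        changed = False
        for a, p1, b in list(inferred):
            for b2, p2, c in list(inferred):
                if p1 == p2 == predicate and b == b2:
                    new = (a, predicate, c)
                    if new not in inferred:
                        inferred.add(new)
                        changed = True
    return inferred


closure = infer_transitive(triples, "broader")
print(("Incident Surprise Anomaly", "broader", "Content Types") in closure)  # True
```

The inferred statement was never asserted directly. A flat controlled vocabulary has no relationships to reason over, so it cannot supply it.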

An integrated semantic architecture that mirrors and extends the enterprise architecture

- JPL Taxonomy: controlled vocabularies for JPL engineering communities
  - By discipline, product, and process, etc.
- Centrally managed authority files for significant JPL asset attributes (i.e., project names, etc.)
- JPL Technical Thesaurus: equivalencies documented in RDF files
- Use semantic tools to present a unified navigation and search capability through JPL repositories

Case Study Goal: Allow Cassini flight project operations teams to match anomalous behavior from spacecraft to engineering design specifications for problem resolution.

  1. Characterize targeted databases/repositories
     - ECR, PFRS, Docushare, Team Center, et al.

  2. Create RDF from data architectures

  3. Queries identify fields of interest using semantic properties and return integrated result sets

Collections: PFR System and the Cassini Electronic Library

Not a common metadata schema

PFR:             CEL:
Project Name     Project Name
Anomaly Type     Content Type
Subsystem        System
Report Status    Project level
Date             Responsible Team/WBS
                 Date

Collections: PFR System and the Cassini Electronic Library

Mapping fields to each other using semantic hierarchies

NASA Taxonomy

Content Types
- Designs and Specifications

JPL Taxonomy

- Incident Surprise Anomaly
- Corrective Action Notice

Search and Browse the catalogue by:

- Project Name
- Content Type
- System
- Subsystem
- Responsible Team/WBS
- Date
- Collection

A system whereby the user can browse all documents relating to the Cassini camera and its subsystem independent of any particular repository’s search engine.

Harmonization achieved by mapping terms to a common vocabulary (the Taxonomy)
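A minimal sketch of this kind of harmonization follows, using the PFR and CEL field names listed earlier. Which source field maps to which common field (for example PFR "Anomaly Type" to "Content Type") is an assumption, and the two sample records are invented for illustration.

```python
from typing import Dict, List

# Assumed mapping from each collection's fields to the common browse fields.
FIELD_MAP: Dict[str, Dict[str, str]] = {
    "PFR": {
        "Project Name": "Project Name",
        "Anomaly Type": "Content Type",
        "Subsystem": "Subsystem",
        "Date": "Date",
    },
    "CEL": {
        "Project Name": "Project Name",
        "Content Type": "Content Type",
        "System": "System",
        "Responsible Team/WBS": "Responsible Team/WBS",
        "Date": "Date",
    },
}


def harmonize(collection: str, record: Dict[str, str]) -> Dict[str, str]:
    """Rename a record's fields to the common vocabulary and tag its source."""
    mapping = FIELD_MAP[collection]
    common = {mapping[f]: v for f, v in record.items() if f in mapping}
    common["Collection"] = collection
    return common


# Invented sample records, one per collection.
pfr = {"Project Name": "Cassini", "Anomaly Type": "Incident Surprise Anomaly",
       "Subsystem": "Camera", "Report Status": "Open", "Date": "2004-01-15"}
cel = {"Project Name": "Cassini", "Content Type": "Designs and Specifications",
       "System": "Camera", "Date": "1996-05-02"}

catalogue: List[Dict[str, str]] = [harmonize("PFR", pfr), harmonize("CEL", cel)]
hits = [r["Collection"] for r in catalogue if r.get("Project Name") == "Cassini"]
print(hits)  # ['PFR', 'CEL']
```

Once both collections share field names, a single query spans them without depending on either repository's own search engine.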

Could browse by:

- System, Sub-system
- Instrument
- Content Type: PFRs, ECRs, Design Specs, etc.
- WBS or Responsible Team
- Date

Leverage what projects produce in the normal course of their business

- WBS lists
- Document trees
- Document matrices
- DDCR, Flight Project Practices

There are many un-mined sources for semantic processing!

Contact Information:

Jayne Dutra, JPL

Jayne.E.Dutra@jpl.nasa.gov

(818) 354-6948

Joseph Busch, Taxonomy Strategies

jbusch@taxonomystrategies.com

(415) 377-7912