Workshop on Data Derivation and Provenance
Chicago, October 17-18, 2002
Organizing Committee: Peter Buneman, Ian Foster
Provenance and data derivation are important to many aspects of scientific computation. In molecular biology, where data is
repeatedly copied, corrected, and transformed as it passes through numerous genomic databases, understanding where data has come from and
how it arrived in the user's database is crucial to the trust a scientist will put in that data, yet this information is seldom captured
properly. In astronomy, useful results may have been obtained by filtering, transforming, and analyzing some base data by a
complex assemblage of programs, yet we lack good tools for recording how these programs were connected and the context in which they were
run.
The importance of provenance goes well beyond verification; it is closely related to archiving and annotation, which are also important in the context of scientific data. Moreover, it may be used in data discovery. Knowing the provenance of a data item may help the biologist to make connections with other useful data. The astronomer may want to understand a derivation in order to repeat it with modified parameters, and being able to describe a derivation may help a researcher to discover whether a particular kind of analysis has already been performed.
The purpose of this workshop is to bring together a group of researchers who have confronted these issues either in specific situations or in the development of generic principles and technology. The workshop is informal and will consist of a mixture of presentations and discussions.
Provisional Schedule and Agenda
The workshop will start with coffee and stuff at 8am on Thursday, October 17th and end at 4pm on Friday, October 18th. We will arrange a group dinner on Thursday and, if there is interest, on Friday also. The location is the Sheraton hotel (address below), Superior 1 and 2.
Thursday, October 17th
08:00 Coffee etc.
09:00-10:30 Introduction and scene-setting talks; review goals and format
10:30-11:00 Break
11:00-12:30 Session 1: Requirements and applications in the biological sciences
14:00-15:30 Session 2: Provenance and annotations
15:30-16:00 Break
16:00-17:30 Session 3: Workflow and derivation
17:30-18:30 Discussion
Friday, October 18th
08:30-10:00 Session 4: Requirements and applications in other sciences
10:00-10:30 Break
10:30-12:00 Session 5: Archiving and versioning
13:30-14:30 Open mike
14:30-16:00 Discussion to synthesize conclusions
Position Papers are now available online (they are still coming in...)
Location and Accommodation Information (NB: Hotel Block Only Good Until September 26th)
The meeting will take place at the same hotel as the Global Grid Forum, i.e.
Sheraton Chicago Hotel & Towers
CityFront Center
301 East North Water Street
Chicago, Illinois, USA
We have negotiated a group rate as follows:
Single or Double Occupancy $155.00 US
Triple Occupancy $185.00 US
Quad Occupancy $215.00 US
a) If you are attending the Global Grid Forum meeting, then register in the usual way at http://www.gridforum.org/Meetings/ggf6/
b) If you are only attending the workshop (which is the case for most of you), then book directly by calling 877-242-2558 (or 312-464-1000) and indicate that you are with the ANL/Data Workshop. (NOTE: This should now be working, despite earlier difficulties.)
Participant List
Malcolm Atkinson, U. Glasgow
Bruce Barkstrom, NASA
Raj Bose, UC Santa Barbara
Peter Buneman, U. Edinburgh
Maria Cláudia Reis Cavalcanti, UFRJ, Brazil
Rick Cavanaugh, U. Florida
Vassilis Christophides, FORTH
Umeshwar Dayal, HP Labs
Ewa Deelman, USC Information Sciences Institute
Ian Foster, Argonne/U.Chicago
Peter Fox, National Center for Atmospheric Research
Mike Franklin, UC Berkeley
Jim Frew, UC Santa Barbara
Rob Gardner, U.Chicago
Michael Gertz, UC Davis
Carole Goble, U. Manchester
Greg Graham, FermiLab
Bill Howe, Oregon Graduate Institute
Yannis Ioannidis, University of Athens
Sanjeev Khanna, U. Pennsylvania
Christoph Koch, Edinburgh
Michael Lesk, Bell Labs
Miron Livny, U. Wisconsin Madison
David Maier, Oregon Graduate Institute
Natalia Maltsev, Argonne National Laboratory
Bob Mann, U. Edinburgh
Marta Mattoso, UFRJ, Brazil
Jim Myers, Pacific Northwest National Laboratory
Norman Paton, U. Manchester
Carmen Pancerella, Sandia National Laboratory
Dave Pearson, Oracle UK
Larry Rahn, Sandia National Laboratory
Joel Saltz, Ohio State University
Alex Szalay, Johns Hopkins University
Wang-Chiew Tan, UC Santa Cruz
Jens Voeckler, U. Chicago
Mike Wilde, Argonne National Laboratory
Yong Zhao, U. Chicago
Thanks to our Sponsors
www.griphyn.org | www.nsf.gov | www.sc.doe.gov/ascr/mics