James Myers, Al Geist, Jens Schwidder, Alan Chappell, Tara Talbott, Mike Peterson, Carina Lansing
During the last quarter, the SAM team focused its development efforts on migrating the SAM Client libraries to the Slide version 2 code base, on prototyping web-based notebook components, on hardening SAM 1.1, and on improving portal and Grid integration. A subcontract from the NSF-sponsored George E. Brown, Jr. Network for Earthquake Engineering Simulation Grid (NEESgrid) project was executed to support integration of the SAM-based Electronic Laboratory Notebook (ELN) and aspects of SAM itself into the NEESgrid software suite. Complementing this work were a large number of community interactions at major conferences and coordination with other efforts in developing new proposals. SAM-related presentations were given at the NEES Awardees Meeting, the SciDAC PI Meeting, and the American Chemical Society Meeting.
Ongoing work includes preparation of a SAM 1.2 bug-fix release that adds support for web-service transforms, definition of a SAM 2.0 release including support for versioning and semantically scoped queries, development of the Data Format Description Language (DFDL) within the Global Grid Forum, and interactions with the Jakarta Slide project and the Java Content Repository standard Expert Groups. NEESgrid-funded integration work will continue in the next quarter.
Data Grid Integration: The SAM team has investigated options for integrating SAM's naming, annotation, translation, and records capabilities with underlying Data Grid repositories. Recent development work has targeted connections to Data Grids via GridFTP that support discovery of existing Data Grid contents and integration of access control and other file-system metadata.
Semantic Grid: With the release of initial RDF functionality in SAM 1.1, and growing community interest in semantic data mapping, the SAM team has shifted emphasis towards the detailed design of the Semantic Services (SS) layer and Semantic Grid concepts. Several initial capabilities are being developed and evolved to help elicit requirements while design work proceeds towards a more comprehensive mechanism. In particular, this quarter the RDF pedigree/provenance property has been enhanced to include reified information concerning the software used to generate derived data. Detailed planning has begun for implementing a semantic search capability accessible via WebDAV's DAV Searching and Locating (DASL) search mechanism.
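To illustrate what reified provenance information can look like, the following is a minimal sketch using the standard RDF reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate, rdf:object). It is not SAM's actual schema: the provenance namespace, the derivedFrom, generatedBy, and softwareVersion property names, and all resource URLs are hypothetical and chosen only for illustration.

```python
# Sketch only: the EX vocabulary and all property names below are hypothetical.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/provenance#"


def reify_derivation(statement_id, derived, source, software, version):
    """Return triples that reify 'derived derivedFrom source' and attach
    the generating software to the reified statement."""
    return [
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", derived),
        (statement_id, RDF + "predicate", EX + "derivedFrom"),
        (statement_id, RDF + "object", source),
        # Facts about the statement itself, not about the data resource.
        (statement_id, EX + "generatedBy", software),
        (statement_id, EX + "softwareVersion", version),
    ]


def term(value):
    """Format one value as an N-Triples term (blank node, URI, or literal)."""
    if value.startswith("_:"):
        return value
    if value.startswith("http://"):
        return f"<{value}>"
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize triples as N-Triples, one statement per line."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)
```

Attaching the software details to the reified statement, rather than to the derived file, keeps them tied to one specific derivation step even when a file has several sources.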
Data Format Description Language (DFDL): Work is continuing to design a standard for a language that can describe the content of arbitrary data files. The Grid Forum DFDL working group has been very active, working by email after intensive meetings at Grid Forum 9 and SuperComputing 2003. The SAM team is closely involved in crafting the standard and has contributed significant concepts that derive from the development of the BFD language and its extensions within the SAM project.
Slide 3.0 Migration: Discussions with the University of Michigan CHEF project and the Storage Resource Broker team have identified the emerging Java Content Repository (JSR 170) standard as a potential integration API. To prepare for such use, the SAM team is continuing to investigate the changes that will be necessary to migrate SAM MMS and notebook functionality to the Slide 3.0 JSR 170 reference implementation. The JSR 170 specification is expected to move quickly towards a final version over the next few months with public review of the current specification document to occur within days. Slide 3.0 will provide a higher-level server-side API and will standardize some of the functionality for messaging and configurable security that the SAM project has added to Slide.
Hardening and extending SAM 1.1: SAM 1.1 has been significantly hardened and several features have been back-ported from the SAM 2.0 code base to support the immediate needs of collaborating projects. Specifically, SAM 1.1 now provides multi-step metadata generation and data transformation mechanisms that may include BFD, web service, and/or XSLT steps, and a new ServletFilter-based security mechanism that simplifies both integration with third-party authentication services and use of SAM 1.1 on non-Tomcat application servers.
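The idea of a multi-step transformation can be sketched as an ordered chain in which each step consumes the previous step's output. The sketch below is illustrative only and does not reflect SAM's configuration format or API: run_pipeline, binary_to_xml, and extract_metadata are hypothetical names, and the two steps are trivial stand-ins for the points where a BFD translation, a web service call, or an XSLT transform would run.

```python
from functools import reduce
from typing import Callable, Iterable

Step = Callable[[bytes], bytes]


def run_pipeline(steps: Iterable[Step], data: bytes) -> bytes:
    """Feed the output of each step into the next, in order."""
    return reduce(lambda acc, step: step(acc), steps, data)


# Hypothetical stand-in for a binary-to-XML translation step.
def binary_to_xml(data: bytes) -> bytes:
    values = " ".join(str(b) for b in data)
    return f"<data>{values}</data>".encode("utf-8")


# Hypothetical stand-in for a metadata extraction step over that XML.
def extract_metadata(xml: bytes) -> bytes:
    body = xml.decode("utf-8")[len("<data>"):-len("</data>")]
    count = len(body.split())
    return f"<metadata><count>{count}</count></metadata>".encode("utf-8")
```

Under these assumptions, run_pipeline([binary_to_xml, extract_metadata], raw_bytes) first translates the raw bytes and then derives metadata from the translated form; steps can be reordered or replaced without changing the runner.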
SAM 2.0: Initial migration to the Slide 2 code base has been performed. On the client side, this work introduced support for HTTPS in the standard SAM library, based on the Apache Commons http-client package. The SAM server has also been refactored to work with Slide 2, and work is continuing to expose and exploit newly available capabilities including DAV versioning, binding (hard links), and the DASL search language. The SAM team is also exploring the capabilities of new data storage modules developed for Slide, which promise better performance and scalability. In terms of community involvement, this migration synchronizes SAM with the Slide team, and we continue to submit bug fixes to the Slide project.
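For readers unfamiliar with DASL, a search is sent as an XML body in a WebDAV SEARCH request. The sketch below builds such a body using the DAV:basicsearch grammar as it was later standardized in RFC 5323; the exact grammar supported by a given Slide version may differ. The property namespace, the property name passed in, and the scope path are hypothetical, and nothing here is taken from the SAM or Slide code.

```python
import xml.etree.ElementTree as ET

DAV = "DAV:"
EX = "http://example.org/sam#"  # hypothetical property namespace


def dav(tag):
    """Qualify a tag name with the DAV: namespace."""
    return f"{{{DAV}}}{tag}"


def build_basicsearch(scope_href, prop_name, literal):
    """Build a DAV:basicsearch body that selects one property and matches
    resources where that property equals the given literal."""
    ET.register_namespace("D", DAV)
    ET.register_namespace("ex", EX)
    prop_tag = f"{{{EX}}}{prop_name}"

    root = ET.Element(dav("searchrequest"))
    search = ET.SubElement(root, dav("basicsearch"))

    # select: which properties to return for each match
    select = ET.SubElement(search, dav("select"))
    ET.SubElement(ET.SubElement(select, dav("prop")), prop_tag)

    # from: the collection to search, at unlimited depth
    scope = ET.SubElement(ET.SubElement(search, dav("from")), dav("scope"))
    ET.SubElement(scope, dav("href")).text = scope_href
    ET.SubElement(scope, dav("depth")).text = "infinity"

    # where: property equals literal
    eq = ET.SubElement(ET.SubElement(search, dav("where")), dav("eq"))
    ET.SubElement(ET.SubElement(eq, dav("prop")), prop_tag)
    ET.SubElement(eq, dav("literal")).text = literal

    return ET.tostring(root, encoding="unicode")
```

A client would send the returned string as the body of a SEARCH request with an XML content type; the server replies with a multistatus response listing the matching resources.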
Migration of PNNL notebook servers: Significant work has been done to prepare for an upgrade of all ELN servers operated at PNNL to use a SAM 1.1 server with SSL encryption. Several issues related to importing information from older Perl-based ELN servers and to the configuration of SAM on PNNL's managed secure web servers have been solved. Upgrading all servers is expected in the next quarter.
Open Source Software Licensing: SAM source code has been distributed to external collaborators under an Apache/BSD-style license. We anticipate posting source code at www.sourceforge.net this fiscal year.
SAM team members participated in a wide range of meetings, workshops, reviews, and collaboration discussions during this quarter:
Collaboratory for Multiscale Chemical Science (CMCS): An ongoing collaboration related to the use of SAM as the primary CMCS data/metadata management system. CMCS collaborated on the design of the web service interface for metadata extraction and data transformation and is providing ongoing feedback including bug reports and performance evaluation.
Network for Earthquake Engineering Simulation (NEES) Grid: PNNL has accepted a subcontract from the NEESgrid project to integrate the ELN and SAM capabilities into the NEESgrid infrastructure. The effort leverages ongoing work in the SAM, CMCS, and CHEF projects and focuses on integration with the NEESgrid portal and metadata/data repository. This effort will result in a notebook capability for the NEESgrid project that will launch directly from the NEESgrid portal, provide Grid-based single sign-on with the portal, and store/retrieve data from the NEESgrid data/metadata repository. As a result of this subcontract, Jim Myers has resigned from the NEESgrid External Advisory Board.
Web Downloads: Registrations to download SAM and notebook software are continuing at a pace of 1-2 per day.
International Conference on Semantics for a Networked World, with a focus on Grid Databases: Jim Myers was invited to serve on the Program Committee for this conference, which will be held July 17-19, 2004, in Paris, France.
AAAS Research Competitiveness Service: Michigan Technology Tri-Corridor Fund: Jim Myers was asked to review for this program.
GGF 11: Semantic Grid Workshop: Jim Myers was invited to serve on the program committee for this workshop, which will be held June 10 in Honolulu, HI.
GridSem 2004, 1st International Workshop on the Semantic Grid: Jim Myers was invited to serve on the program committee for this workshop, which will be held August 23-24 in Valencia, Spain.