CENDI PRINCIPALS AND ALTERNATES MEETING

NATIONAL LIBRARY OF MEDICINE
Bethesda, MD

February 25, 2005

MINUTES

PUBLIC (OPEN) ACCESS POLICY AND IPR CONSIDERATIONS FOR STI MANAGERS
Overview of the Policy
Policy Considerations
Building the Archive
Associated Legal Issues
GENERAL COUNSEL ROUNDTABLE: PUBLIC/OPEN ACCESS AND OTHER INTELLECTUAL PROPERTY ISSUES IN A CHANGING STI ENVIRONMENT
STI Activities by JST and Status of STI Policy in Japan

Welcome

Dr. Walter Warnick, CENDI Chair, opened the meeting at 9:10 am. He thanked NLM for hosting the meeting. Ms. Bortnick Griffith introduced Betsy Humphreys, the newly appointed Deputy Director of the NLM.

Ms. Humphreys welcomed the CENDI members. She expressed appreciation for the CENDI/NLM partnership, acknowledging the many years of involvement on the part of Kent Smith, her predecessor. NLM has been involved with CENDI for a long time and has gained much from its membership. She hoped that the other agencies have also gained from NLM’s involvement and that this relationship will continue.

Dr. Warnick welcomed the representatives from the Japan Science and Technology Agency (JST). The JST representatives visited the Secretariat and DOE OSTI to learn how CENDI and Science.gov work. A tour of NLM had been arranged, as well as a visit to the National Institute of Standards and Technology (NIST). Ms. Carroll reflected that the visit was enlightening to members of the Secretariat because it reminded them how CENDI has succeeded in achieving voluntary cooperation among 12 federal agencies.

PUBLIC (OPEN) ACCESS POLICY AND IPR CONSIDERATIONS FOR STI MANAGERS

“Overview of the Policy”
Jane Bortnick Griffith, Assistant Director for Policy Development, National Library of Medicine

The National Institutes of Health (NIH) recently issued a new Public Access Policy. The House Appropriations language for NIH indicated Congress's interest in providing access to the results of publicly sponsored research. Congress wanted NIH to move forward and implement the policy this year.

NIH began by articulating the goals of the policy – to provide access, archive the results of its funded research, and advance science. It is critical to provide electronic access to the “best of the best” research. Institutional repositories are becoming more prevalent and this is the context in which NIH views this policy. Archiving and preservation have been NLM’s mission for more than 150 years. However, with online access and electronic content, it is possible to navigate and link throughout databases in ways that promote more advanced scientific discoveries.

The original draft, which requested those who obtained NIH funds, in whole or in part, to deposit the final manuscript within six months of publication, was issued for comment in September. Many public comments were received in letters, in e-mails, and at public meetings. Ms. Bortnick Griffith gave some examples of the public comments. Those against the policy focused on cost and economic concerns, particularly for the learned societies and other small publishers. Those who were positive about the policy emphasized the need for access to biomedical information and the fact that taxpayers had already paid for the information. NIH consulted with the Office of Information and Regulatory Affairs (OIRA) and other stakeholders during the process of developing the policy.

The final policy was announced on February 3, 2005, on the NIH web site and in the Federal Register on February 9, 2005. The final policy requests voluntary deposit of the final manuscripts of peer reviewed publications (no books, chapters, or conference papers) from research projects receiving whole or partial NIH funding, including intramural, cooperative agreements, contracts or grants, as soon as possible or within 12 months after publication. The change from 6 months to 12 months after publication provides flexibility for those publishing in publications that are quarterly or less frequent. The thought is that those requiring 12 months will be the exceptions.

An Advisory Working Group reporting to NLM's Board of Regents, chaired by Dr. Thomas Detre, will monitor the implementation. NLM will be able to demonstrate the level of participation by NIH-funded authors by comparing the number of manuscripts submitted to PMC with the number of citations in PubMed that carry NIH grant numbers. NIH is developing training programs for staff and principal investigators. A letter was sent from Dr. Zerhouni, Director of the NIH, to each principal investigator, and a workshop is being planned to reinforce what is expected of the investigators. The Public Access web page contains questions and answers, and new ones will continue to be added.

There was some discussion as to whether the public could really be users of highly technical research results. One of the advantages of open access is that this information will be linked to other resources, including consumer health information like MedlinePlus, which may help the public to understand the more complex research literature. There are no plans to "translate" research texts for the consumer.

“Policy Considerations”
Dr. David Lipman, Director, National Center for Biotechnology Information

There has been an essential shift to online access because of massive investments in information technology (IT) by industry and other institutions, both government and private. Authors can provide and access manuscripts electronically. Electronic peer review processes are in place. The generation and use of data from biomedical technology has increased rapidly, and the integration of data into publications is increasingly important. The underlying data is being used to make new discoveries. This connection to data maximizes the utility of the publications.

In addition, huge gains can be made by having this information publicly accessible as demonstrated by the genomic community. The removal of real and perceived barriers is a powerful force for the increase in the use of information. Diversity, data mining, and further evolution of the e-publishing model are likely. There was a 100-fold increase in the use of PubMed when it became freely accessible to the public. The usage of the Gene Test Database, a curated compilation indicating where clinical tests can be ordered, increased several fold when required registration was removed, even though there had never been a cost associated with it.

It is also important to empower patients and families with the most up-to-date information. There is information readily available on the Internet, but some of it is very biased. The availability of peer reviewed, scholarly information creates new opportunities in the high school and undergraduate education arenas, enhancing the diversification and democratization of science.

Just at the point when IT investment should have helped to reduce the price of journal publishing, journal prices are rising at four times the rate of the consumer price index. A Wellcome Trust study pointed to the "perverse" economics of scientific publishing. It is not an efficient system, because there is a "disconnect" between those who pay for the subscriptions and the users of the information. NIH pays the publishing industry approximately $30 million per year to cover page charges.

In addition to economic issues, the digital environment raises issues of archiving and preservation of journals. The move is away from subscriptions to licenses where the library no longer archives the material. This is a serious issue; NLM has already been contacted by electronic journals that have gone out of business without proper archiving plans.

While the paper environment had some advantages with regard to archiving, digital materials are more difficult to archive than paper. There are often a large number of errors remaining in the content when it arrives for archiving at NLM. Different converters are needed even when the publishers are supposedly using NLM’s standard Archiving DTD.

There are currently about 350,000 articles in PubMed Central and approximately 800,000 articles are expected by the end of 2005. This estimate does not include the increase based on the NIH policy.

Dr. Lipman then went on to show some of the benefits that linking across information spaces can provide to the user. A user can search the full text of an article. If the article contains an organism name, a link is provided to the taxonomy tree for that organism. The PubChem Substance database consolidates small molecule chemical information which is also integrated with chemical names in PubMed Central articles. Similar compounds can be calculated based on substructure matching. There are also links to the Protein Structure database through the Chemical database. Cn3D is used to visualize protein structures. These and other links can be used to compute across the information spaces resulting in new research leads.

“Building an Archive: Technical Infrastructure and Operational Considerations”
Dr. Jim Ostell, Chief, Information Engineering Branch, National Center for Biotechnology Information

There are many social, technical, and psychological issues when taking in new journal content from publishers or their contractors. The information is checked against the NIH Grants Database. Elements of the manuscript are selected from the contributor's local directories, and the software uploads all the files and generates a PDF. Approximately 400 different formats are handled; however, 80 to 90 percent of contributors use some version of Word. The contributor is then asked to verify the PDF. If the contributor is the principal investigator, there is an automatic sign-off; if not, a message is sent to the principal investigator. Images are converted. A contractor then tags the electronic submission using the XML DTD, and quality control is performed. Notification is sent to the submitter, asking him or her to review the submitter proof, followed by a single round of corrections for tagging problems. All submissions are handled by a curator staff of four people, who intervene only as needed.

If the document has not yet been published, the manuscript and metadata stay in this system until the NLM staff release the manuscript to PubMed Central (PMC) based on monitoring PubMed. If the publisher participates in PMC, the author's version will be replaced with the publisher's version. There is also an option to view the author's final manuscript. If content will be coming from the publisher, NLM does not tag it, because it will come in via the transform.

It costs approximately $30-34 per paper for new submissions, including the XML tagging which is done by a single contractor that creates the normalized form. The budget for PubMed Central is $2.25 million per year. The Public Access Project budget is separate since it involves other activities beyond the database.

NLM has always been interested in creating archives at different sites rather than a strictly central model. Therefore, Portable PubMed Central is being developed, which will allow other institutions to build PMC-like archives.

Portable PMC is based on the NLM Archiving and Authoring DTD Version 3, a series of modules all of which are used for archiving and some of which are used for authoring. The archival format is an XML DTD, and rendering is done “on the fly” to an HTML presentation. This approach provides extensive quality control with over 2 million users a day looking directly at the “on the fly” content.

In its first phase, Portable PMC will mirror the PMC, but, over time, it will become more independent. A Japanese group, Microsoft, and the Wellcome Trust will be installing it. Other search engines can be hooked to the Portable PMC, giving the partners flexibility. There are hooks for different kinds of searching. Portable PMC does not come with its own search engine.

“Associated Legal Issues”
Barbara McGarey, NIH Legal Advisor, Office of the General Counsel, Department of Health and Human Services

Ms. McGarey outlined legal issues that were anticipated or that were raised from comments on the web site and other venues. The biggest concern raised during the public comment period was the impact of the deposit on the copyright interests of grantees. NIH is not invoking fair use; the public access policy upholds copyright. There has always been a government purpose license based on the government funding; it is there by operation of law.

There were those who thought the government purpose license could not be used. Ms. McGarey believes that it could be invoked if a publisher sues an author because the publisher was not informed of the author's intent to deposit the paper with NIH. NIH considers the author's final manuscript to be a grants record.

The Paperwork Reduction Act was raised as an issue. However, NLM doesn’t consider the PRA to be applicable because the deposit is voluntary and no enforcement is contemplated. NLM has Office of Management and Budget (OMB) clearances and the journal articles aren’t government information.

The potentially negative impact on patent application filing was raised. NLM is making sure that authors can submit under confidentiality agreements until the manuscript can be made public.

Another comment cited the Bayh-Dole Act.  However, the Public Access Policy doesn’t go against the Bayh-Dole Act because Bayh-Dole refers to technology transfer rather than publication.  The NIH Public Access Policy deals with materials which the authors have already made publicly available through the journal publishing process.

GENERAL COUNSEL ROUNDTABLE: PUBLIC/OPEN ACCESS AND OTHER INTELLECTUAL PROPERTY ISSUES IN A CHANGING STI ENVIRONMENT
Offices of General Counsel Panel

Ms. McGarey started the discussion among the attorneys by asking for comments from their perspectives. The attorneys expressed an interest in the reactions NLM expects from publishers. NIH believes that authors are willing to deposit their materials and that publishers and authors will have a discussion when the policy is invoked. NIH believes that the authors will like the policy because it results in wider dissemination of their work.

The question about linking as an alternative to deposit was raised. However, NIH was interested in archiving and preservation in addition to access. During the comment period, two publishers responded with alternatives that provided information retrieval, but did not result in an institutional archive or give access to the full text.

The institutional archive nature of the policy is primary. The policy is tied to grant reporting requirements. The annual reports require PIs to report on any publications resulting from the grant effort. Other agencies have reporting requirements such as annual and final reports. The reports are often of variable quality from the science perspective. If other agencies were to follow the NIH policy, journal articles might be accepted in lieu of the grant reports.

Joint authorship across agencies and across the government and private sectors creates complexity. Collaboration is increasingly complicated. New communication structures such as blogs, wikis and other interactive publications may also create issues.

While options for indicating government funding on publisher transfer agreements have become more common, there is often a lack of awareness on the part of authors, and government purpose rights are not well defined.

The FAR allows discretion in how data is handled in contract language. Several of the attorneys are working on codifying A-110 in the FAR. It will be important to recognize the NIH policy in these discussions. The question is whether the grant-making community can come up with language that covers the waterfront, which then might find its way into the data clauses. The group also discussed other areas where policies may be lax, including CRADAs. In addition, there is much confusion about software: grantees and contractors consider software to be copyrightable rather than patentable.

Having institutional repositories is of great interest to many of the agencies. EPA faces the issue with regard to getting reports. What is the interplay between articles and reports? They are trying to make reports more accessible and they stopped short of establishing a requirement. NSF has started to do text analysis on documents they received. A word analysis on the proposals provides categorization that can be used to assign documents to panels. This has been very helpful and efficient. The analysis of documents also allows a history of support for various types of science over time. It can help to shape how NSF programs should respond. Collecting documents is key to understanding and defining how NSF does its business. It allows for more effective management of the research portfolio.

Ultimately, there is the need to tell the user what he or she can do with the information. Threshold agreements and markings are needed to address this lack of information. It was suggested that the Creative Commons work might be of interest in this regard.

STI ACTIVITIES BY JST AND STATUS OF STI POLICY IN JAPAN
Chikako Maeda, Department of Planning and Coordination, Office of Science and Technology Information, Japan Science and Technology Agency (JST)

The mission of JST is to promote science and technology in Japan through a broad range of activities, including conducting basic research, creating new industries through technology transfer, disseminating scientific and technical information, promoting regional research activities, and improving public understanding of science and technology.

The history of JST dates back to the Japan Information Center of Science and Technology (JICST), which was formed in 1957 with responsibility for disseminating scientific and technical information. The Research & Development Corporation of Japan, which supported basic research and technology transfer, was established in 1961. In 1996, these two entities were joined into the Japan Science and Technology Corporation (JST). In 2003, it was reorganized as a part of the reform of the Japanese government into the Japan Science and Technology Agency, which is an independent administrative institution. While there are several ministries which have science and technology interests, JST is part of the Ministry of Education, Culture, Sports, and Science & Technology.

Ms. Maeda went on to describe the basic activities of JST. These include the creation of various database services to promote dissemination of scientific and technical information. There are a number of R&D project databases which primarily support researchers. Bibliographic and full-text databases are also provided – some free and others for a fee. The S&T literature database includes bibliographic databases and the full text of e-journals; where full text is not available, linking is provided. JST operates the Japanese node of STN International. Other activities focus on the development or collection of factual databases and other types of information resources.

The largest factual databases are housed at the Institute for Bioinformatics Research & Development. An example is the human genome database. They are also developing tools for advanced genome analysis, promoting R&D based on bio-information knowledge, and offering training courses about bioinformatics. Another type of factual database is the “Failure Knowledge Database” that provides users with information about accidents and failure cases in a variety of fields, including chemical engineering and material science. It is based on reports from government organizations.

JST future plans include the development of new service channels that will provide an in-house patent database system, an analysis and visualization tool for JST bibliographic databases, and a public Chemical Registry Database with 2 million chemical compounds.

JST attempts to cover all journals inside Japan despite budget constraints. The annual budget is 100,000 million yen, or about US$1 billion. The majority of the budget comes from government subsidies, and almost half of it supports basic grant programs. Support for S&T is not well coordinated across the ministries, which is why JST was interested in CENDI.
