REPORT ON THE ASSESSMENT OF ELECTRONIC GOVERNMENT INFORMATION PRODUCTS

Prepared under contract (#RN 97007001) by
Westat
Rockville, Maryland

for the
UNITED STATES NATIONAL COMMISSION ON LIBRARIES AND INFORMATION SCIENCE

commissioned by the
UNITED STATES GOVERNMENT PRINTING OFFICE
SUPERINTENDENT OF DOCUMENTS

March 30, 1999

Mr. Michael F. DiMario
The Public Printer
The Government Printing Office
North Capitol and H Sts. NW
Washington, D.C. 20401

Dear Mr. DiMario,

It is with great pleasure that I forward herewith a copy of the Final Report prepared by Westat, Inc., the contractor selected by the Government to undertake Phase II of the three-part study called "Assessment of Electronic Government Information Products." As you requested, the U.S. National Commission on Libraries and Information Science (NCLIS) planned and implemented this research survey, pursuant to an interagency agreement between NCLIS and the Government Printing Office (GPO), approved by the Joint Committee on Printing (JCP).

This report follows on the process begun with the congressional requirement, contained in the Senate Report on H.R. 1854, the FY 1996 Legislative Branch Appropriations Act (P.L. 104-53), to identify the measures necessary for a successful transition to a more electronic Federal Depository Library Program. That requirement resulted in a study published by the Government Printing Office in June 1996. There was a consensus, however, that additional work was required (1) to identify the electronic formats and mediums used and/or planned by Federal publishing entities, and (2) to determine whether public or private sector standards do, or could, play a stronger role in reducing the unnecessary proliferation of these formats and mediums. These questions precipitated this survey.

I am extremely pleased to note that the survey enjoyed the active support and participation of all three branches of Government.
Twenty-four different Federal entities participated, including the Supreme Court, several committees of the Congress, one regulatory commission and 19 Executive Branch agencies, including most of the Cabinet Departments. In addition to this broad and diverse Federal involvement in the survey, an impressive 74 percent of the survey forms sent to the agencies were returned completed. I believe this level of interest and support is highly unusual, and could, perhaps, be construed as a reflection of agency desires to help establish a systematic baseline for measuring and monitoring the rapidly changing and evolving kinds and mix of preferred mediums, formats, and standards.

Our representatives and your staff have been in close, harmonious contact from the earliest stages of planning for the survey, right up until the final stages of review of the final report. I want to take this opportunity to thank especially both the former and present Superintendents of Documents, as well as the staffs of the present and former directors of the Library Programs Service, and the Office of Electronic Information Dissemination Service, for the superb support NCLIS and the contractor received throughout the process.

I also want to recognize the key role played by Forest Woody Horton, Jr. As consultant to NCLIS, Woody's broad knowledge of how Government works and his deep understanding of Information Resources Management helped to move the study along most effectively.

Finally, I would like to recognize the support of Vice-Chair Martha B. Gould, Commissioners C.E. ("Abe") Abramson who chairs the NCLIS Access to Government Information Committee, Joan R. Challinor, and José-Marie Griffiths, all of whom have been staunch advocates throughout. I believe you are also aware of the strong interest and support NCLIS Executive Director Robert S. Willard personally accorded this study, beginning very early with his tenure as a commissioner and extending to the present day.
The long review and analysis process of the contractor's statistical tabulations, findings, and observations has just begun. This demanding process will take some time, in part because the number of interested communities is so large, and in part because the subject matter is so technical, involving the full range of information handling formats, mediums, and standards, and quite diverse agency plans and practices. Ultimately, actions needed to be taken will most likely involve new or strengthened policies, rules, and regulations, as well as the adoption of technical standards, some of which could have legislative ramifications.

It is now the Commission's intention to begin Phase III. We will take the now completed Phase II Westat report, as well as the Phase I report completed in 1997 by the National Academy of Sciences, as points of departure. They will be reviewed and we will determine if additional fact gathering is required. We can then move forward to draw conclusions and make recommendations to the Congress and the President from the multitude of facts and expert opinions received thus far. NCLIS will continue to consult with GPO, along with various knowledgeable individuals, interagency and special advisory groups, all of whom have been assisting us throughout the Phase I and II efforts, as we prepare a plan for the Phase III initiative. My hope is that we will keep most of the broader advisory team we have utilized thus far in place until we have completed Phase III.

Finally, I want to thank you for your personal leadership, without which we could have never moved ahead with this complex, landmark task.

Sincerely yours,

Jeanne Hurley Simon
Chairperson

Acknowledgments

The study was performed by Westat under contract with the U.S. National Commission on Libraries and Information Science (NCLIS). The Government Printing Office commissioned this study as part of the transition to a more electronic FDLP.
Westat's Task Leader was Denise Glover, and the project staff included Sarah Bennett-Harper, Debbie Alexander, and Ethel Sanniez. Libby Farris served as Westat's Project Director. Forest Woody Horton, Jr., a consultant to NCLIS, served as the Project Director. Francis J. Buckley, Jr., the Superintendent of Documents, Gil Baldwin, and T.C. Evans served as the key Government Printing Office (GPO) liaison officials throughout all phases of study design, implementation, and evaluation. Wayne P. Kelley, former Superintendent of Documents, and James D. Young, former Director of GPO's Library Programs Service, were instrumental in recognizing the need for this study and in shaping its direction. Robert S. Willard, NCLIS Executive Director, and Judy Russell, NCLIS Deputy Director, provided strong overall guidance and supervision.

Westat also wishes to thank the Depository Library Council, NCLIS Chairperson Jeanne Hurley Simon, and Vice-Chair Martha B. Gould, and the NCLIS Committee on Access to Public Information, chaired by Commissioner C. E. ("Abe") Abramson, for their interest and support. In addition, Westat expresses its deep appreciation to the 24 participating Federal agencies, especially the Chief Information Officers, the agency coordinators, the product respondents, and key officials in the agency library, printing and publishing, information technology, public affairs, and other functional offices. Finally, Westat expresses appreciation to the various experts interviewed and the depository libraries visited, all of which are listed in appropriate appendices.

Table of Contents

Section  Page

Acknowledgments  i
Executive Summary  xi

1 Introduction and Background  1
  The Federal Depository Library Program  1
  How the Federal Depository Library Program Works  2
  Background of the Study  2
  Project Phases  3
  Study Goals and Objectives (Phase II)  4
  Scope and Organization of the Report  5

2 Methodology  7
  Product Selection  7
  Coordinator Briefings  8
  Questionnaire Design  9
  Distribution of the Questionnaires  9
  Followup for Nonresponse, Data Retrieval, and Inconsistency  10
  Methodology for Qualitative Data Collection  11
  Site Visits to Depository Libraries  11
  Purpose and Procedures for Agency Meetings  12
  Expert Interviews  14

3 Survey Analysis and Findings  17
  Structure of the Questionnaire  17
  Section A Responses  18
  Section B Responses  19
  Types of Data Contained in Product  19
  Types of Mediums Used  20
  Format Types Used  23
  User Interfaces  28
  Searchability of Product  30
  Product Host  31
  Retrievability of Product  32
  Section C Responses (Planned Product Profile)  33
  Types of Data  33
  Types of Mediums  34
  Types of Formats  36
  Section D Responses (Other Information)  37
  Metadata  37
  Permanent Public Access  38
  Permanent Retention  39
  Ensuring Authenticity  40
  Updating/Refreshing Plans  41
  Changing Supporting Technology  42
  User Fees  43
  Licensing  43
  Public Domain  44
  Section E Responses  45
  Study Questions  46
  Preferred Medium and Format Standards  46
  Public Access to Products  50
  Other Issues: Authenticity and Metadata  53

4 Qualitative Findings  55
  Site Visits to Federal Depository Libraries  55
  Highlights of Site Visits to Three Depository Libraries  55
  User Needs and Concerns  55
  Librarians' Concerns: User Fees, Hardware, Training, and Costs  56
  Agency Meetings  56
  Agency Meeting Highlights  57
  Preferred Mediums and Formats  57
  Assessing User Needs  58
  Information Life Cycle Management, Permanent Public Access, and Permanent Retention  58
  Cost-Effectiveness of Various Mediums and Formats  58
  Expert Interviews  58
  Interviews With Webmasters  59
  Preferred Formats  59
  User Needs  59
  Interviews With Preservation Specialists  59
  Goals of Preservation  59
  Barriers to Preservation of Digital Materials  60
  Current Preservation Models and Initiatives  60
  CLIR Initiatives  60
  Interviews With Information Resources Management Specialists  61
  Barriers to Successful Implementation of Information Resources Management Initiatives  61

5 Discussion of Quantitative and Qualitative Findings  63
  Preferred Mediums and Format Standards  63
  Evaluating Websites  65
  Cost-Effectiveness of Formats and Mediums  65
  Depository Library Needs  66
  Public Access  67
  Permanent Public Access to and Permanent Retention of Electronic Government Information  67
  Perspectives on Permanent Public Access and Information Life Cycle Management from Information Resources Management Experts  68
  Current Initiatives on Permanent Public Access and Permanent Retention  69
  Next Steps  71

Bibliography  73

List of Appendixes

Appendix
A  Agency Study Coordinator Meetings Agenda  A-1
B  List of Agency Coordinators and Other Key Officials  B-1
C  List of Participating Agencies and Products Surveyed  C-1
D  Coordinator and Respondent Cover Letters  D-1
E  Questionnaire and Glossary of Terms  E-1
F  Site Visits to Three Federal Depository Libraries and Interview Questions  F-1
G  Electronic Government Information Products Assessment Agency Meetings Held and Discussion Questions  G-1
H  Assessment of Electronic Government Information Products List of Expert Interviews and Interview Questions  H-1
I  Sample Agency Meeting Agenda Electronic Government Information Products Assessment  I-1
J  Task 16-Assessment of Electronic Government Information Products-Statement of Work  J-1

List of Tables

Table  Page
1  Number of surveys returned by each agency surveyed  18
2a  Number and percent of types of data, by the type of data contained  20
2b  Number and percent of types of data, by the primary type of data  20
3a  Number and percent of mediums publicly available, by the type of medium used and primary medium used  22
3b  Number and percent of mediums publicly available, by the standard for each medium used  23
4a  Frequency and percent of formats used, by the type of format used and primary type of format used  25
4b  Number and percent of formats used, by the standard for each format used  26
5  Number and percent of products reported as being in an online medium  28
6a  Number and percent of online approaches used, by type of online tool used  29
6b  Number and percent of online approaches used, by the standard for each online tool used  30
7  Number and percent of responses regarding searchability of the product  31
8  Number and percent of responses regarding where the product is hosted  31
9  Number and percent of responses concerning the retrievability status of the product  32
10  Number and percent of respondents reporting plans to discontinue publication of the product  33
11  Number and percent of responses regarding the planned changes to the type of data contained in the product  34
12  Number and percent of responses regarding the timeframe for planned changes to the type of data contained in the product  34
13  Number and percent of responses regarding the planned changes to the mediums used for the future  35
14  Number and percent of responses regarding the timeframe for planned changes to product medium used  35
15  Number and percent of responses regarding the planned changes to the formats the product will contain  36
16  Number and percent of responses regarding the timeframe for planned changes to the product format used  36
17  Number and percent of respondents reporting a metadata record for the product  37
18  Number and percent of responses regarding the entity providing permanent access to the product  39
19  Number and percent of responses regarding products for which access will be provided in the future  39
20  Number and percent of responses regarding permanent retention of the product  40
21  Number and percent of respondents who reported the agency ensures authenticity for the product  41
22  Number and percent of responses regarding how frequently the product is updated or refreshed  42
23  Number and percent of responses regarding the plans for supporting technology of the product  42
24  Number and percent of respondents reporting that user fees are charged for the product  43
25  Number and percent of respondents reporting about the use of licensed commercial search and retrieval software for the product  44
26  Number and percent of responses regarding coverage by the agency software license  44
27  Number and percent of respondents reporting the public domain status of the product  45
28  Number and percent crosstabulations of products in both paper and CD-ROM formats  47
29  Number and percent crosstabulations of products in both CD-ROM and web formats  47
30  Number and percent crosstabulations of products in both paper and web formats  48
31  Number and percent crosstabulations of products in both HTML and PDF formats  48
32  Number and percent crosstabulations of products in both HTML and GIF formats  49
33  Number and percent of products that use HTML with GIF and ASCII formats  49
34  Number and percent of products that use HTML with PDF and ASCII formats  49
35  Number and percent crosstabulations of products that are permanently public accessible and scheduled for retention with the National Archives and Records Administration (NARA)  50
36  Number and percent crosstabulations for products with licensed commercial search and retrieval software and user fees charged for the product  51
37  Number and percent crosstabulations of those products with licensed commercial search and retrieval software and the product is scheduled for permanent retention by the National Archives and Records Administration (NARA)  52
38  Number and percent crosstabulations of those products for which agencies ensure authenticity and permanent public access  53
39  Number and percent crosstabulations of those products for which agencies ensure authenticity and another agency provides permanent public access  53
40  Number and percent crosstabulations of products that are hosted by the agency and have a metadata record  54
41  Number and percent crosstabulations of products that are hosted by another agency and have a metadata record  54

Executive Summary

The Federal Depository Library Program (FDLP) has served and continues to serve the American public by ensuring localized access to Federal Government information. The mission continues to be as important today to the fundamental success of our democracy as it was when the FDLP was created. The FDLP's original mandate, to assist Americans regardless of economic, education, or geographic considerations, is one that must not be lost as we strategically and thoughtfully use the tools of the electronic age to enhance that mandate.

Letter to Michael F. DiMario, the Public Printer, from Senators John Warner and Wendell Ford of the Senate Committee on Rules and Administration, May 24, 1996.

Background

Congress established the antecedents to the Federal Depository Library Program (FDLP) in the Act of 1813 to ensure that the American public has access to its Government's information. The mission of the FDLP, part of the Superintendent of Documents (SuDocs) in the Government Printing Office (GPO), is to assure current and permanent public access to the universe of information published by the U.S. Government. Depository libraries safeguard the public's right to know by collecting, organizing, maintaining, preserving, and assisting users with information from the Federal Government. GPO provides that information at no cost to designated depository libraries throughout the country.
These depository libraries, in turn, provide local, no-fee access to Government information in all formats in an impartial environment with professional assistance. Any member of the public can visit these depository libraries and use the Federal depository collections.

In order to administer the FDLP, as required by the enabling legislation for the program, 44 U.S.C. Chapter 19, the SuDocs is responsible for the acquisition, classification, format conversion, dissemination, and bibliographic control of tangible and electronic Government information products; the inspection of depository libraries; and the continuing education and training initiatives that strengthen the ability of depository library personnel to serve the public. An emerging new responsibility is to ensure that electronic Government information products disseminated through the FDLP, or incorporated in the FDLP Electronic Collection, remain permanently accessible to the public.

Under 44 U.S.C., Sections 1901-1903, and Office of Management and Budget (OMB) Circular A-130, Management of Federal Information Resources, Federal agencies should make all their publications in all formats available to SuDocs for distribution to depository libraries.

This study to assess electronic medium and format standards for the creation and dissemination of electronic information products is an essential step toward ensuring a successful and cost-effective transition to a more electronic FDLP. The three goals of this assessment were to:

> Identify medium and format standards that are the most appropriate for permanent public access;
> Assess the cost-effectiveness and usefulness of various alternative medium and format standards; and
> Identify public and private medium and format standards that are, or could be, used for products throughout their entire information life cycle, not just at the dissemination or permanent public access stage.
The Superintendent of Documents will use the results of this work effort to continue to plan and implement the transition to a more electronic FDLP. The five major specific objectives are:

> First, with respect to electronic publishing practices and plans for Federal agencies (including ways in which the FDLP can best accommodate them), the objective is to provide an analysis of current practices as well as future plans for creating, disseminating, and providing permanent public accessibility to electronic information products, and to identify the standards for software and electronic mediums and formats that are used throughout the product's information life cycle, from creation to archiving, but especially at the stage of dissemination for permanent public access.
> Second, with respect to cost-effectiveness of various dissemination mediums and formats that are, or could be, utilized, the objective is to gather information on standards (whether mandated or consensual) that will assist the FDLP in making near-term decisions regarding the cost-effectiveness of alternative mediums and formats for all FDLP participants. This information should also assist participants in long-term planning for permanent public accessibility, and the collection and analysis of overall information life cycle costs.
> Third, with respect to the practical utility of various electronic mediums and formats to depository libraries and the public, the objective is to identify preferred standards used in various mediums and formats that depository libraries will need to support.
> Fourth, with respect to utilizing standards employed in mediums and formats that can be used throughout all stages of the information life cycle (including creation, composition, computer terminal display, encryption, secure digital signature with non-repudiation, and secure transmission capabilities), for electronic dissemination, but especially permanent public accessibility, the objective is to assess standards for basic security services in order to provide for secure and reliable transmission and document interchange.
> Fifth, with respect to standards that are being developed and used in the private sector, the objective is to identify existing and planned standards for the purpose of determining what the FDLP must do to accommodate their adoption in terms of hardware/software requirements, staff and user education and training, and budgetary impacts.

Methodology

The study utilized both quantitative and qualitative data collection activities: a survey of a cross-section of 314 Government information products from 24 agencies and interviews with experts. The response rate for the survey was 74 percent.

This cross-section of products was not a randomly selected sample due to cost and time constraints. Instead, NCLIS and GPO, assisted by various groups, including the library associations represented by the Inter-Association Working Group on Government Information Policy (IAWG), the Federal Library and Information Center Committee (FLICC), the Depository Library Council (DLC), and the Interagency Council on Printing and Publication Services (ICPPS), developed and refined the criteria for product selection. NCLIS, GPO, and the other organizations asked knowledgeable members of these groups to identify products that met one or more of six criteria. NCLIS distributed the list of preliminary products to agency Chief Information Officers (CIOs) who were asked to validate and coordinate the final selections with their appropriate agency personnel.
In addition, NCLIS asked CIOs to select an agency coordinator. The coordinator's role was to oversee the distribution of product questionnaires to the appropriate respondents and to encourage respondents to complete the questionnaire and return it to Westat.

Product selection was based on six criteria:

> Increased emphasis on electronic dissemination, rather than continuation of paper and microform dissemination;
> Replacement of older electronic mediums and formats with state-of-the-art technologies;
> Adoption of mandated (Government or private sector) and consensual (common agency practice) medium and format standards;
> Adoption and use of preferred mediums or formats that have widespread support from agency, depository library, and user communities;
> Exemplified cost-effective mediums and standards, especially those that can be used throughout the entire information life cycle, rather than the use of expensive customized or shelf packages; and
> Exemplified awareness of the important impact of medium and format decisions on permanent accessibility, authentication, and/or security encryption protection.

The survey requested information on four main topics:

> General information about the product and agency that produced it.
> The product's current profile, including the kinds of data the product contains, mediums in which it is produced, formats and online approaches used (if applicable), and searchability and retrievability of the product.
> Future plans for the product, including changes in its data, mediums, and formats.
> Other issues, including metadata, permanent public access, permanent retention, authenticity, updating/upgrading plans, user fees, licensing, and public domain.

The qualitative data collection included site visits to three depository libraries, meetings with representatives of five Government agencies, and telephone interviews with six experts.
Westat conducted site visits to three Federal depository libraries:

> McKeldin Library, University of Maryland College Park, College Park, Maryland
> Washington College of Law Library, American University, Washington, D.C.
> Montgomery County Rockville Regional Public Library, Rockville, Maryland

The purpose of the visits was to discuss the effects of the transition to a more electronic Federal Depository Library Program on the end user and on the services and resources of each library.

Meetings with agency representatives had a twofold purpose:

> To collect qualitative data about electronic Government information products, such as cost-effectiveness of standards, use of locator tools, results of user surveys, etc., that were not covered in the survey; and
> To discuss the procedures for distribution of the questionnaire.

In addition to inviting agency coordinators and respondents, the statement of work specified that Westat invite representatives of the following offices to attend the meetings:

> Public affairs or communications offices
> Agency printing and publishing units
> Information technology or electronic information systems offices
> Agency libraries, and
> Relevant program offices.

The following six agencies agreed to schedule a meeting: Department of Health and Human Services, Department of Education, U.S. Supreme Court, Department of Commerce, Environmental Protection Agency, and the National Archives and Records Administration. Only four of the six agencies chose to discuss the qualitative questions at the meeting. The other two agencies discussed the questionnaire only and agreed to respond to the discussion questions in writing, although only one actually submitted its written responses.

Finally, Westat held four telephone interviews with six content experts.
The experts included two webmasters (Linda Wallace from the Internal Revenue Service, and Jerry Malitz from the National Center for Education Statistics); two preservation specialists (Evelyn Frangakis from the National Agricultural Library, and Abby Smith from the Council on Library and Information Resources); and two professors in information resources management (John Bertot and Charles McClure). The purpose of expert interviews was to:

> Solicit opinions of experts on topics not adequately covered on the survey or in the agency meetings,
> Ask questions to provide a broader context in which to view the issues, and
> Explore current initiatives and future directions.

Key Findings

These findings reflect the major results of the survey and qualitative data collection:

Policy and Planning Issues

1. There is an overall lack of Government information policy guiding electronic publishing, dissemination, permanent public access, or information life cycle management, especially as information policy relates to agency missions. Also, there is a lack of overall coordination of these initiatives at the Governmental, branch, or even agency level (pp. 68-69).
2. Responsibility for electronic publishing within agencies is decentralized, diffuse, and unclear. Some agencies either could not identify or had difficulty identifying the proper respondent within their own agency, or even the person who was responsible for the product (pp. 11 and 14).
3. Some Government agencies are monitoring the information needs of their users to enhance current access to electronic Government information products (p. 65).
4. There is a lack of specific planning for product development and technological migration (pp. 34-36; table 23 on p. 42).
5. There is a lack of planning for or consideration of web design approaches that comply with the Americans with Disabilities Act (ADA) (table 6a, p. 29).

Permanent Public Access

6. The concept of permanent public access (PPA) is not well understood. Respondents also had difficulty distinguishing between PPA for electronic products and archiving electronic Federal records with the National Archives and Records Administration (tables 18-20, pp. 39-40).
7. Metadata and their importance to public access are not well understood, particularly as they may affect PPA. Only 27 percent of respondents reported having a metadata record for the products surveyed (table 17, p. 37).
8. For some products, PPA results from the agencies' use of a host disseminator, such as GPO Access (p. 11).

Authenticity

9. There is a lack of understanding of what ensuring authenticity entails, and a lack of planning for or consideration of ensuring authenticity of electronic Government information products (table 21, p. 41).

Product Characteristics

10. Fifteen percent of the products surveyed are not in the public domain, for all or part of the product (table 27, p. 45). In addition, user fees are charged for 30 percent of the products (table 24, p. 43).
11. The most prevalent types of mediums are the web, paper, CD-ROM, and bulletin board systems (table 3a, p. 22); the most prevalent formats are HTML, PDF, GIF, JPEG, TIFF, and ASCII (table 4a, p. 25).
12. The most prevalent types of data contained in the products surveyed are textual, numerical, bibliographic, and graphical (tables 2a and 2b, p. 20).

Standards

13. There is a lack of standardization for producing Government information products on CD-ROM (e.g., installation instructions, user documentation) (p. 55).
14. The most prevalent medium and format standards identified in the survey are common agency practice rather than agency-mandated (tables 3b, 4b, 6b, pp. 23, 26, and 30).
15. Some Government agencies have established guidelines or best practices for presenting and organizing Government information products on the web, although full compliance with the guidelines is a goal that has not yet been achieved (p. 64).
16. Some Government agencies are exploring a range of innovative formats and web design approaches for electronic Government information products (p. 57).

Next Steps

As a followup effort, NCLIS indicated that it will use these findings as a point of departure and analyze them in greater depth. It is expected that this followup effort will result in broad conclusions and recommendations to the President and Congress about how the problems and challenges revealed in this study can be constructively addressed to improve current and future public access to electronic Government information.

1 Introduction and Background

Since 1813, the American public has benefited from the ability to gain free access to Federal Government information. This unique American right to no-fee access to Government information is made possible through the Federal Depository Library Program (FDLP) of the Superintendent of Documents (SuDocs) in the Government Printing Office (GPO). The FDLP has significantly contributed to creating an informed, educated, and culturally enriched U.S. citizenry. This introduction provides a brief overview of the FDLP and background information on the purpose and objectives of this study to assess electronic Government information products.

The Federal Depository Library Program

The mission of the Federal Depository Library Program is to assure current and permanent public access to the universe of information published by the U.S. Government. The FDLP was established by Congress to ensure that the American public has access to its Government's information. Depository libraries safeguard the public's right to know by collecting, organizing, maintaining, preserving, and assisting users with information from the Federal Government. The Government Printing Office provides Government information at no cost to designated depository libraries throughout the country.
These depository libraries, at their own expense, provide local, no-fee access to Government information in all formats in an impartial environment with professional assistance. Any member of the public can visit these depository libraries and use the Federal depository collections. Products distributed by GPO for depository library collections include all electronic Government information products that are of public interest or educational value. By law, the FDLP excludes those products that are solely for administrative or operational purposes, classified for reasons of national security, or the use of which is constrained by privacy considerations (GPO, 1998, p. 4). In order to administer the FDLP, as required by the enabling legislation for the program, 44 U.S.C. Chapters 17, 19, and 41, the SuDocs is responsible for the acquisition, classification, format conversion, dissemination, and bibliographic control of tangible and electronic Government information products; the inspection of depository libraries, and the continuing education and training initiatives that strengthen the ability of depository library personnel to serve the public. An emerging new responsibility is to ensure that electronic Government information products disseminated through the FDLP, or incorporated in the FDLP Electronic Collection, remain permanently accessible to the public. Under 44 U.S.C., Sections 1901-1903, and Office of Management and Budget (OMB) Circular A-130, Management of Federal Information Resources, Federal agencies should make all their publications in all produced formats available to SuDocs for distribution to depository libraries. How the Federal Depository Library Program Works GPO provides Government information at no cost to designated depository libraries throughout the country. These depository libraries, at their own expense, provide local, no-fee access with professional assistance to this information in all formats.
Access to Federal Government information is available through more than 1,350 depository libraries located throughout the United States and its territories. Fifty-three of the depositories are regionals, and the remainder are selective depositories. The regional libraries receive and maintain everything that is distributed through the program, unless it is superseded. The selective libraries pre-select the types of publications they wish to receive based on the specific needs and interests of the communities they serve. Of the libraries in the FDLP, approximately 50 percent are academic, 20 percent are public, 11 percent are law, 5 percent are community college, 4 percent are Federal agency, and 10 percent are special, state, court, and Federal court libraries. Before the evolution of electronic publishing media, especially the Internet, Federal Government agencies published information almost exclusively in a centralized print environment that facilitated easy distribution to the Federal depository libraries. Now, Federal Government agencies are doing their own electronic publishing and creating and managing their own websites to disseminate a variety of Government information products. This study resulted from Congress's concerns about the short- and long-term effects of electronic publishing on the ability of all U.S. citizens to continue to gain affordable and easy access to Government information. Background of the Study This study to assess electronic Government information products was authorized by the Joint Committee on Printing and was sponsored by the Superintendent of Documents, U.S. Government Printing Office. The initial need for this project was identified in GPO's cooperative 1996 Study to Identify Measures Necessary for a Successful Transition to a More Electronic Federal Depository Library Program. This study (see www.access.gpo.gov/su_docs/dpos/fdlppubs.html#4) was conducted at the direction of Congress.
In order to conduct the study, the Public Printer established a working group consisting of representatives from the following program stakeholders and constituents: > GPO, > Appropriate congressional committees, > Congressional Research Service at the Library of Congress, > Office of Management and Budget, > National Archives and Records Administration, > Federal Publishers Committee, > Interagency Council on Printing and Publication Services (ICPPS), > Administrative Office of the U.S. Courts, and > Depository library community. One of the committee's major recommendations was to assess electronic medium and format standards for the creation and dissemination of electronic information products. The committee considered this assessment an essential step toward ensuring a successful and cost-effective transition to a more electronic FDLP. Project Phases This project is being undertaken in three phases. The first phase of the project consisted of a review by the National Academy of Sciences' Computer Science and Telecommunications Board (CSTB) in which CSTB developed a detailed statement of work that defined the data collection process required to conduct the assessment (see http://www.nclis.gov/info/gpo1.html). This report is a product of Phase II of the project. GPO commissioned the National Commission on Libraries and Information Science (NCLIS) to undertake a survey and assessment of electronic Government information products. NCLIS awarded the contract to Westat, a survey research company, to undertake research and data collection from Federal agencies in all three Branches, as well as solicit the opinions of selected knowledgeable experts. The contract further called for Westat to complete an analysis of the data and expert opinions for the purpose of interpreting their general meaning and significance, including identifying broad emerging trends and patterns, and documenting findings.
In Phase III, NCLIS will identify an appropriate organization to review Phase I and Phase II findings, as well as to review the data and develop conclusions and recommendations for GPO, the Congress, and the President. Study Goals and Objectives (Phase II) Information gathered from this assessment will be used by the Superintendent of Documents to facilitate improved public access to Federal Government information made available to Federal depository libraries and the general public through the FDLP. More specifically, for this cross-section of Government information products, the Phase II goals were to: > Identify medium (see glossary in Appendix E for the difference between the medium and media) and format standards that are the most appropriate for permanent public access, > Assess the cost-effectiveness and usefulness of various alternative medium and format standards, and > Identify public and private medium and format standards that are, or could be, used for products throughout their entire information life cycle, not just at the dissemination or permanent public access stage. The Superintendent of Documents will use the results of this work effort to continue to plan and implement the transition to a more electronic FDLP. The five major specific objectives are: > First, with respect to electronic publishing practices and plans of Federal agencies (including ways in which the FDLP can best accommodate them), the objective is to provide an analysis of current practices as well as future plans for creating, disseminating, and providing permanent public accessibility to electronic information products, and to identify the standards for software and electronic mediums and formats that are used throughout the product's information life cycle, from creation to archiving, but especially at the stage of dissemination for permanent public access.
> Second, with respect to cost-effectiveness of various dissemination mediums and formats that are, or could be utilized, the objective is to gather information on standards (whether mandated or consensual) that will assist the FDLP in making near-term decisions regarding the cost effectiveness of alternative mediums and formats for all FDLP participants. This information should also assist participants in long term planning for permanent public accessibility, and the collection and analysis of overall information life cycle costs. > Third, with respect to the practical utility of various electronic mediums and formats to depository libraries and the public, the objective is to identify preferred standards used in various mediums and formats that depository libraries will need to support. > Fourth, with respect to utilizing standards employed in mediums and formats that can be used throughout all stages of the information life cycle (including creation, composition, computer terminal display, encryption, secure digital signature with non-repudiation and secure transmission capabilities), but especially for permanent public accessibility, the objective is to assess standards for basic security services in order to provide for secure and reliable transmission and document interchange. > Fifth, with respect to standards that are being developed and used in the private sector, the objective is to identify existing and planned standards for the purpose of determining what the FDLP must do to accommodate their adoption in terms of hardware/software requirements, staff and user education and training, and budgetary impacts. Scope and Organization of the Report The primary data collection activities included a survey and interviews. 
Westat, per the requirements established by NCLIS in consultation with GPO, surveyed a cross-section of electronic information products from Federal agencies in all three branches of Government and solicited the opinions of selected knowledgeable experts. This cross-section of products was not a randomly selected sample due to cost and time constraints. Therefore, readers are cautioned about generalizing the findings to all electronic Government information products. Westat surveyed electronic Government information products to determine the mediums and formats in which products are currently produced and the standards, if any, that are being used. The survey also asked respondents questions about the agency's future plans for adding or changing products, including the mediums and formats in which they will be disseminated for permanent public access. This report is limited to presenting and discussing the survey findings and findings from qualitative site visits, agency meetings, and expert interviews. Phase III of the project will focus on drawing conclusions and recommendations based on work conducted during Phases I and II. The report is organized in five parts: introduction and background, methodology, survey analysis and findings, qualitative findings, and discussion of quantitative and qualitative findings. Please note that Appendix E contains a glossary of terms and acronyms used on the questionnaire and throughout this report. 2 Methodology This second part of the report discusses the following topics: > The process of selecting a cross-section of electronic Government information products, > Agency coordinator briefings, > Questionnaire design and development, > Nonresponse and data retrieval followup, and > The methodology for the qualitative data collection activities, i.e., site visits, agency meetings, and expert interviews. 
Product Selection NCLIS and GPO, assisted by various groups, including the library associations represented by the Inter-Association Working Group on Government Information Policy (IAWG), the Federal Library and Information Center Committee (FLICC), the Depository Library Council (DLC), and the Interagency Council on Printing and Publication Services (ICPPS), developed and refined a set of criteria for product selection. NCLIS, GPO, and the other representatives asked knowledgeable members of these groups to identify products that met one or more of the following six guidelines: > Increased emphasis on electronic dissemination rather than continuation of paper and microform dissemination; > Replacement of older electronic mediums and formats with state-of-the-art technologies; > Adoption of mandated (Government or private sector) and consensual (common agency practice) medium and format standards; > Adoption and use of preferred mediums or formats that have widespread support from agency, depository library, and user communities; > Exemplified cost-effective mediums and standards, especially those that can be used throughout the entire information life cycle, rather than the use of expensive customized or shelf packages; and > Exemplified awareness of the important impact of medium and format decisions on permanent accessibility, authentication, and/or security encryption protection. The products were not randomly selected; therefore, readers are cautioned about generalizing the findings to all electronic Government information products. In April 1998, NCLIS distributed the preliminary list of products to agency Chief Information Officers (CIOs), who were asked to validate and coordinate the final selections with appropriate agency personnel. In addition, NCLIS asked CIOs to select an agency coordinator.
The coordinator's role was to oversee the distribution of product questionnaires to the appropriate respondents and to encourage respondents to complete the questionnaire and return it to Westat. (See Appendix B for a list of coordinators who participated in this study.) The final product list included 328 products from 24 agencies (Appendix C). Over the course of the data collection, the number of products decreased from 328 to 314 for the following reasons: > Several products were discontinued and no longer exist. > Several products were in paper only and agencies had no plans to migrate them to an electronic medium; therefore, they fell outside the scope of this study. > Agency coordinators could not identify respondents for some products, so there was no one to complete the questionnaire. > Several questionnaires were undeliverable due to unknown or incorrect respondent addresses; no alternate respondent could be located in a few cases. Coordinator Briefings NCLIS and GPO planned and conducted two coordinator briefings in June and July 1998, and asked Westat to attend them (see Appendix A for agenda). The purpose of these briefings was to: > Provide an overview of the study including background, purpose, goals, and schedule, > Discuss their specific tasks, > Review the draft questionnaire with coordinators and solicit their input on changes, > Collect their final list of products, and > Thank them for their participation and cooperation. Coordinators were asked to: > Assist Westat in pretesting the survey instrument, > Identify and brief appropriate internal participating offices, > Identify product respondents for survey followup, > Schedule and participate in voluntary agency meetings with Westat, > Distribute questionnaires to agency respondents, > Ensure timely completion and submission of survey instruments, and > Cooperate with Westat on followup. 
Only a few coordinators brought their final selections to the agency meetings; most agencies needed much more time to review and finalize their product selections. The questionnaire review also served as an informal pretest of the questionnaire. Questionnaire Design NCLIS, with consultation from GPO, developed the initial five-page list of questions. This list of questions was included as an appendix to the statement of work. Westat worked with GPO and NCLIS from June through July to expand and refine the list of questions to a 13-page instrument with appropriate instructions, examples, skip patterns, open-ended questions, please-specify questions, etc. Westat pretested the questionnaire informally at the two coordinator briefings. The coordinators helped Westat to clarify some questions, expand the format choices, and add a few more questions. Westat conducted a more formal pretest with personnel from six Government agencies. These pretests led to the following substantive changes in the questionnaire: > Clarification of instructions and wording of several questions, > Addition of more format options, > Addition of the definition of "product" at the beginning of the questionnaire, and > Clarification of definitions included in the glossary. Westat, with final approval by NCLIS and GPO, finalized the questionnaire by mid-August 1998. (See Appendix D for the cover letters and Appendix E for the final questionnaire.) Distribution of the Questionnaires During the last week of September and the first week of October, Westat distributed the questionnaires to 23 agencies through the agency coordinators. On October 9, 1998, NCLIS requested that Westat add more products to the survey by including a 24th agency, the Securities and Exchange Commission (SEC). Westat created a database of products and corresponding coordinators or respondents and their addresses, phone and fax numbers, and e-mail addresses, and prepared and mailed packets to the agency coordinators.
The agency coordinators were responsible for ensuring that each packet was sent to the appropriate product respondent in a timely fashion. These packets included the following materials for each product that was to be surveyed: > Cover letter to coordinator, > Cover letter to respondent, > Questionnaire, > Glossary of terms used in the questionnaire, and > Postage-paid return envelope. A few agency coordinators requested that Westat send questionnaires directly to their product respondents and a copy of the respondents' packets to the coordinators themselves. Westat sent questionnaire materials directly to the respondents at the Department of Commerce, the Department of the Interior, the Executive Office of the President, and the U.S. Congress. These respondent packets included the following materials: > Cover letter to respondent, > Questionnaire for each product he/she was assigned to survey, > Glossary of terms used in the questionnaire, and > Postage-paid return envelope. In addition, the Department of Health and Human Services asked Westat to send an e-mail message to the individual product respondents notifying them that they could download the final version of the questionnaire and cover letters from the PDF file located on the NCLIS website at http://www.nclis.gov/news/nclisqux.pdf in order to complete the questionnaire. Followup for Nonresponse, Data Retrieval, and Inconsistency Westat made the first calls for nonresponse to agency coordinators. These calls began in early November and continued through mid-December. In addition, NCLIS sent periodic coordinator bulletins to keep coordinators updated on the progress of the study and to encourage respondents, through the coordinators, to complete questionnaires and return them to Westat. Westat conducted a second round of nonresponse followup calls to respondents from mid-December through the end of January 1999.
From mid-November through the first week in January 1999, Westat made calls directly to respondents for data retrieval (i.e., missing data) and inconsistencies (i.e., a respondent checked "yes" to one question, but the next question was answered in a way that suggested a "no" answer to the first question). Approximately 40 percent of the questionnaires required some type of data retrieval followup for one or more questions. Some questions, such as 16, 18-19, and 21a, concerning metadata, permanent retention, authenticity, and the product's supporting technology, presented particular problems. Westat added a "don't know" category to these questions as a result of the nonresponse data retrieval. In addition, most respondents skipped questions 13d, 14d, and 15d about long-term plans for changing the product. Data retrieval phone calls and discussions with agency coordinators suggest respondents skipped these questions because agencies had not yet developed long-term plans. The calls to respondents for data retrieval and data inconsistency revealed the following reasons for nonresponse: > Did not know the answer. > Could not identify anyone who knew the answer. > Did not understand the question or the concept; using glossary did not help. > Did not have time to research the answer; had other work priorities. In a few instances, it was clear that the agency was not in a good position to respond to the questionnaire, in part because they rely on another agency, vendor, or contractor to provide electronic access to their products. Sometimes these "host disseminators," such as GPO, assisted in preparing the responses sent in by the publishing entity. Observations about the data collection process. Agency coordinators had difficulty locating a single point of contact from each agency sub-unit who was knowledgeable about the range and type of electronic information products created for the agency. 
Furthermore, due to the nature of the survey questions, product respondents had to coordinate responses to some questions with personnel who often did not work in their program areas. This process required respondents to identify personnel with whom they appeared to have little prior contact, such as records managers, information technology staff, and staff in planning offices, in order to respond to these questions. In some cases, this extra step discouraged respondents from seeking answers to these questions, so questions were left unanswered. Also, agencies whose coordinators could not attend the coordinator briefings and agencies that did not participate in the agency meetings had more problems with data consistency than did other agencies. Methodology for Qualitative Data Collection Site Visits to Depository Libraries The qualitative data collection included site visits, agency meetings, and expert interviews. Westat conducted site visits to Federal depository libraries from July 30 through September 9, 1998. The statement of work (Appendix J) specified that Westat visit three libraries: one regional academic, one law, and one public. Furthermore, GPO suggested that Westat visit the following specific libraries in the Washington, D.C., metropolitan area: > McKeldin Library, University of Maryland College Park, College Park, Maryland > Washington College of Law Library, American University, Washington, D.C. > Montgomery County Rockville Regional Public Library, Rockville, Maryland The purpose of visits was to discuss the effects of the transition to a more electronic Federal Depository Library Program on the end user and on the services and resources of each library. The interview questions, which were based on readings and discussions with GPO and NCLIS, covered three broad areas: > What key issues or concerns do you have about users accessing and using electronic Government information products? 
> What are your concerns about providing access to electronic Government information products? > What specific ideas do you have for improving public access to online and electronic Government information products in your library? The site visits were audiotaped. In addition, the libraries gave Westat representatives a tour of the facilities. (See Appendix F for a list of the specific interview questions, the names of all interviewees, and detailed site visit notes.) Site visit observations. Only a small number of libraries were visited. In addition, the problems and concerns of librarians in the D.C. metropolitan area may not be representative of those experienced by librarians at most depository libraries, especially the selective depositories. Some smaller selective depository libraries that are located in more remote areas as well as some of the larger urban selective depositories might have fewer resources (e.g., fewer computers and trained librarians, training funds, and options for low-cost Internet providers). Purpose and Procedures for Agency Meetings Meetings with agency representatives were held between September 15 and September 24, 1998. The purpose of the meetings was twofold: > To collect qualitative data about electronic Government information products that were not covered in the survey, such as cost-effectiveness of standards, use of locator tools, results of user surveys, etc.; and > To discuss the questionnaire and data collection procedures for distribution of the questionnaire. In addition to inviting agency coordinators and respondents, the statement of work specified that Westat invite representatives of the following offices to attend the meetings: > Public affairs or communications offices, > Agency printing and publishing units, > Information technology or electronic information systems offices, > Agency libraries, and > Relevant program offices.
Westat wrote the procedures for scheduling agency meetings and arranging for logistics, which included developing meeting protocols, agenda, cover letter, and script for interviewers to schedule meetings. We then contacted coordinators and sent them the following materials: > Cover letter explaining purpose of meeting and their tasks, > Meeting agenda and discussion questions, > Press release from NCLIS with background information on the project, > Roster of potential agency representatives who will attend the meeting (to be completed by the coordinator), and > Respondent product roster (to be completed by the coordinator). Agency meetings held. Westat contacted 16 of the 24 agencies to hold meetings. Of the 7 agencies that were not contacted, 3 had fewer than 10 products. NCLIS instructed Westat not to hold meetings with the U.S. Congress and the Executive Office of the President because NCLIS and GPO worked with them directly. Ten of the 16 agencies did not respond to Westat's request to schedule a meeting. The following six agencies agreed to schedule a meeting: > Department of Health and Human Services > Department of Education > U.S. Supreme Court > Department of Commerce > Environmental Protection Agency > National Archives and Records Administration Only four of the above six agencies chose to discuss the qualitative questions at the meeting. The other two agencies wanted to discuss the questionnaire only and agreed to respond to the discussion questions in writing. However, only one of them sent in responses. Westat audiotaped all agency meetings and took notes as agency personnel discussed the questions. (Appendix G includes the list of agencies that participated in meetings, the number of attendees, the discussion questions, and summary notes from the meetings.) In addition to the meetings held with Westat, NCLIS and/or GPO representatives met with approximately 50 agency representatives.
In these meetings, NCLIS and GPO discussed survey goals and objectives and the process for preselecting products, in addition to responding to specific questions about the survey. Agency meeting observations. Agency participation in the entire project was voluntary but essential. As with any voluntary activity, participation is based on availability and timing. For example, many agency coordinators were unavailable to schedule meetings during the summer months, or they were available but product respondents were on vacation, which may have resulted in fewer agency meetings. Product respondents needed to attend the agency meetings to review the questionnaire, although they were not always the most appropriate personnel to respond to all of the qualitative questions. The project depended upon the good faith, interest, and cooperation of agency CIOs and coordinators to participate in the meetings. Respondents and participants from the private sector are often given an honorarium for participating in similar research activities, but Federal employees are exempt from this process. Scheduling agency meetings, calling coordinators, and preparing paperwork to send to coordinators took a considerable amount of planning, coordination, and time, but it did not result in many meetings. Agencies were cooperative, but it was difficult for them to identify the "right" personnel to invite to the meetings, even though coordinators took a significant amount of time to locate product respondents from other sub-units within their agencies. Therefore, answers to the agency meeting discussion questions reflected the perspectives of only 5 of the 24 agencies surveyed. Expert Interviews NCLIS provided a list of experts from which Westat chose six names. Westat held four telephone interviews with the six experts between October 27 and November 24, 1998. The experts included two webmasters, two preservation specialists, and two professors in information resources management.
The purpose of expert interviews was to: > Solicit opinions of experts on topics not adequately covered on the survey or in the agency meetings, > Ask questions raised during the agency meetings or site visits that require further explanation, or to provide a broader context in which to view the issues, and > Explore current initiatives and future directions. As with the site visits and agency meetings, Westat audiotaped the interviews. Appendix H provides a list of experts, interview questions, and a summary of interview notes. 3 Survey Analysis and Findings This section of the report presents the survey findings from each of the major survey questions as they appear in the questionnaire (Appendix E). Appendix E also includes a glossary of terms and acronyms used throughout this report. The discussion and presentation will then focus on the key study questions explored on the following topics: > Preferred medium and formats used, > Planned medium and format changes, > Permanent public access issues, > Permanent retention issues, > Authenticity, and > Searchability, proprietary software, and licensing fees. The final response rate was 74 percent. Respondents from 24 Government agencies completed and returned a total of 242 of the 328 questionnaires fielded. The word "respondents" refers to the 242 agency personnel who completed the questionnaire. Since each agency submitted at least two product questionnaires, the unit of analysis is the product or product respondent, not the agency (table 1). The sample was not randomly selected due to cost and time constraints. Therefore, readers are cautioned about generalizing the findings to all electronic Government information products. Structure of the Questionnaire The questionnaire is organized into five sections, A through E. Section A contains general information about the product and agency that produced it.
Section B contains questions about the product's current profile including the kinds of data the product contains, mediums in which it is produced, and, if in an online medium, formats and online approaches used. This section concludes with questions on searchability and retrievability of the product. Section C relates to the future plans for the product and is designed to solicit information about changes in the product's data, mediums, and formats. Section D addresses the issues of metadata, permanent public access, permanent retention, authenticity, updating/upgrading plans, user fees, licensing, and public domain. The final section, E, includes one open-ended general comments question.

Section A Responses

Sections A and B of the questionnaire focus on format and medium standards that address the key objectives of the study. Section A contains general information about the product and the agency that produced it, including the name of the agency and its sub-unit, the product name and description, and the Uniform Resource Locator (URL) for the site in which the product appears. A list of the agencies surveyed and the number of product questionnaires received from each agency appears in table 1. (For a description of how products were selected, refer to the methodology section.) Appendix C contains the final list of products surveyed.

Table 1. Number of surveys returned by each agency surveyed

Agency                                            Number of surveys returned
Administrative Office of the U.S. Courts           5
Department of Agriculture                         19
Department of Commerce                            14
Department of Defense                              8
Department of Education                           14
Department of Energy                              12
Department of Health and Human Services           19
Department of the Interior                        11
Department of Justice                              8
Department of Labor                                2
Department of State                                3
Department of Transportation                       9
Department of the Treasury                        13
Environmental Protection Agency                   16
Executive Office of the President                  5
General Services Administration                    8
Library of Congress                               21
National Aeronautics and Space Administration      6
National Archives and Records Administration      10
Securities and Exchange Commission                11
Smithsonian Institution                           11
Social Security Administration                     4
Supreme Court of the United States                 4
United States Congress                             9

SOURCE: National Commission on Libraries and Information Science, Government Information Product Assessment Questionnaire: 1998.

Section B Responses

Section B covers the current product profile, including:

> How it is used;
> What types of data it contains;
> What mediums the product is available in, what is the primary medium used, and what are the agency's medium standards;
> What kinds of formats are used, what is the primary format used, and what are the agency's format standards;
> What user interfaces are supported and what web design approaches are used;
> If the electronic product can be searched and how;
> What agency hosts the product on the web; and
> How the product can be retrieved.

Readers should note that most of the survey questions asked respondents to "check all that apply"; therefore, the percentages for these questions will exceed 100 percent. Also, for the first set of tables in this section (tables 1 through 6), the response categories appear in descending order by number or percentage. Therefore, the responses will not match the order in which they appear on the questionnaire.
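As an arithmetic check on the figures reported so far, the short Python sketch below recomputes the number of questionnaires returned, the response rate, and the "check all that apply" percentages. The agency counts are copied from table 1 and the data-type counts from table 2a; the variable and function names are illustrative only and are not part of the study. The sketch assumes that each percentage is calculated against the 242 returned questionnaires, which is consistent with the published values.

```python
# Surveys returned per agency, copied from table 1.
SURVEYS_RETURNED = {
    "Administrative Office of the U.S. Courts": 5,
    "Department of Agriculture": 19,
    "Department of Commerce": 14,
    "Department of Defense": 8,
    "Department of Education": 14,
    "Department of Energy": 12,
    "Department of Health and Human Services": 19,
    "Department of the Interior": 11,
    "Department of Justice": 8,
    "Department of Labor": 2,
    "Department of State": 3,
    "Department of Transportation": 9,
    "Department of the Treasury": 13,
    "Environmental Protection Agency": 16,
    "Executive Office of the President": 5,
    "General Services Administration": 8,
    "Library of Congress": 21,
    "National Aeronautics and Space Administration": 6,
    "National Archives and Records Administration": 10,
    "Securities and Exchange Commission": 11,
    "Smithsonian Institution": 11,
    "Social Security Administration": 4,
    "Supreme Court of the United States": 4,
    "United States Congress": 9,
}

# Number of questionnaires fielded, as stated in the text.
QUESTIONNAIRES_FIELDED = 328

# "Check all that apply" counts copied from table 2a.
# One product can be counted in several rows.
DATA_TYPES_CONTAINED = {
    "Textual data": 188,
    "Graphical data": 142,
    "Numerical data": 141,
    "Bibliographic data": 82,
    "Spatial data": 53,
    "Multimedia": 14,
    "Video": 10,
    "Sound": 9,
    "Other": 16,
}


def percent(count, base):
    """Return count as a percentage of base, rounded to one decimal place."""
    return round(100 * count / base, 1)


# Total questionnaires returned across all agencies.
total_returned = sum(SURVEYS_RETURNED.values())

# Response rate: returned questionnaires as a share of those fielded.
response_rate = percent(total_returned, QUESTIONNAIRES_FIELDED)

# Each data type as a share of all returned questionnaires (assumed base).
multi_response_percents = {
    name: percent(count, total_returned)
    for name, count in DATA_TYPES_CONTAINED.items()
}

if __name__ == "__main__":
    print("Agencies:", len(SURVEYS_RETURNED))
    print("Questionnaires returned:", total_returned)
    print("Response rate (percent):", response_rate)
    for name, value in multi_response_percents.items():
        print(name, value)
    print("Sum of percents:", round(sum(multi_response_percents.values()), 1))
```

Because a single product can be counted under several data types, the percentages in a "check all that apply" question sum to well over 100, which is why the tables for such questions carry a note to that effect.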
Types of Data Contained in Product

Table 2a shows that the most frequently mentioned types of data contained in the products surveyed are textual (188 responses), followed by graphical (142 responses), numerical (141 responses), bibliographic (82 responses), and spatial (53 responses). Multimedia, video, and sound are less common, probably both because of the particular products surveyed and because of the special plug-ins, hardware, and memory required to open, view, and listen to products that contain these data types.

The primary data types contained in products surveyed are textual (57 percent), numerical (21 percent), bibliographic (10 percent), and graphical (5 percent; table 2b). These four types of data account for approximately 93 percent of the products surveyed.

Table 2a. Number and percent of types of data, by the type of data contained

                                                            Type of data contained
Type of data                                                Number  Percent
Textual data (books, serials, reports)                         188     77.7
Graphical data (photos, charts, graphs, tables, drawings)      142     58.7
Numerical data                                                 141     58.3
Bibliographic data                                              82     33.9
Spatial data (maps, coordinate files)                           53     21.9
Multimedia (sound, video, text, graphics)                       14      5.8
Video                                                           10      4.1
Sound                                                            9      3.7
Other                                                           16      6.6

NOTE: Percents do not add to 100 because respondents could choose more than one item.

SOURCE: National Commission on Libraries and Information Science, Government Information Product Assessment Questionnaire, 1998.

Table 2b. Number and percent of types of data, by the primary type of data

                                                            Primary type of data
Type of data                                                Number  Percent
Textual data (books, serials, reports)                         138     57.0
Numerical data                                                  50     20.7
Bibliographic data                                              24      9.9
Graphical data (photos, charts, graphs, tables, drawings)       13      5.4
Multimedia (sound, video, text, graphics)                        3      1.2
Spatial data (maps, coordinate files)                            2      0.8
Sound                                                            1      0.4
Video                                                            1      0.4
Other                                                           10      4.1

NOTE: Percents do not add to 100 because of rounding.
SOURCE: National Commission on Libraries and Information Science, Government Information Product Assessment Questionnaire, 1998.

Types of Mediums Used

Respondents were asked to identify all the types of mediums in which the product is available to the public as well as the primary type of medium used. The most common type of medium used among pre-electronic mediums is paper (177 responses), followed by microform (22 responses; table 3a). The responses in the "other" category include Fax on Demand, audiotapes, and Braille.

Among electronic mediums used, it is not surprising that the web is the most common (204 responses), followed by CD-ROM (70 responses), floppy diskettes (42 responses), hard drive (30 responses), and magnetic tape (18 responses). These figures reflect the medium types the public is most likely to easily access, as well as the availability of, and growing interest in, the web.

Table 3a also displays the frequency and percentage distribution of the primary types of mediums in which the product is publicly accessible. The web (42 percent) and paper (41 percent) are the primary types of mediums used, followed by CD-ROM (8 percent) as a distant third.

Standards for all mediums checked. For each type of medium checked, respondents identified one medium standard (see Appendix E glossary) among four types:

> Agency mandated,
> Common agency practice,
> Other, and
> None.

While most agencies have some type of standards for their pre-electronic and electronic mediums, they are primarily "common agency practice" rather than "agency mandated." For pre-electronic mediums, 33 percent of the products in paper are in an agency-mandated standard (table 3b). However, 52 percent of paper products are used as a common agency practice. Only 13 percent of the CD-ROM products are in an agency-mandated standard, as compared to 59 percent of CD-ROMs that are used as a common agency practice.
Eighteen percent of web-based products were reported to be in an agency-mandated standard, while 70 percent of them are used as a common agency practice.

A considerable number of products in CD-ROM (21 percent) were reported as having no standards (table 3b). By comparison, 9 percent (15 products) were reported by respondents as having no standards for the use of paper, and 8 percent (16 products) as having no standards for the use of the web.

Table 3a. (Omitted - Will appear at a later date)

Table 3b. (Omitted - Will appear at a later date)

Format Types Used

Databases. Responses to all formats used are shown in table 4a. Wide Area Information Server (WAIS) is the most common type of database identified (22 responses), followed by Oracle (17 responses), and dBase (9 responses). In some cases, WAIS is reported because the products surveyed are made available through GPO Access. The 44 responses in the "other" category reveal few multiple responses except for Microsoft Access, which received 5 "write-in" responses in this category.

WAIS (24 percent) and Oracle (14 percent) are the primary types of databases used (table 4a). Ninety-one percent of the respondents who checked WAIS as one of the databases used also indicated that the use of WAIS is a common agency practice, while only one respondent indicated that WAIS is agency mandated (table 4b). By contrast, only 44 percent of the respondents identified the use of Oracle as a common agency practice, and 39 percent indicated that their use of Oracle is agency mandated.

Spreadsheets. For spreadsheet formats used, Excel and Lotus 1-2-3 received 33 and 23 responses, respectively (table 4a). When respondents were asked to choose one of the spreadsheets as the primary type used, 59 percent chose Excel, while only 33 percent chose Lotus 1-2-3.
Close to 71 percent of the respondents also identified the use of Excel as a common agency practice as compared to 38 percent who indicated the use of Lotus 1-2-3 as a common agency practice (table 4b).

Tagged mark-up. Hypertext Markup Language (HTML) is both the most commonly used tagged markup language (157 responses) and the primary type of tagged markup language used (89 percent; table 4a). The Government agencies surveyed seldom use Extensible Markup Language (XML) (2 responses) or Standard Generalized Markup Language (SGML) (14 responses). This is noteworthy since SGML is one of the few formats that NARA accepts for electronic records.

Even though HTML is the primary type of tagged markup format used, 72 percent of the respondents reported that HTML is used as a common agency practice, while only 13 percent reported that its use is mandated by the agency (table 4b). Sixty percent of the respondents who use SGML for their online products reported it as a common agency practice, while only 13 percent reported that its use is mandated by the agency.

Image formats. Portable Document Format (PDF) is the most common image format (132 responses) and the primary type of format used (49 percent) by the agencies surveyed in this study (table 4a). The use of PDF is followed by GIF (99 responses), JPEG (77 responses), then TIFF (36 responses) as image formats used. Perhaps PDF is the most commonly used format by the agencies surveyed because the Federal Government disseminates a wide range and large number of forms and documents that must be printed in the exact format in which they are created.

Almost 64 percent of respondents reported that PDF is a common agency practice, while 16 percent reported it is mandated by the agency (table 4b). While a higher percentage of respondents reported using GIF (69 percent) and JPEG (71 percent) as a common agency practice, PDF is the most used agency-mandated image format (16 percent).

Audio formats.
The number of responses reported in this category reflects the small numbers of products surveyed that contain sound (see table 2a). WAV (12 responses) is the most commonly used sound format followed by AU (5 responses), and AIFF with 1 response (table 4a). WAV is also the primary type of audio format used (73 percent).

Table 4a. (Omitted - Will appear at a later date)

Sixty-two percent (eight) of the agency respondents who indicated using WAV reported it as a common agency practice; only two respondents (15 percent) reported that WAV is an agency-mandated standard (table 4b). Perhaps it is not surprising that WAV is the most commonly used audio format; since it was built into Windows 95, it has become the de facto standard for sound on PCs. AIFF is the standard audio format for Macintosh computers (PC Webopaedia; see www.pcwebopaedia.com).

Table 4b. (Omitted - Will appear at a later date)

Video formats. As with the audio formats used, the even smaller number of responses reported in this category reflects the small numbers of products surveyed that contain moving images. Table 4a shows that Moving Picture Experts Group (MPEG) (9 responses) is the most commonly used format, followed by MOV (7 responses) and Audio Video Interleave (AVI) (4 responses). MPEG may be more commonly used since it generally produces better quality video than AVI (PC Webopaedia). Of all the video formats used, however, the primary type of video format used is MOV (33 percent), followed by MPEG and AVI (27 percent each).

Of the respondents who reported using MPEG, 50 percent indicated its use is a common agency practice, while only 1 respondent (10 percent) reported that its use is agency mandated. Sixty-three percent of the respondents reported that MOV is used as a common agency practice, and none indicated that its use is agency mandated (table 4b).

Text formats.
ASCII is by far the most commonly used text format (122 responses) and the primary type of text format used (81 percent; table 4a). The second most commonly used text format is ANSI (11 responses) followed by Rich Text Format (RTF) (9 responses). Seventy-one percent (87) of the respondents reported that their use of ASCII is a common agency practice, as compared to 11 percent (14) who reported its use is agency mandated (table 4b).

Word processing formats. Of the two most popular word-processing software packages, Microsoft Word and WordPerfect, WordPerfect (75 responses) is more commonly used than Microsoft Word (55 responses; table 4a). These responses are also consistent with the primary type of word processing used. Sixty-four percent of respondents reported WordPerfect as the primary type of format used while only 22 percent of respondents reported Microsoft Word as the primary type of format used. PageMaker received the largest number of responses (5) in the "other" category.

Nineteen respondents (25 percent) reported that WordPerfect is an agency-mandated format standard, while only 8 respondents (14 percent) indicated that Microsoft Word is an agency-mandated format standard (table 4b).

Summary of format types used. Each of the 242 respondents from the 24 agencies surveyed was asked to identify the primary type of format used in each of the categories. The primary types of formats used in each category are WAIS, Excel, HTML, PDF, ASCII, and to a lesser degree, WAV and MOV.

User Interfaces

Online approaches. Question 9 on the survey refers to online approaches used. Eighty-five percent of the respondents reported that their product is in an online medium (table 5). These respondents were then asked to respond to a set of questions on user interfaces supported and web design approaches.

Table 5. (Omitted - Will appear at a later date)

User interface supported.
Table 6a shows that Netscape Navigator (195 responses) is a more commonly supported browser than Internet Explorer (170 responses). However, close to 70 percent of agency respondents indicated that both of these browsers are almost equally supported as a common agency practice rather than an agency-mandated standard (table 6b).

Table 6a. (Omitted - Will appear at a later date)

In addition, respondents reported that file transfer protocol (FTP), Telnet, and nongraphical/dial-up shells are also supported by their agencies (table 6a). Designs that support LYNX, a text-based browser, account for 12 of the 22 responses in the "other" category. The number of responses for the category nongraphical/dial-up shell is low (15 responses), especially given the need for agencies to comply with the Americans with Disabilities Act by making their sites more accessible to the visually and hearing impaired.

Like the browsers, the other user interfaces supported are primarily supported as a common agency practice rather than an agency-mandated standard. Almost 83 percent of the 40 respondents who reported their agency supports FTP also reported it is a common agency practice, while 79 percent of the 27 respondents who reported supporting Telnet also indicated it as a common agency practice (table 6b). No respondents reported that Telnet is an agency-mandated standard; however, 8 percent reported that FTP is an agency-mandated standard for their surveyed products. Seventy-five percent of the respondents reported the support of a nongraphical/dial-up shell as a common agency practice while only 13 percent indicated that it is agency mandated.

Web design approaches. Various web design approaches used, in descending order, are HTML (150 responses), tables (111 responses), CGI Scripts (66 responses), frames (53 responses), JavaScript (43 responses), Java Applets (23 responses), and XML (11 responses; table 6a). ColdFusion was reported in three of the responses in the "other" category.
The use of these web design approaches is overwhelmingly a common agency practice rather than an agency-mandated standard (table 6b). Basic HTML (tags that the most popular browsers display in a consistently similar fashion) is the only approach that almost one-fifth (18 percent) of the respondents reported as agency mandated. Less than 10 percent of the respondents using each of the other approaches indicated that they are agency-mandated standards. Since the use of frames, JavaScript, Java Applets, and XML may not be supported or enabled for many users' browsers, the agencies surveyed appear to be adopting them slowly, if at all.

Table 6b. (Omitted - Will appear at a later date)

Searchability of Product

Searchability of an electronic product is important for users because it allows them to effectively access the information they need. Most electronic products are searchable by full text with no fielding (74 responses) and/or by full text and field (99 responses; table 7). The "view only" category contains a higher number of responses than expected (79 responses). The "other" category contains the following common responses:

> Inapplicable because product is in a paper medium (most common response);
> In PDF, which is not searchable; and
> Product is indexed by field only.

Table 7. (Omitted - Will appear at a later date)

Product "Host"

Most of the products surveyed (199 responses) were hosted by the agency that created them, although other agencies or institutions might also host the products since respondents were asked to "check all that apply" for this question (table 8). There are fewer responses for products hosted by another agency (42 responses), a contractor (17 responses), and an educational institution (9 responses).

Table 8.
(Omitted - Will appear at a later date)

Retrievability of Product

In order to ensure broad access to the product, the public should be able to download and save electronic Government information products without restrictions (GPO, 1996, p. 7). Responses to Question 11 indicate that for the most part, products surveyed for this study can be downloaded and saved without restrictions (173 responses; table 9). Responses in the second category indicate that some products cannot be downloaded or saved (20 responses). A small number of products (14) cannot be downloaded or saved because their use requires proprietary software that is not freely distributed (table 9). Common write-in responses in the "other" category include:

> Can be downloaded and saved, but subject to restrictions.
> Can be printed from browser, but not downloaded.
> Product available only in paper.

The United States Advisory Council on the National Information Infrastructure, in its publication "A Nation of Opportunity," identifies as one of the basic principles of Government information and services that "the Federal Government should not charge for making its information available...nor charge for access to that information" (GPO, 1996, p. 28).

Table 9. (Omitted - Will appear at a later date)

Section C Responses (Planned Product Profile)

Section C contains a series of questions related to the future product profile. Respondents were asked questions about changes in the types of data, mediums, and formats used and reported on in Section B of the questionnaire. Respondents also were asked to identify the time span in which the changes would occur and to describe the planned changes.

Types of Data

The first question in this section of the questionnaire asked respondents about plans to discontinue publication of the product. Only 5 percent (12) of the respondents planned to discontinue the product (table 10).
The most commonly listed reasons for discontinuing a product were that the product was a one-time "prototype" or that the paper version of the product would be discontinued.

Table 10. (Omitted - Will appear at a later date)

Table 11 shows responses to question 13 about the kinds of data (e.g., bibliographic, textual, graphical) the product will contain in the future. The majority of respondents (76 percent) reported that the agency plans no changes to the product. Twenty-one percent reported that the agency would add one or more new types of data. A total of 3 percent reported either the discontinuation of one type of data (0.4 percent) or a complete change to new data types (2.6 percent). Several respondents reported that the changes in data types would include adding audio or video and multimedia.

Table 11. (Omitted - Will appear at a later date)

Most agency respondents reported that these changes in data types would mainly occur in the short term (40 responses) and, to a lesser degree, in the medium term (24 responses; table 12). Most respondents skipped the question about long-term plans for changing data types. Responses in the "please specify" categories of questions 13c and 13e indicate that respondents' plans for product changes have not yet been solidified.

Table 12. (Omitted - Will appear at a later date)

Types of Mediums

Responses to changes in types of mediums parallel those for changes in data types. Seventy-six percent of the respondents reported no plans to change mediums (table 13). Eighteen percent of respondents reported that they are planning to add one or more mediums, 2 percent indicated they will discontinue one or more mediums, and 3 percent reported they will change to a new type of medium.

Table 13. (Omitted - Will appear at a later date)

The two most frequently mentioned additions to medium types are to provide web access to the product, and to make the product available on CD-ROM.
Most of the respondents (35) who reported changes in medium types indicated that the changes will occur in the medium term; 21 respondents indicated that these changes will occur in the short term (table 14). Again, most respondents skipped the question about long-term plans for changing product mediums. The few respondents who provided descriptions of their long-term plans mentioned that they will produce the product in multiple mediums (paper and web), or that paper items will be migrated to the web. Other respondents indicated that their long-term plans are undetermined or undefined.

Table 14. (Omitted - Will appear at a later date)

Types of Formats

One might expect to see more dramatic changes in types of formats since the range of formats is varied and broad (e.g., database, spreadsheet, tagged markup, and image). The pattern of responses to question 15 mirrors the responses to changes in types of data and mediums, except for the change to new types. Seventy-two percent of the respondents reported no changes in format types. Eighteen percent indicated that they are planning to add one or more formats, while 9 percent reported they will change to new format types (table 15). This change to new format types is the largest percentage change in this category as compared to changes to new types of data (3 percent; table 11) and new types of mediums (also 3 percent; table 13). Respondents who provided specifics about the changes to new format types indicated these new types would be PDF and XML.

Table 15. (Omitted - Will appear at a later date)

The majority of respondents who reported changes indicated that they will occur in the short term (36 responses), and/or the medium term (32 responses; table 16). The majority of respondents did not answer the question about long-term plans for changing formats.

Table 16.
(Omitted - Will appear at a later date)

Section D Responses (Other Information)

Section D of the questionnaire contains a variety of questions in an effort to answer some of the critical issues of public access to electronic information products:

> Metadata,
> Permanent public access (i.e., provided by what agency and how),
> Permanent retention,
> Ensuring authenticity,
> Updating/upgrading plans,
> User fees,
> Licensing, and
> Public domain.

Metadata

Metadata, data about data, are important for public access. Metadata describe the content of a document or record, allowing users to find Government information more effectively. Examples of metadata include Government Information Locator Service (GILS) and machine-readable cataloging (MARC) records. To that end, one of the requirements of the Government Printing Office Electronic Information Access Enhancement Act of 1993 (Public Law 103-40) was that the Superintendent of Documents maintain an electronic directory of Federal electronic information (44 U.S.C., Section 4101).

Only 27 percent of agency respondents reported that their products have a metadata record, while 69 percent reported no metadata record exists for their products (table 17). In the followup question, most respondents identified their metadata records as either MARC or GILS. Another 5 percent indicated they do not know if a metadata record exists.

Table 17. (Omitted - Will appear at a later date)

Permanent Public Access

In an electronic age, permanent public access to Government information, a critical concept in information resources management, presents far-reaching challenges to the Federal Depository Library Program, Congress, Federal agencies, and ultimately the American public. GPO indicates that permanent public access "means that electronic Government information products within the scope of the FDLP remain available for continuous, no-fee public access through the program" (GPO, 1998, p. 19).
GPO recognizes and acknowledges its responsibility to provide ongoing public access to the electronic Government information available through the FDLP. However, in a decentralized networked environment, agencies are asked to share the responsibility for building, storing, disseminating, and preserving a broad range of electronic information products in order to ensure continued public access.

Agency respondents reported that permanent public access is primarily provided by their agency (177 responses), by another agency (51 responses), and/or by some other entity (20 responses; table 18). Respondents reported that permanent public access is not provided for 28 products (table 18). However, on closer examination, the responses to the "please specify" questions indicate that either respondents may have misunderstood the concept of permanent public access (as opposed to current access), or they assumed other entities have this responsibility. Some of the common responses to "other" agencies include the Government Printing Office, National Archives and Records Administration (NARA), and contractors and vendors. These responses illustrate respondents' lack of understanding about the difference between permanent public access to electronic information products through their own agencies or through partnerships with GPO, and permanent retention of official Government records through NARA. Furthermore, only 4 of the 28 products for which no permanent public access currently is provided have future plans for providing permanent public access (table 19).

Table 18. (Omitted - Will appear at a later date)

Table 19. (Omitted - Will appear at a later date)

Permanent Retention

The mission of the National Archives and Records Administration is distinct from that of GPO. NARA's mission is to preserve and provide public access to permanently valuable records of the Federal Government.
Federal agencies are responsible for transferring products to NARA that are scheduled as permanent records (i.e., official records of the Federal Government as defined by the Federal Records Act). Under 36 CFR 1228.188, mediums approved for transfer include open reel magnetic tape, magnetic tape cartridge, and CD-ROM. Agencies currently may not transfer to NARA electronic records that are in a format dependent on specific hardware and software. However, SGML tags are permitted on electronic textual documents as are records written in ASCII or Extended Binary Coded Decimal Interchange Code (EBCDIC) with all control characters and other non-data characters removed (Lewis Bellardo, Deputy Archivist of the U.S., in a written response to agency questions, October 14, 1998).

The responses to the questions on permanent retention may reflect the current status of transferring permanent electronic records to NARA (see questions findings). Only 34 percent of agency respondents reported that their products are scheduled for permanent retention by NARA (table 20). Sixty-four percent reported their products are not scheduled for retention, while another 3 percent reported they do not know if the product is scheduled for retention. However, it should be pointed out that at the time of the survey, the schedule that would have covered electronic records of permanent value was unenforceable under a court case declaring it null and void; therefore, these figures may be unreliable.

Table 20. (Omitted - Will appear at a later date)

Ensuring Authenticity

Although 64 percent of respondents reported that their agency ensures authenticity for the products surveyed (table 21), responses to the open-ended question about how the agency attests to authenticity indicate that respondents may not fully understand the concept.
Authentication refers to the process agencies use to assure the public that the product is an official legitimate product created and produced by the Federal Government agency and no other source (see glossary, p. E-17). Ensuring authentication includes technical as well as policy considerations. Some technical examples of authentication include digital signature technology, special watermarks, disclaimers, or statements on the products. Respondents provided answers that address how the agency ensures that information or data in the product are valid or reliable, which is an important process but not the same concept as authenticity. Common responses include the following:

> Program office verifies data.
> Review CD-ROM contents before public release.
> Regulations and source/reliability statement regarding data sources.
> Review and approval within agency.
> Source of content is the same as the hardcopy version.
> Test reliability of data every 5 years, or more often.
> Publications are subjected to review by subject matter expert and peer review.

Table 21. (Omitted - Will appear at a later date)

Updating/Refreshing Plans

Twenty percent of respondents reported that their products are updated annually, followed by daily (16 percent), monthly (12 percent), and weekly (5 percent; table 22). However, the largest share of respondents (47 percent) checked the "other" response category. The write-in responses covered a broad range of time periods in which products are updated. Below is a sampling of multiple responses:

> Quarterly,
> As needed,
> Irregularly,
> Not updated,
> Semi-annually,
> Every 2 years,
> Periodically, and
> Twice a month with old version staying on line.

Table 22. (Omitted - Will appear at a later date)

Changing Supporting Technology

The majority of the respondents (71 percent) reported that there are no plans to change the product's supporting technology (table 23).
Twenty-eight percent of respondents reported plans to change the product's supporting technology.

Table 23. (Omitted - Will appear at a later date)

User Fees

Public access to no-fee Government information products is one of the core principles upon which the FDLP is based. However, users might be charged a fee if they order certain types of electronic Government information products directly from GPO or the agency that created the product. Nine percent of respondents reported that all users are charged fees, while 20 percent reported some users are charged fees. The majority (72 percent) of agency respondents reported that there are no fees charged to access or use the product surveyed (table 24).

The followup question asks about specific fee amounts and the reasons for the charge. The responses to this question vary greatly. A few common responses include the following:

> No charge for web access.
> Single paper copy free; charge for additional copies.
> No subscription fee to libraries and some constituencies.
> Files can be downloaded from the Internet for free. There is a charge for published books.
> Fees are for paper products only.

Table 24. (Omitted - Will appear at a later date)

Licensing

Many Government agencies purchase licenses from vendors for search and retrieval software to be used with the product to make the data or information more accessible to users. Agencies negotiate various agreements with vendors about who can use the software free of charge. The majority of respondents (69 percent) reported that they do not license commercial search and retrieval software (table 25). For the remaining 31 percent of respondents who have licensed commercial software, the license covers use by all the key constituencies including agency personnel (73 responses), public users (69 responses), agency's primary target constituencies (65 responses), Federal depository libraries (59 responses), and/or all libraries (59 responses; table 26).

Table 25.
(Omitted - Will appear at a later date) Table 26. (Omitted - Will appear at a later date) Public Domain Public domain, a critical component of public access, means that the information, product, or publication is not copyrighted and therefore can be reproduced by anyone without obtaining copyright permission. One of the goals of an electronic FDLP is to provide public access to any Government information product free of copyright or copyright-like restrictions (GPO, 1996, p. 2). The majority of respondents, 86 percent, indicated that all parts of their surveyed products are in the public domain (table 27). Another 10 percent indicated that part of the product is in the public domain, while 5 percent reported that the product is not in the public domain. The followup question that requests an explanation of the second response (i.e., part of product is in the public domain) uncovered these typical responses: > Copyrighted tables are not in the public domain. > There are some copyright-protected logos and trademarks. > Includes copyrighted material that would require approval for reproduction. Respondents offered a wide variety of explanations for products that are not in the public domain: > Retrieval software is proprietary and use is licensed. > Commercial vendors lease the database for distribution. > Songs and performances are protected by copyright. > Books are available only to eligible blind patrons of our program, by law. Table 27. (Omitted - Will appear at a later date) Section E Responses The final section E of the questionnaire contains one open-ended "comments" question. These responses are too broad and disparate to provide a detailed itemization. Most of the comments are explanations of issues covered in the survey. However, below are a few comments that cover issues not directly addressed in the survey. 
> Our mission, mandated by the Americans with Disabilities Act (ADA), is to satisfy all browser requirements (e.g., ASCII browsers like LYNX through the latest versions of Netscape and Internet Explorer).
> We produce printed documents and link to electronic documents maintained on the GPO's server.
> The product is not published in any electronic form. It is a collection of individual products that are individually published.
> I am very new to this area (2 weeks) and received significant contractor assistance in completing this form.
> In addition to four other web sites, we will soon web-enable our database with some encrypted modules.
> The database is intended to be accessible to the largest audience possible via free or public domain software whenever possible.
> This information is available in PDF format on our website to ensure the integrity of the data. Coding in HTML (particularly tables) could lead to mistakes with such a large amount of numeric data.

No respondents commented on the survey questionnaire, the project in general, or the process of filling out the survey.

Study Questions

This section uses findings from two or more survey questions to provide additional information on some of the key issues explored in the study. The responses to these questions relate specifically to the products surveyed. The following questions were chosen because they address one or more of the critical study areas: preferred medium and format standards, permanent public accessibility, permanent retention, user fees, commercial licensing of search and retrieval software, and authenticity.

Preferred Medium and Format Standards

Study Question 1: What combinations of preferred medium standards are currently used by the respondents?

The agencies surveyed are creating and using (in descending order):

> Products both in paper format and on the web.
> Products both on CD-ROM and on the web.
> Products both in paper and on CD-ROM.
Since most of the agencies surveyed create products in more than one medium, what combinations of preferred mediums are they using? Of the respondents who indicated that paper was a medium used and the respondents who reported that CD-ROM was a medium used, only 19 percent reported that they are using both paper and CD-ROM products (table 28). Table 29 shows that of the respondents who reported that they use CD-ROM, and those who reported that they use the web as a medium, 21 percent use both CD-ROM and the web as mediums. However, of the respondents who reported using paper and the respondents who reported using the web as a medium, 64 percent use both paper and the web (table 30). Therefore, the respondents surveyed are creating and using products both in paper format and on the web much more often than they are creating and using products on CD-ROM and the web. An even smaller percentage of products is being created in paper and on CD-ROM. This confirms the earlier finding that paper and the web are the preferred mediums used by the agencies surveyed, but provides additional information about the combinations of mediums used.

Table 28. (Omitted - Will appear at a later date)

Table 29. (Omitted - Will appear at a later date)

Table 30. (Omitted - Will appear at a later date)

Study Question 2: What combinations of preferred format standards are used by the respondents?

The respondents are slightly more likely to use HTML in combination with PDF than they are to use HTML together with GIF. However, they are almost as likely to use HTML, GIF, and ASCII together as they are to use HTML, PDF, and ASCII together. Of the respondents who reported using HTML as a tagged markup format, and those who reported using PDF as an image format, 39 percent reported the use of both HTML and PDF (table 31).
Of the respondents who checked HTML, and those who checked GIF as an image format, 36 percent checked that they used HTML in combination with GIF (table 32), slightly less than those who used HTML and PDF in combination. Since PDF is the preferred image format used by agencies (table 4), this is not an unexpected finding. Table 31. (Omitted - Will appear at a later date) Table 32. (Omitted - Will appear at a later date) However, when the formats are used in combinations of three, it appears that respondents are almost as likely to use HTML, GIF, and ASCII (21 percent) together as they are to use HTML, PDF, and ASCII (22 percent) together (tables 33 and 34). Table 33. (Omitted - Will appear at a later date) Table 34. (Omitted - Will appear at a later date) Public Access to Products Study Question 3: If a product is permanently accessible, is it also likely to be scheduled for retention with the National Archives and Records Administration (NARA)? No, the majority of products surveyed that are permanently accessible are not likely to also be scheduled for permanent retention with NARA. Permanent public accessibility and permanent record retention are two distinct concepts. GPO, through the FDLP, has a historical commitment to permanent accessibility of paper products, and now to electronic products. To that end, GPO requests that agencies provide information products in all mediums to GPO and work with GPO and Federal depository libraries to provide permanent public accessibility to electronic products. Agencies are responsible for transferring those products that are scheduled as permanent records to NARA. However, not all records that are scheduled for permanent retention by NARA are products within the scope of the FDLP. For such records, permanent public accessibility through the FDLP is not an issue. 
Of the respondents who said yes, the product is permanently accessible, and the respondents who reported their product is scheduled for retention with NARA, only 25 percent reported that the product is both permanently accessible and also scheduled for retention with NARA (table 35). The majority of products that are publicly accessible are not likely to also be scheduled for retention with NARA. While the survey data do not identify reasons for this situation, some possibilities are that:

> the product is not a permanent or official record of the U.S. Government as defined by Federal Records legislation.
> the product is in a format that is accepted by GPO but that NARA does not currently accept, and therefore could not be transferred to NARA.
> agencies are overlooking this important part of the information life cycle of electronic products.

Table 35. (Omitted - Will appear at a later date)

Study Question 4: Is the licensing of search and retrieval software likely to be a barrier to unrestricted public access?

No, for the products surveyed, the licensing of commercial search and retrieval software by the agency does not appear to be a barrier to unrestricted (no fee) use. Of the respondents who reported that they license commercial search and retrieval software for their products, and those who reported that all users are charged a fee for the products, only 2 percent who license commercial search and retrieval software also charge a fee for all users (table 36). A slightly larger share of respondents (4 percent) who use commercial search and retrieval software for their products also charge a fee for some users. Twenty-five percent of respondents who license search and retrieval software for their products charge no user fees.

Table 36. (Omitted - Will appear at a later date)

Study Question 5: Are respondents who have purchased commercial search and retrieval software for their products also transferring the products to NARA?
No, based on the products surveyed here, respondents are not transferring permanent records to NARA for products for which they have purchased commercial search and retrieval software. Of the respondents who reported licensing commercial search and retrieval software, and those who reported scheduling products for permanent retention with NARA, only about 10 percent who have purchased commercial software for products have also scheduled their products for permanent retention with NARA (table 37).

Table 37. (Omitted - Will appear at a later date)

Other Issues: Authenticity and Metadata

Study Question 6: If an agency ensures authenticity, is it also likely to provide permanent public access to the product or do agencies rely on another agency to provide permanent public access?

Yes, based on the products surveyed, agency respondents who ensure authenticity for their products are also more likely to provide permanent access to them directly, rather than through another agency. Of those respondents who reported they ensure authenticity and those who reported they provide direct permanent public access to their products, 47 percent both ensure authenticity for their products and provide direct permanent access to them (table 38). However, only about 14 percent of the respondents who reported they ensure authenticity for their products also reported that another agency provides permanent public access to the product (table 39).

Table 38. (Omitted - Will appear at a later date)

Table 39. (Omitted - Will appear at a later date)

Study Question 7: Are online products hosted by the agency that created them more likely to have a metadata record than products hosted by another agency?

Yes, based on the products surveyed, those that are hosted by the agency that created them are more likely to have a metadata record than those hosted by another agency.
Tables 40 and 41 show that almost 20 percent of the products that are hosted by an agency also have a metadata record, while only 7 percent of the products that are hosted by another agency also have a metadata record. Table 40. (Omitted - Will appear at a later date) Table 41. (Omitted - Will appear at a later date) 4 Qualitative Findings This section of the report highlights the qualitative findings from the three site visits with Federal depository libraries, five agency meetings, and six expert interviews. Appendices F through H include interview questions and detailed responses from the site visits to depository libraries (F), agency meetings (G), and expert interviews (H). Site Visits to Federal Depository Libraries The purpose of the site visits to the three depository libraries was to identify the key issues and concerns librarians have about providing public access to electronic Government information products through the Federal Depository Library Program. (See Appendix F for a complete list of questions posed to librarians.) The site visits were held with one regional depository library and two selective depository libraries in the Washington, D.C., metropolitan area. It is important to note that the three libraries visited may not be representative of all depository libraries in terms of the geographical location and library user characteristics (e.g., education level, socioeconomic status, etc.). Therefore, readers are cautioned about generalizing these observations to all depository libraries. Highlights of the three librarians' responses are provided below. Appendix F contains a detailed description of the librarians' responses to the interview questions. Highlights of Site Visits to Three Depository Libraries User Needs and Concerns > Librarians interviewed noted that the general public is still more comfortable using Government information products in paper and microfiche than they are using the Internet. 
Patrons (and librarians) are least comfortable using products on CD-ROM.
> Librarians expressed concern about the difficulty patrons experience in accessing Government-produced CD-ROMs that are not standardized. They reported that the search and retrieval software is different for each CD, that CD-ROMs often have no installation instructions or user documentation, and that they are not user-friendly.
> Librarians indicated that some users are still intimidated by electronic mediums and computers. Most users ask librarians to help them search for materials on the web and frequently need help downloading large files.
> Librarians noted that since most Government websites only contain the most recent information, they are concerned about users having permanent public access to retrospective Government information on the web in the future.

Librarians' Concerns: User Fees, Hardware, Training, and Costs

> Although none of the libraries visited currently charge fees for printing materials from the Internet or CD-ROMs, all three librarians are either considering charging fees or are planning to charge fees and expressed concerns about how this will affect their patrons.
> Users do not have access to enough workstations, so the libraries must limit use. Also, if libraries had additional money for hardware, they would order hardware in support of CD-ROMs (e.g., a new CD-ROM server and an 18-disk CD changer). (This is despite CD-ROM being the least preferred medium and declining in number in the FDLP.)
> All librarians interviewed expressed concerns about finding time and money to train librarians and staff, especially on using CD-ROM products, but also on downloading files, effectively searching the Internet for Government information, and creating and maintaining web pages. They welcome any additional training on using GPO Access, Geographic Information Systems, etc.
> Time and money permitting, librarians expressed interest in establishing partnerships with GPO and other Government agencies to put some retrospective online Government information on their servers so users can have reliable access to it in the future. In addition, librarians would like to provide outreach to public schools, community centers, etc., to educate students and adults about the wide variety of valuable information available from the Federal Government.
> One librarian expressed strong feelings about the need for Congress to provide long-term financial support to Federal depository libraries so they can continue to provide permanent public access to digital materials. This librarian's perspective was that the cost to provide access to electronic Government information is steadily increasing.

Agency Meetings

Meetings were held with four agencies between September 14 and September 24, 1998:

> Department of Health and Human Services
> Environmental Protection Agency
> U.S. Department of Education
> U.S. Department of Commerce

Although meetings also were held with the U.S. Supreme Court and the National Archives and Records Administration (NARA), these two agencies did not respond to the agency discussion questions in the agency meetings; they chose to discuss the survey questionnaire only. However, Lewis Bellardo, the Deputy Archivist of the United States, sent in written responses to the discussion questions (Appendix G). The National Commission on Libraries and Information Science (NCLIS) provided discussion questions for the agency meetings; Westat modified some of the questions with NCLIS's approval. The purpose of the agency meetings was to supplement survey data by collecting more general information on electronic Government information products that are not product-specific.
For example, one of the survey objectives is to assess the cost-effectiveness and usefulness of preferred medium and format standards, an issue that was not directly addressed on the survey. In addition, the agency meetings afforded NCLIS, GPO, and Westat an opportunity to review the survey questionnaire with agency respondents and to address any questions they might have. Highlights of the agency meetings are provided below. For a more detailed summary of the responses to the 12 questions posed to agencies, see Appendix G. Agency Meeting Highlights Preferred Mediums and Formats > Agencies interviewed reported using the same preferred medium and format standards as those reported by survey respondents: web, CD-ROM, bulletin board; HTML, PDF, and ASCII. Additional preferred formats mentioned by agency representatives include TIFF, JPEG, and Lotus/Domino. > All agencies are exploring a wide range of innovative and creative web design approaches including the use of SQL, Oracle, ColdFusion, and animated GIFs. Some examples of ways in which agencies are utilizing web technologies include data warehousing, interactive GIS, multimedia CD-ROM, live "real-time" web casting of selected speeches, and real-time forecasting of air pollution levels for 22 states. > Four of the five agencies have guidelines or "best practices" for presentation and organization of products or publications on the web. Most of the guidelines discuss preferred formats for some types of products. The most common problem experienced by the agencies in this regard is compliance issues (i.e., encouraging personnel to adhere to them). > There are some trends for migrating certain families of products to the web for newsletters, training manuals, annual reports, and conference proceedings and presentations. > Agencies consider many factors when making decisions to create/retain products in more than one medium: budget, cost, accessibility to users, and size of audience the product reaches. 
The decision-making process varies from agency to agency and sub-unit to sub-unit. Assessing User Needs > All agencies reported involving users in testing and evaluating the usefulness of the web and CD-ROM products. The most frequently used assessment methods are focus groups, videotaping of users, and online user surveys. Agencies are using the results of these evaluation methods to add and change some formats and mediums as well as content. > Four of the five agencies interviewed reported that they maintain some type of GILS records to help the public locate their information resources. Information Life Cycle Management, Permanent Public Access, and Permanent Retention > No agencies are addressing the following key information resources management issues: permanent public access, information life cycle management, and permanent retention. (The expert interviews provide some insight into the reasons that agencies are not addressing these issues. See the summary section of this report.) Cost-Effectiveness of Various Mediums and Formats > No agencies have conducted a formal cost-benefit analysis for creating products in formats and mediums for distribution to the Federal Depository Library Program. Generally, agency representatives reported it costs less to create products for the web because they can avoid production, printing, and distribution costs for paper and CD-ROM products. Expert Interviews The interviews with six experts also enriched and supplemented the survey findings. Since the interviews were conducted after the site visits and agency meetings, they were helpful in providing a broad context within which the survey findings could be viewed. The expert interviews were conducted between October 27 and November 24, 1998. Telephone interviews were held with two webmasters, two preservation specialists, and two professors of information resources management. These experts were selected from a list provided by the NCLIS. 
Highlights from each set of interviews are provided here. (See Appendix H for a detailed summary of each telephone interview.)

Interviews With Webmasters

Highlights from the interview with webmasters Jerry Malitz, National Center for Education Statistics (NCES), and Linda Wallace, the Internal Revenue Service (IRS), on October 27, 1998.

Preferred Formats

> The IRS, unlike the other agencies surveyed, primarily uses SGML, followed by PDF, HTML, and Postscript. They train their authors to use SGML because they consider it "intelligent data" that can automatically generate other formats (e.g., web, BBS, Fax on Demand) through templates and filters. All NCES publications are in PDF, then HTML (optional); they rarely put an entire publication in HTML format only.
> The IRS has conducted a cost-benefit analysis of delivering requests through different formats. They have found that it costs $3 per phone call to fill a request, 1 cent to access their Internet site for forms, etc., and $2.50 to make a CD-ROM containing 5 years of IRS publications.
> IRS indicates that it provides permanent public access to tax information online for 5 years, and from their "core repository library" for about 14 years, but not for every application. However, this 5 to 14 years means that IRS provides current but not permanent public access to their Government electronic products.
> All IRS documents are ADA-compliant, online searchable, and downloadable.

User Needs

> Both IRS and NCES assess and evaluate the effectiveness of their web sites with advisory groups (IRS), or for NCES, through an Internet Working Group made up of representatives from each program area.
> Both agencies have GILS records.

Interviews With Preservation Specialists

Highlights from the interview with preservation specialists Evelyn Frangakis from the National Agricultural Library (NAL) and Abby Smith from the Council on Library and Information Resources (CLIR), November 10, 1998.
Goals of Preservation

> It is useful to think about preservation goals such as enhancing the long-term preservation of and access to information of enduring value for as long into the future as possible.
> There is no standard accepted method of ensuring long-term access to digital information. It may be more accurate to say that one of the primary goals of preservation is to set up systems that "sustain predictable levels of loss."

Barriers to Preservation of Digital Materials

> The traditional preservation world examines the concept of permanence, but in the print world permanence relates to chemical inertness and mechanical durability. These concepts do not translate easily into a digital world.
> There are two problems with digital preservation: (1) media in which information resides may be unstable; and (2) software/hardware configurations on which information is stored become obsolete so quickly that even when one migrates information from one system to another, much of the data and functionality are lost.
> Other barriers to digital preservation include the difficulty of understanding what we can and cannot do under current copyright law, and the fact that any transmission chain is only as strong as its weakest link. The weak link in the transmission of electronic information is human beings, not technology (e.g., no one agency or organization has stepped forward to address issues like information life cycle management). Preservation of information must be thought about at the creation stage, not after the information has been collected and disseminated.
> One of the core infrastructure problems is the need to create a failsafe archives mechanism for materials that disappear from the web.
Current Preservation Models and Initiatives > NAL and partner institutions are implementing a model for permanent public access and preservation of agricultural literature that addresses all the key issues in information resources management: inventory and life cycle of information, permanent public access, technical requirements, and user access and retrieval. (NAL is one of the few examples for ensuring a failsafe archives for preservation of agricultural literature.) CLIR Initiatives > CLIR commissioned a report by Jeff Rothenberg from RAND Corporation on emulation. (Emulation is the process of imitating one system with another so both accept the same data, execute the same programs, and achieve the same results.) > CLIR commissioned an analysis of migrating file formats to do a risk assessment associated with those file formats during migration. > CLIR identified a computer scientist at Carnegie Mellon University, John Ockerbloom, who has developed a system of file conversion called TOM (Typed Object Model), a type of migration that converts web-based materials to different file formats. Interviews With Information Resources Management Specialists Highlights from interviews with John Bertot, November 18, and Charles McClure, November 28, 1998. (These two telephone interviews were held separately.) Barriers to Successful Implementation of Information Resources Management Initiatives > Agencies are struggling with issues such as permanent public access, information life cycle management, and permanent retention due to a general lack of information resources management (IRM), as well as organizational policy integration for Federal Government legislation and initiatives. > Agencies do not view information as a strategic resource that is directly related to agency missions. Most Government IRM initiatives focus on the technology side of IRM because it is tangible. 
> Sometimes smaller agencies are more successful in implementing IRM initiatives due to fewer organizational and communication barriers to working collaboratively.
> The Information Technology Management Reform Act of 1996 did little to clarify the role of the CIO and IRM staff, so agencies are now struggling with what to do with these functions.
> Agency resources are now almost exclusively devoted to Y2K efforts with little time and resources left to devote to IRM, standards, and operability.
> Staffing and training are critical for both IRM and CIO staff.
> Challenges for agencies in the next few years include how to coordinate information technology and information technology management, interoperability and standards that cut across agencies, and education and training of staff.

5 Discussion of Quantitative and Qualitative Findings

This section synthesizes, integrates, and discusses issues and the major themes that emerge from the survey and the qualitative data collection activities, including the interviews with Federal depository librarians, agency personnel, and other experts, and the literature review. The section is arranged by the following key study issues:

> Preferred mediums and formats,
> Evaluating websites,
> Cost-effectiveness of formats and mediums,
> Depository library needs,
> Public access (public domain and user fees),
> Permanent public access and preservation.

Preferred Mediums and Format Standards

Survey respondents and agency representatives reported they most often use the following mediums:

> Paper
> Web
> CD-ROM
> Bulletin board systems (to a lesser degree)

Both respondents and representatives also reported use of the following formats:

> HTML
> PDF
> GIF
> ASCII
> TIFF

However, most agencies whose products were surveyed use these mediums and formats as a common agency practice, rather than as an agency mandate.
In addition, agency representatives and webmasters reported they use SGML, Oracle (with ColdFusion or SQL), JPEG, and TIFF because these formats meet the information needs of their individual constituents or are used in some of their creative web approaches. The IRS is one of the few agencies interviewed that uses SGML. IRS' Linda Wallace, one of the webmasters who served as an expert consultant for this project, indicates that most agencies do not use SGML because it is difficult to use. But Wallace noted that IRS uses SGML because it is much more robust, and it is easy to change a document format to match customer needs (e.g., tax law information for consumers and for lawyers). (See Appendix H for detailed notes on the telephone interview with Linda Wallace.) A few survey respondents indicated they are planning to change to or add XML or other object-oriented formats. XML may be appealing to some agencies because data can be stored in a format provided by XML that is transferable to a wide range of hardware and software environments (Bryan, 1998, p. 14). In addition, according to Stuart Culshaw, XML makes it easier for authors to produce documents for many different output mediums (i.e., paper, online help, web) from a single source (Culshaw, 1998, p. 7). Most of the agency representatives who participated in the meetings also reported that their agencies have established written guidelines or "best practices" that specify preferred formats for the presentation of information on the web. Even though these guidelines are not agency-mandated, they seem to be a common agency practice. Several of the agencies interviewed indicated they have modified or adopted their agency guidelines from the guidelines established by the Federal Web Consortium in 1996. The Consortium, founded in 1994 by the National Science Foundation and the U.S. 
Nuclear Regulatory Commission, established guidelines with other Government agencies (see http://www.dtic.mil/staff/cthomps/guidelines/). The guidelines provide suggestions to help the Federal community accomplish agency missions to improve services to customers. Consortium guidelines cover a wide range of topics including:

> Home page checklist (content, navigation/organization, style/markup);
> File formats (i.e., agencies should not be restricted to proprietary formats such as WordPerfect, Microsoft Word, SAS, PDF);
> Rationale for using certain kinds of formats such as HTML, GIF, and JPEG;
> Guidelines for formats to be used for downloading or display (e.g., HTML, GIF, JPEG, PDF, Postscript); and
> Emerging standards.

Agency representatives indicated that one of their biggest challenges is to convince personnel from all program areas to follow the agency's internal guidelines when creating products for the web. Another challenge for agencies is to consolidate web guidelines from different agency sub-units so they are complementary rather than contradictory.

Evaluating Websites

Agency representatives, per OMB Circular A-130 and the Government Performance and Results Act of 1993, are assessing the usefulness of their websites and CD-ROMs as part of a larger effort to measure program effectiveness. Focus groups, online customer surveys, and videotaping of customers online are the most common ways in which agencies evaluate and test products on their websites. One objective of the evaluation is to test both formats and web approaches. Based on the evaluation results, agencies may change or add formats. For example, one agency, after testing its site with children, eliminated PDF files on the site and made it more interactive. Another agency made the decision to keep its BBS because many of its international users do not have ready access to the web.
One agency webmaster indicated that the needs of their business clients, who participate on their advisory board, help drive their format needs. A fourth agency stores its documents in TIFF format for image and textual data. As customers request documents, the agency converts them to PDF so customers can download the material. A fifth agency created a simple set of rules for producing CD-ROMs based upon user input: keep it simple to use, intuitive, and self-tutorial. User needs for easy access to electronic information products will continue to affect how agencies make decisions about formats and mediums. Bertot and McClure suggest that more agencies should continue to monitor the information needs of the public as well as targeted constituencies to enhance current access to electronic Government information products (Bertot and McClure, 1997, p. 288). Cost-Effectiveness of Formats and Mediums None of the agency representatives who attended the agency meetings has conducted a formal cost-benefit analysis for producing or creating products in preferred or emerging formats, mediums, or online approaches for distribution to the FDLP. Most agencies reported that migrating products to the web substantially reduces printing and distribution costs associated with paper mediums. However, the crosstabs in tables 28-29 reveal that many Government products are still produced in more than one medium and often in more than one format. Providing permanent public access to electronic mediums ultimately may exceed the one-time costs associated with producing and distributing the same information in print or microform (GPO, 1996, p. 24 and A71-A74). In her role as Chief, Electronic Information Services, at the Internal Revenue Service, Linda Wallace has analyzed the costs of delivering documents to customers (Appendix H). She found that: > It costs IRS $3 per call for the public to call into their toll-free number and for IRS to fill the request. 
> The cost to IRS for the public to use the Internet to access and use the forms is 1 cent, a difference of 300 to 1. (However, this shifts the cost to the public, who must have access to the Internet.) > It costs IRS $2.50 to make and distribute to all public libraries (including the Federal depository libraries) each CD-ROM containing 5 years of tax forms, instructions, and publications. Based on these numbers, the IRS has made some internal decisions about where they will focus their resources and time in order to reach the maximum number of customers in the most cost-effective manner. Depository Library Needs Since depository librarians serve as the intermediary between the users and electronic information products, their observations and experiences about user and library needs are critical. In general, the five agencies interviewed focused on public users or their target audiences rather than depository library users when discussing usage of their electronic Government information products. First, the librarians interviewed emphasized that many patrons still prefer Government information in paper mediums, followed by the web and then CD-ROM. The respondents surveyed indicated that many of their products are produced both in paper and on the web. Second, librarians expressed concerns about lack of standardization for producing Government CD-ROMs. One agency representative indicated that they are undertaking several initiatives to make their CD-ROMs more user-friendly by making them as intuitive as possible and incorporating a user testing component into the production schedule. A third important concern for the librarians interviewed is the rising cost of computer hardware and the simultaneous rise in user expectations for state-of-the-art computer workstations. 
Although the three libraries recently received updated computer workstations that met or exceeded the recommended minimum guidelines for depository libraries, they are beginning to change their policies on access to workstations by placing a time limit on their use. A fourth issue concerns the rising costs to purchase and maintain new equipment, which have caused depository librarians to reconsider their policies on charging printing fees. One librarian indicated that their library already charges patrons for photocopying materials; this change is not dramatic, but it does affect the concept of no-fee access when an overwhelming number of products are offered on the Internet. Fifth, time and resources to train library staff (and patrons) on how to use the new technology (i.e., how to download files), conduct Internet searches, design and develop their own websites, and load, search, and use CD-ROMs are major concerns expressed by the depository librarians interviewed. The fact that Government information exists in a variety of mediums and formats only increases rather than diminishes the need for training. Finally, all librarians are troubled by how GPO, the FDLP, and Government agencies will address the problems of permanent public access to electronic information products that are constantly being replaced and updated by new ones. In addition, the preservation of retrospective electronic Government information is an issue of concern. Public Access The survey data revealed that 15 percent of the products surveyed are not in the public domain, for all or part of the product. In addition, user fees are charged for 30 percent of the products. These data suggest that these two critical public access goals have not yet been achieved. 
Permanent Public Access to and Permanent Retention of Electronic Government Information Perhaps more than any other issues, permanent public access and preservation pose two of the greatest challenges to the FDLP, and ultimately to the public. Each of the experts raised different issues and shared various perspectives about these issues. It might be helpful here to summarize their perspectives and describe initiatives underway to address the problems associated with the provision of permanent public access and preservation. Most of the survey respondents indicated that permanent access is currently provided for the products surveyed, although most of the responses indicated that this concept is not fully understood and that access is not provided by the agency responsible for the product. Instead, they are relying on GPO, Federal depository libraries, the National Technical Information Service, or other agencies to provide this permanent public access. In its policy and planning document, Managing the FDLP Electronic Collection (see http://www.access.gpo.gov/su_docs/dpos/ecplan.html), GPO states that "the `first-level' collection management activity depends upon knowledge that the products exist. In order to ensure current and permanent access, GPO will ... rely on notification from and outreach to other agencies and notification from the depository library community." The responses of agency representatives on the issue of permanent public access may provide additional information about the problem. Most agency representatives said their agencies had not discussed the issue or were exploring the issue to see how it should be addressed, and they indicated that they did not understand the concept of permanent public access in relation to permanent retention. The one exception was the representative from the National Archives and Records Administration, who is clear about the agency's role to provide permanent public access to its own products. 
It might be helpful here to clarify the distinctions between the two concepts. GPO's definition of permanent public access "means that electronic Government information products within the scope of the FDLP remain available for continuous, no-fee public access through the program" (GPO, 1998, p. 19). Lewis Bellardo, deputy archivist of the United States, in a recent article in the Washington Post (March 12, 1999, p. A01) stated that the problem of digital preservation must be addressed "or memory will be lost for the latter half of the 20th century." In addition, Bellardo, in a written response to agency questions, articulated agency responsibilities to GPO for permanent public access and to NARA for permanent retention. GPO will accept products in all mediums to provide continuous, no-fee public access, if notified by agencies that access is being discontinued. Agencies are responsible for transferring those products that are scheduled as permanent records (official records as defined by Federal Records legislation) to NARA. Linda Wallace described the IRS' methods for providing current public access to their materials. Using SGML format, the IRS has built and maintains a core knowledge repository to generate media output in any application to respond to customer needs. The repository maintains materials for 14 years, but not for every application. In addition, all tax forms, publications, instructional materials, etc., are available online for 5 years. Since none of the agencies interviewed is providing permanent public access to its products, it was useful to ask two information resources management experts, John Bertot and Charles McClure, to provide some larger context within which the problem can be viewed. Perspectives on Permanent Public Access and Information Life Cycle Management from Information Resources Management Experts Both Bertot and McClure have extensively studied and taught information resources management (IRM). 
They attribute the lack of successful implementation of IRM initiatives in the Federal Government to the following factors: > There is no comprehensive integrated Federal IRM policy; current policies do not adequately address permanent public access, information life cycle, and electronic records management. > There is no strategic vision of IRM by agencies; information is not viewed as a resource that should be used to accomplish agency missions. > Most agency initiatives focus on the technology side of IRM because it is tangible. > Most agencies are targeting their information technology resources toward Y2K efforts. > There is no clear distinction between the role of information resources managers and CIOs. > There is no ongoing training for IRM and CIO staff. (See Appendix H for detailed notes on telephone interviews with Bertot and McClure, and Bertot and McClure, 1997, pp. 280-282.) There are many IRM policy instruments from the Paperwork Reduction Act of 1980 and 1986, OMB Circular A-130 (1985; and 1993 and 1994 revisions) through the Information Technology Management Reform Act (ITMRA) of 1996 and Executive Order 13011 (July 1996). But Bertot and McClure (1998) emphasize that there is still a lack of an integrated policy. For example, in their focus group with IRM managers, Bertot and McClure noted that managers felt that the Paperwork Reduction Act assumed that the managers understood and knew how to manage the information life cycle, but they agreed that agency management at all levels never grasped the concept either in theory or in practice. In addition, the ITMRA that created a position for an agency-based CIO to oversee agency IRM activities and to provide education for agency IRM personnel and agency managers (among other things) does not clarify the relationships between and among CIOs and IRM managers. 
Consequently, it is ambiguous about whether the agency CIO's organization replaces, incorporates, or is separate from current agency IRM functions. Given this larger context, it is not surprising that IRM issues such as information life cycle management, preservation, and permanent public access have not been adequately addressed. Conventional organizational barriers such as size, culture, poor communication and interaction across and within agencies, and lack of ongoing, strategic training for IRM and CIO staff may exacerbate these challenges faced by agencies (telephone interview with Bertot, Appendix H). (As an example, McClure states that IRM graduate students' degrees are useful for about 1-2 years after they graduate. After that, their skills are 50 percent out of date; telephone interview with McClure, Appendix H.) Several experts are involved in initiatives that address some of these important IRM issues. Current Initiatives on Permanent Public Access and Permanent Retention Several agencies, organizations, and Federal depository libraries with partner institutions are exploring ways to address the problems of permanent public access, preservation, and electronic records management. Appendix H contains more detailed information about each of the initiatives summarized here. Abby Smith from the Council on Library and Information Resources (CLIR), and Evelyn Frangakis from the National Agricultural Library (NAL) are supporting research and testing models for permanent public access and preservation. The three CLIR initiatives are described below: > A commissioned report by Jeff Rothenberg from RAND Corporation on emulation. The report has been completed and was published in January 1999. The report describes the weaknesses of migration and the strengths of emulation and sets up a research agenda to develop emulation. 
(Log onto publications on the CLIR site for a summary of Rothenberg's report: http://www.clir.org/pubs/reports/rothenberg/contents.html.) > A commissioned analysis of migrating file formats to support a risk assessment associated with those file formats during migration. The study by Cornell University, using data from the Mann (agricultural) Library, will use numeric file formats and databases and text formats. The report, to be finished by September 1999, will include analysis and a template that others can use for doing a risk assessment of migration of those file formats. > CLIR is working with John Ockerbloom, a computer scientist at Carnegie Mellon University (CMU) who has developed a system of file conversion called TOM (Typed Object Model). (See www.cs.cmu.edu/afs/cs.cmu.edu/user/spok/www/defense/index.html.) CLIR would like to see if they can bring his concepts into fuller application. Smith describes NAL and its efforts to provide permanent public access and to preserve agricultural literature as one of few examples where a failsafe archives might work, partly because NAL is a national library dedicated to one type of literature. Evelyn Frangakis is involved in NAL's efforts to develop its own preservation program that includes a traditional preservation program and digital efforts. Their digital efforts are two-pronged: > Conversion of brittle paper materials into digital products by working with the best available guidelines to implement good preservation practices. They will make this digital material available on the web. > Development of a program to preserve USDA digital materials (i.e., materials that are born digitally). In addition, Frangakis is also involved in a national effort to preserve agricultural literature. The U.S. 
Department of Agriculture's Digital Publications Preservation Steering Committee was established in 1998 to oversee the implementation of the plan, A Framework for the Preservation of and Permanent Public Access to USDA Digital Publications. This group met for the first time in October 1998. The plan may serve as a model that other agencies or institutions can adapt. USDA is incorporating the following needs and considerations into its framework: > Inventory and life cycle management, > Technical requirements, and > User access and retrieval. USDA is moving ahead to implement the plan. The USDA CIO accepted the report, and under Frangakis' guidance, NAL established a national steering committee made up of representatives from USDA and from agribusiness, the research library community, the U.S. Agricultural Information Network (USAIN), Federal partners, etc. The group will meet on a quarterly basis for the first 2 years. They will establish test groups to explore the technical and funding issues. They are hoping to secure funding for a pilot project to test the framework on an agency within USDA to see how manageable it will be for full-scale implementation (see Appendix H for a detailed description of the Framework). Finally, GPO has established partnerships with several depository libraries and Federal agencies to provide permanent public access to remotely accessible electronic Government information products. Three such partnerships include: > Partnership with the University of Illinois at Chicago's Richard J. Daley Library and the U.S. Department of State (DOS) to provide permanent access to remotely accessible electronic DOS information products. > An Online Computer Library Center/GPO pilot project with the U.S. Department of Education/National Library of Education (NLE) provided free public access through the FDLP to remotely accessible electronic Educational Resources Information Center (ERIC) documents. 
> A project with the Department of Energy (DOE) Office of Scientific and Technical Information (OSTI) to provide public and depository library access to DOE technical reports in image format via the web service called "DOE Information Bridge" (Aldrich, 1998). Preservation specialists Smith and Frangakis noted that technology is not the biggest barrier to permanent access and preservation; the human infrastructure is not in place yet that would ensure permanent access and preservation (telephone interview with Smith and Frangakis, Appendix H). The plans and initiatives described here, coupled with the recommendations for training, policy integration, and support for best practices to implement policies, are a few of the strategic actions that appropriate agencies, libraries, and institutions should undertake to ensure that future generations will have unrestricted, no-fee access to Government information in all formats. Next Steps As a followup effort, NCLIS indicated that they will use these findings as a point of departure and analyze them in greater depth. It is expected that this followup effort will result in broad conclusions and recommendations to the President and Congress about how the problems and challenges revealed in this study can be constructively addressed. Bibliography Books and Articles Achenbach, Joel. 1999. The too-much information age: Today's glut jams libraries and lives, but is anyone wiser? Washington Post, March 15, p. A01. Adler, Prudence S. 1996. Federal information dissemination policies and practices: One perspective on managing the transition. Journal of Government Information 23, 4:435-441. Adler, Prudence S. 1998. The times they are a changin' for our depository libraries. Managing Technology, The Journal of Academic Librarianship, September. Aldrich, Duncan. 1998. Partners on the Net: FDLP partnering to coordinate remote access to Internet-based government information. Government Information Quarterly 15, 1:27-38. 
[This issue of Government Information Quarterly is devoted entirely to a symposium on Federal depository libraries. The issue was edited by John A. Shuler and Gary Cornwell.] Beachboard, John C. 1997. Assessing the Information Technology Management Reform Act from a bureau's perspective. Government Information Quarterly 14, 3:291-311. Bertot, John Carlo. 1997. The impact of federal IRM on agency missions: Findings, issues, and recommendations. Government Information Quarterly 14, 3:235-253. Bertot, John Carlo, and Charles R. McClure. 1997. Key issues affecting the development of federal IRM: A view from the trenches. Government Information Quarterly 14, 3:271-290. Bertot, John Carlo, Charles R. McClure, William E. Moen, and Jeffrey Rubin. 1997. Web usage statistics: Measurement issues and analytical techniques. Government Information Quarterly 14, 4:373-395. Bryan, Martin. 1998. An introduction to the extensible markup language (XML). Bulletin of the American Society for Information Science 25, 1:11-14. Culshaw, Stuart. 1998. SGML, HTML, and XML: Sorting out the puzzle. Information Standards Quarterly 10, 2:6-8. Depository Administration Branch, Library Division. Library Programs Service. 1997. List of classes of United States Government publications available for selection by depository libraries. Washington, DC: U.S. Government Printing Office. Dugan, Robert E., and Ellen M. Dodsworth. 1994. Costing out a depository library: What free government information? Government Information Quarterly 11, 3:285-300. General Accounting Office. 1988. Federal information: Agency needs and practices. GAO/GGD-88-115FS. Washington, DC: General Accounting Office. Gorman, G. E., and R. H. Miller. 1997. Collection management for the 21st century: A handbook for librarians. Westport, CT: Greenwood Press. Hernon, Peter. 1994. Information life cycle: Its place in the management of U.S. Government information resources. Government Information Quarterly 11, 2:143-170. 
Hernon, Peter. 1994. Discussion forum: A time of change. Government Information Quarterly 11, 2:137-142. Hernon, Peter, and Charles R. McClure. 1993. Electronic U.S. government information: Policy issues and directions. In Annual Review of Information Science and Technology, vol. 28, edited by Martha E. Williams. Medford, NJ: American Society for Information Science. Marcum, Deanna B. 1996. The preservation of digital information. The Journal of Academic Librarianship 22, 6:451-454. McClure, David L. 1997. Improving Federal performance in the information era: The Information Technology Management Reform Act of 1996. Government Information Quarterly 14, 3:255-269. Okay, John, and Roxanne Williams. 1993. Interagency workshop on public access: A summary for historical purposes. Government Information Quarterly 10, 2:237-253. Radack, Shirley M. 1994. The Federal Government and information technology standards: Building the national information infrastructure. Government Information Quarterly 11, 4:373-386. Ryan, J., Charles R. McClure, and R. T. Wigand. 1994. Federal information resources management: New challenges for the nineties. Government Information Quarterly 11, 3:301-314. Ryan, Susan M. 1996. Downloading democracy: Government Information in an electronic age. Cresskill, NJ: Hampton Press, Inc. Schwartz, Candy, and Mark Rorvig, eds. 1997. Digital Collections: Implications for users, funders, developers and maintainers. Proceedings for the 60th American Society for Information Science Annual Meeting, 1997, vol. 34. Washington, DC, November 1-6. Medford, NJ: Information Today, Inc. Tennant, Roy. 1999. Beyond GIF and JPEG: New digital image technologies. Library Journal (February): 111-112. Turock, Betty J., and Carol C. Henderson. 1996. 
A model for a new approach to federal government information access and dissemination. Journal of Government Information 23, 3:227-240. Uhlir, Paul. 1997. Framework for the preservation of and permanent public access to USDA digital publications. Washington, DC: National Academy of Sciences, National Research Council. U.S. Code 44: 1901-1916 Title 44-Public Printing and Documents. Chapter 19, Depository Library Program. U.S. Congress. 1988. Informing the nation: Federal information dissemination in an electronic age. LCCN 88-600567. Washington, DC: Office of Technology Assessment. U.S. Congress. House of Representatives. Congressional Record, House. H11245. October 19, 1998. U.S. Department of Commerce. National Telecommunications and Information Administration. United States Advisory Council on the National Information Infrastructure. 1996. A nation of opportunity: Realizing the promise of the information superhighway. Washington, DC: U.S. Government Printing Office. U.S. Government Printing Office. 1995. Biennial report to Congress on the status of GPO access: A service of the U.S. Government Printing Office. Washington, DC: U.S. Government Printing Office. U.S. Government Printing Office. 1996. Report to the Congress. Study to identify measures necessary for a successful transition to a more electronic Federal Depository Library Program. As Required by Legislative Branch Appropriations Act, 1996 Public Law 104-53. Washington, DC: U.S. Government Printing Office. U.S. Government Printing Office. Superintendent of Documents. Library Programs Services. 1998. Managing the FDLP electronic collection. Washington, DC: U.S. Government Printing Office. Web Sites Acrobat in Action. The Acrobat info source on the web. http://www.purepdf.com/action/gov.html. Accessed February 24, 1999. Guidelines for Accessible Web Pages. Microsoft, accessibility and disabilities. For developers, writers and designers: For web page designers. 
http://www.microsoft.com/enable/dev/web_guidelines.htm. Accessed February 24, 1999. Introduction to Accessible Web Pages. Microsoft, accessibility and disabilities. For developers and authors: For web page designers. http://www.microsoft.com/enable/dev/web_intro.htm#tools. Accessed February 24, 1999. McClure, Charles. Guidelines for electronic records management on state and federal agency websites. In Analysis and development of model quality guidelines for electronic records management on state and federal websites. Chapter 6. http://istweb.syr.edu/~mcclure/nhprc/nhprc_chpt_6.html. Accessed February 24, 1999. National Academy of Science's Computer Science and Telecommunications Board (CSTB). Review in which CSTB developed a detailed statement of work that defined the data collection process required to conduct the assessment. http://www.nclis.gov/info/gpo1.html. Accessed February 24, 1999. National Archives and Records Administration Guidelines for Digitizing Archival Materials for Electronic Access. http://www.nara.gov/nara/vision/eap/eapspec.html. Accessed February 24, 1999. National Commission on Libraries and Information Science. 1998. National Commission on Libraries and Information Science, government information product assessment questionnaire. http://www.nclis.gov/news/nclisqux.pdf. Accessed March 15, 1999. Ockerbloom, John. 1998. Mediating among diverse data formats. Thesis defense. http://www.cs.cmu.edu/afs/cs.cmu.edu/user/spok/www/defense/index.html. Accessed March 15, 1999. Rothenberg, Jeff. 1998. Avoiding technological quicksand: Finding a viable technical foundation for digital preservation. http://www.clir.org/pubs/reports/rothenberg/contents.html. Accessed March 15, 1999. The Federal Web Locator. This site is a service provided by the Villanova Center for Information Laws and Policy and is intended to be the one site to locate federal government information on the World Wide Web. 
http://www.law.vill.edu/fed-agency/fedwebloc.html. Accessed February 24, 1999. U.S. Department of Education Guidelines. (The Federal Consortium Guidelines are based on the guidelines set by the Department of Education.) http://www.ed.gov/internal/wwwstds.html. Accessed February 24, 1999. U.S. Department of Health and Human Services. Information resources management policy. http://www.hhs.gov/policy/irm-pol.html. Accessed February 24, 1999. U.S. Department of Health and Human Services. World Wide Web applications and the Internet. Best practices and guidelines. http://www.hhs.gov/progorg/oirm/bestguid.html. Version as of May 26, 1998. U.S. Environmental Protection Agency. Information resources management (IRM) policy, standards, guidelines and planning documents. http://www.epa.gov/irmpoli8/. Accessed February 24, 1999. U.S. Government Printing Office, Library Programs Service. Managing the FDLP electronic collection: A policy and planning document. October 1, 1998. http://www.access.gpo.gov/su_docs/dpos/ecplan.html. Accessed March 25, 1999. U.S. Government Printing Office. 1996. Final report to Congress: Study to identify measures for a successful transition to a more electronic Federal Depository Library Program. http://www.access.gpo.gov/su_docs/dpos/fdlppubs.html#4. Accessed March 15, 1999. Uncle Sam Migrating Government Publications. Government Publication Department. Regional Depository Library, University of Memphis. Last updated September 2, 1998. http://www.lib.memphis.edu/gpo/mig.htm. Accessed February 24, 1999. WWW Federal Consortium Guidelines. 1996 revised guidelines. http://www.dtic.mil/staff/cthomps/guidelines/. Accessed February 24, 1999.