CENDI PRINCIPALS AND ALTERNATES MEETING

National Library of Medicine
Bethesda, MD
April 6, 1999

Minutes

Riding the Federal R&D Roller Coaster: FY00 Budget Analysis and AAAS STI Policy Initiatives
U.S. Imagery and Geospatial Information System (USIGS) Architecture Overview
National Library of Medicine (NLM): Current Activities

WELCOME

Mr. Pedtke began the meeting at 9:10 a.m. He thanked NLM for hosting the meeting. Introductions were made, and a special welcome was extended to Dr. Shepanek, Principal for the newest CENDI agency, the U.S. Environmental Protection Agency.

Riding the Federal R&D Roller Coaster: FY00 Budget Analysis and AAAS STI Policy Initiatives
Al Teich, Director, Science and Policy Programs, American Association for the Advancement of Science

Dr. Teich provided highlights from the R&D in the FY2000 Budget report (http://www.aaas.org/spp/dspp/rd/xxiv/contents.htm) and the FY2000 Budgets Project Cuts to Federal R&D: AAAS Analysis of Outyear Projections for R&D in the FY2000 Budget (http://www.aaas.org/spp/dspp/rd/outyr00p.htm), which are prepared by the AAAS each year. The full report will be the basis for the AAAS Science and Technology Policy Colloquium on April 14-16, 1999.

Dr. Teich reviewed a number of different cuts of budget trends and projections to provide an overall perspective on the federal R&D budget. He provided a series of long-term trends; a closer look at the FY00 budget proposals; projections for out-years in perspective; and some ideas on what to expect from the current Congress.

An analysis of the current federal R&D budget situation paints a complicated picture, with some agencies still rebounding from the 1995 budget cuts. Although the balanced budget agreement has spending caps until 2002, this does not show the full picture. Increased R&D spending has been possible because the discretionary spending caps are being exceeded through complex budget processes including emergency spending bills. For example, the cap was exceeded by $20B in FY99. This difference has allowed for some large increases in R&D spending for NIH and other agencies. It is unlikely that discretionary spending caps will be eliminated, but exceeding them is likely to continue.

Looking at the overall picture of R&D spending from 1953-1998, including government and industry, federal R&D is a shrinking part of the national picture. Federal spending on defense R&D and on non-defense R&D is actually handled as separate pieces of the allocation process. This is not a trade-off area. Under the Reagan Administration, the proportion was nearly 70 percent defense, 30 percent civilian. Under the Clinton Administration, it has come back to 50:50. Dr. Teich looked at trends in non-defense R&D by function. The trends show that research priorities follow national priorities, such as the emphasis on space sciences in the 1960s, the energy crisis in the 1970s, and the aging population and health issues in the 1990s. Similarly, there are trends by discipline, with the life sciences getting the large increases in recent years. The mid-term historic perspective (1994-99) shows dramatic changes due to the balanced budget frenzy. There were big cuts in 1995-96. Since then, as the budget picture settles, there have been new increases. NIH has been the big winner; NSF is doing well; DoD is down.

Turning to the current situation, the overall proposed FY00 budget from the Administration is $1,798B. The R&D portion of the budget is still quite small at approximately 11-12 percent, or $77.9B. Six agencies get 94 percent of the R&D dollars. All others get only about 6 percent. DoD's portion is about 44 percent. With DOE defense-related programs included, the percentage for defense R&D spending increases to approximately 50 percent. The Administration has achieved its goal of reducing defense R&D and increasing non-defense R&D to achieve a 50/50 split.
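The shares quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the variable and function names are hypothetical, and the inputs are the figures as recorded in these minutes. Taken at face value, $77.9B is about 4.3 percent of the $1,798B total, so the 11-12 percent figure presumably refers to a narrower base (for example, discretionary spending); that reading is an assumption, not something stated in the presentation.

```python
# Figures as recorded in the minutes, in billions of dollars.
TOTAL_BUDGET_B = 1798.0  # Administration's proposed FY00 budget
RD_BUDGET_B = 77.9       # R&D portion of that budget


def share_percent(part, whole):
    """Return `part` expressed as a percentage of `whole`."""
    return 100.0 * part / whole


def dollars_from_share(percent, base):
    """Return the dollar amount that `percent` of `base` represents."""
    return base * percent / 100.0


# R&D as a share of the total proposed budget.
rd_share_of_total = share_percent(RD_BUDGET_B, TOTAL_BUDGET_B)

# Dollar amounts implied by the quoted percentage shares of R&D.
dod_rd_b = dollars_from_share(44, RD_BUDGET_B)      # DoD, about 44 percent
top_six_rd_b = dollars_from_share(94, RD_BUDGET_B)  # six largest agencies
others_rd_b = RD_BUDGET_B - top_six_rd_b            # all other agencies

print(f"R&D share of total budget: {rd_share_of_total:.1f}%")
print(f"DoD R&D: ${dod_rd_b:.1f}B")
print(f"Six largest agencies: ${top_six_rd_b:.1f}B")
print(f"All other agencies: ${others_rd_b:.1f}B")
```

The dollar amounts are derived from rounded percentages, so they should be read as approximate.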

In the Administration's proposed budget for FY00, NIH R&D would be up a little, DoD would be down about 5 percent, NASA would remain about even, and DOE would be up approximately 7 percent. USDA is likely to be up by 11 percent, but that figure includes some spending held over from 1999 into Year 2000. NSF may be up about 6 percent.

However, if these figures are compared to 1994 levels, most agencies are just regaining what was lost in the 1995 budget cuts. Other than NIH and NSF, the majority of agencies are either even with or still slightly below 1994 levels in 1999 dollars. In total, if you do not count the NIH increases, non-defense R&D in 1999 is down about 5 percent.

If the President's proposed figures are projected out to FY04, the picture does not improve. AAAS estimates that defense R&D would fall approximately 14 percent, based both on budget cuts and adjustment for inflation. This is despite an increase in overall defense spending. Civilian agency R&D would also be down approximately 6 percent, according to the AAAS estimates. However, Dr. Teich indicated that these are projections, rather than forecasts. The actual budgets have historically been more than the projections. Many factors come into play that will likely provide relief to the negative picture. Congress is likely to add R&D money as it has in the past.

With regard to basic versus applied research, Dr. Teich estimated that total federal spending for basic research in FY98 was about $15B, of which $1B is conducted in defense. About two-thirds of NIH's work is considered basic research. In the civilian agencies, the order of spending from largest to smallest is basic research, applied research, and development. The order is the opposite for defense spending.

Dr. Teich briefly listed some of AAAS's activities in information policy. These include concern about the application of FOIA to grant research data, database protection, cyberspace censorship, anonymous communications on the Internet, and electronic publishing and archiving. AAAS will be helping NSF to involve scientists in national conversations on the Internet-2 initiative. The goal is to ensure that the Internet-2 meets the needs of the research community. They hope to identify how NSF can be most useful in this process. Dr. Teich suggested that CENDI invite Mark Frankel to talk about some of these policy initiatives in more detail.

Discussion

Members of CENDI asked if there is any indication that a downturn or an upturn in R&D spending at the federal level has any impact on private sector R&D spending. Dr. Teich replied that the perception is that private sector R&D spending is down, but this is probably because many mature companies, such as AT&T, have closed or are closing large R&D installations. However, Microsoft has spent $80M on laboratories. Also very significant is the increase in R&D in new industries and smaller firms. Members asked about the impact of multi-national companies. A recent Commerce Report showed that the U.S. has actually seen a net gain in investment by foreign firms conducting R&D in the U.S. However, it isn't clear exactly what kinds of companies or R&D installations are included in this report. R&D spending as a percentage of sales is largest in the computers and telecommunications sector. However, the actual dollars may still be higher in autos and aerospace. There is clearly less basic research and more short-term R&D.

There is no good way to tell if decreases in federal R&D are offset by private sector R&D spending. Even if numbers could be gathered correctly, the private sector does not segment its activities in the same way that the government does. However, one study suggests that private sector R&D appears to follow behind the government with a time lag of two years or more. While there are tax credits for R&D spending, they are unlikely to make any major difference in spending on the part of the private sector. Private sector decisions are based mainly on competition and market factors, rather than tax credits. It should be noted that most federal R&D money goes outside the government through grants, agreements, or partnerships to universities, private organizations, and local governments, where it helps to fuel the economy.

The group discussed the division of R&D budgets among the agencies. Dr. Teich noted that R&D spending reflects national priorities. Two-thirds of the total civilian R&D in the 1960s was spent on aerospace during the race to put a man on the moon. It is also possible to see the rise and fall of energy concerns during the 1970s. With the graying of America, there has been a steady increase in health related spending. This is now about 40 percent of the non-defense R&D spending. The highest expense by discipline is in the life sciences. It was noted that R&D in the life sciences is not just conducted at NIH, but within Defense, Agriculture, EPA, Interior, etc.

U.S. Imagery and Geospatial Information System (USIGS) Architecture Overview
Clyde Housel, Systems Design Engineer, National Imagery and Mapping Agency (NIMA)

NIMA was formed through the consolidation of the Defense Mapping Agency (DMA), the Central Imagery Office (CIO), the Defense Dissemination Program Office (DDPO) and the National Photographic Interpretation Center (NPIC) as well as the imagery exploitation and dissemination elements of the Defense Intelligence Agency (DIA), the National Reconnaissance Office (NRO), the Defense Airborne Reconnaissance Office (DARO) and the Central Intelligence Agency (CIA). Its mission is to collect, process and archive imagery and geospatial information. NIMA purchases images from commercial providers and accepts images and maps from government agencies. NIMA stores and disseminates this information, exploiting it for geospatial and intelligence data. Public access is available for unclassified information. Approximately 260 organizations with imagery and GIS needs use NIMA information directly. There are approximately 6,000 users who access NIMA indirectly.

In response to the merger of these variant systems, NIMA is developing the United States Imagery and Geospatial Information System (USIGS) Architecture. In this context, the term "architecture" is defined as "the structure of components, their relationships, and the principles and guidelines governing their design and evolution over time." This architecture will guide the evolution of the legacy systems into the future, ensuring integration and interoperability. The architecture provides a framework including a conceptual model upon which local logical and physical models can be developed.

The genesis of the architecture is NIMA's plan in response to the Information Technology Management Reform Act (ITMRA) of 1996. The ITMRA required federal agencies to develop and implement information technology architectures that ensure alignment of the information system requirements with the agency's missions and goals, adequate interoperability, redundancy and security, and the application and maintenance of standards by which the agency evaluates and acquires new systems. The NIMA ITMRA plan requires compliance with the Command, Control, Communications, Computers, Intelligence, Surveillance, and Reconnaissance (C4ISR) program. The C4ISR Architecture Framework provides guidance on how to describe an architecture. However, it does not provide guidance on how to design and implement specific architectures. The goal of the C4ISR framework is to standardize system descriptions so that it is possible to see how the systems interface.

The architecture is driven by several key elements. The architecture must provide seamless access to tailorable imagery, imagery intelligence and geospatial information. Information must be available on very short timelines and at the lowest possible classification level. The best available information technology must be used, with an emphasis on the use of commercial products. The architecture is also driven by the need to prepare for future imagery architectures, to eliminate functional duplication and maximize the reuse of components, and, ultimately, to save time and money.

The USIGS Architecture is distributed. It provides, through interoperable systems and metadata, access to imagery and mapping information across the partner organizations, and it integrates these remote sources seamlessly with the customer's own libraries. For example, the Command Imagery Library at the National Air Intelligence Center (NAIC) will be one of the customer libraries. The development of imagery product libraries will be done in smaller groups. Access is through a common application programmer interface (API). The architecture presents information about the missions to be accomplished, the operational elements to which the missions are assigned, the nodes where these elements are located, the tasks and activities to be performed, the information flows, the systems that support the accomplishment of the tasks and the defined information flows, system attributes, performance measures, and the technical criteria that govern the development and implementation of the systems.

The architecture is compliant with the C4ISR architecture descriptions. The design is separated into the Operational Architecture, which identifies operational relationships and information needs; the Systems Architecture, which relates system capabilities and characteristics to operational requirements; and the Technical Architecture, which prescribes standards and common practices. The final component is the Conceptual Data Model, which ties all the pieces together. The Conceptual Data Model defines the data content and the interrelationships of the content. Each of these components is reflected in specific sections of the architectural documentation.

The conceptual data model provides the framework for development of logical data models, and ultimately physical data structures. The conceptual data model provides common terms of reference (including common names), describes the logical relationships among the information elements, standardizes the definitions, and defines the current boundaries of the information content. It holds common data elements and can be extended as necessary to meet local needs for additional elements. Current data layers include geospatial, imagery overlays, imagery intelligence, other intelligence information, weather, and user defined layers. Only the first three layers are populated by the Imagery and Geospatial Community. The first three layers include a variety of information and data types, including geophysical, hydrological, features, etc.
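The layered structure described above can be pictured with a small sketch. This is not the USIGS data model itself; it is a minimal illustration, assuming hypothetical class and method names, of a set of named layers holding common elements that a local site can extend. The layer names and the element types "hydrological" and "features" come from the briefing; the element "local_survey_marker" is invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """One named layer of the model (hypothetical structure)."""
    name: str
    # True if the layer is populated by the Imagery and Geospatial Community.
    community_populated: bool = False
    elements: set = field(default_factory=set)


@dataclass
class ConceptualDataModel:
    """A collection of layers keyed by name (hypothetical structure)."""
    layers: dict = field(default_factory=dict)

    def add_layer(self, name, community_populated=False):
        if name in self.layers:
            raise ValueError(f"layer already defined: {name}")
        self.layers[name] = Layer(name, community_populated)
        return self.layers[name]

    def extend_layer(self, name, *elements):
        """Add elements to an existing layer, e.g. to meet local needs."""
        self.layers[name].elements.update(elements)


def build_model():
    model = ConceptualDataModel()
    # The first three layers are populated by the community.
    for name in ("geospatial", "imagery overlays", "imagery intelligence"):
        model.add_layer(name, community_populated=True)
    for name in ("other intelligence information", "weather", "user defined"):
        model.add_layer(name)
    return model


model = build_model()
# Element types mentioned in the briefing.
model.extend_layer("geospatial", "hydrological", "features")
# Hypothetical local extension, invented for this example.
model.extend_layer("user defined", "local_survey_marker")

community_layers = [l.name for l in model.layers.values() if l.community_populated]
print(community_layers)
```

Keeping the common layers fixed and allowing extension only by adding elements mirrors the idea that the conceptual model supplies shared terms of reference while local logical and physical models add what they need.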

NIMA currently provides paper products that are predefined in their scope and content. The new system will allow users to extract specific information on a smaller topographic area, drilling through the layers to identify more precise information. This will provide more customized products.

Future, more specialized briefings can be given on the individual components of the USIGS Architecture if there is interest within CENDI. A list of points of contact for the various parts of the architecture was provided. A CD that includes the actual data model and unclassified products related to this project will be available soon. Some of this information is also available at www.nima.mil.

Discussion

Ms. Carroll noted that there are other groups working on standards within the geospatial community. Mr. Housel indicated that NIMA is involved with the Federal Geographic Data Committee efforts. Ms. Carroll also asked about the degree of change in the data model. Mr. Housel indicated that the layers in the data model are growing over time and that the conceptual model changes as new layers are identified.

National Library of Medicine (NLM): Current Activities
Dr. Donald Lindberg, Dr. Elliot Siegel, Julia Royall, National Library of Medicine

These are changing times for the National Library of Medicine (NLM). There are many new areas of relevant knowledge, new populations to be served, and new dissemination methods. The life sciences are changing, with an ever-increasing amount of literature in genetics involving non-textual material such as molecular structures and sequence data. The biological sequences are interrelated, and the linkages between them provide very powerful research information. MEDLINE now serves as the metadata for these structures. This system is supported by the scholarly journals that require deposit of the sequence data with GenBank at the National Center for Biotechnology Information (NCBI).

PubMed (http://www.nlm.nih.gov/pubmed/) has grown substantially. There are now 339 journals that are directly linked from MEDLINE to the full text articles. Access by any given user depends on the arrangement between the user or his/her organization and the publisher. Machine input for MEDLINE metadata is growing. NLM is looking to obtain the metadata from the publishers in SGML, from optical scanning and character recognition, and from NLM data entry.

Another exciting project is the National Cancer Institute's Cancer Genome Anatomy Project. This project deals with gene expression and the changes that occur at the cellular level. Researchers have found that oncogenes are present before birth and are generally shut down; i.e., they are not expressed and therefore do not produce protein. However, various cancers may be related to the fact that some genes start to function again. The array data on gene expression are being captured. They are coming in from all over the world and are being stored at NCBI.

The Unified Medical Language System (UMLS) (http://www.nlm.nih.gov/research/umls/umlsmain.html) and the Metathesaurus continue to grow. The 1999 version will have 626,000 concepts, over 50 vocabularies, and 1.3M terms. There are about 850 institutional users.

The Visible Human data (http://www.nlm.nih.gov/research/visible/visible_human.html/) continue to be used in novel ways. Dr. Lindberg showed a number of applications, including surgical training and support for the identification of colo-rectal polyps by comparing a patient's spiral CAT scan with normal colo-rectal structures. NLM will be collaborating with the NIH's Eye Institute, the Deafness Institute, and the Dental Research Institute on a detailed head and neck view of the Visible Human. There are approximately 1,000 customers worldwide, with 1,200 application licenses for the Visible Human.

As part of the G-7 Group on Health Care, NLM has been involved in health-related projects internationally. The most interesting has been the development of Smart Cards. France and Germany are well ahead of the U.S. in the development and application of these cards. However, there is now a pilot project with the Western Governors' Association. The Smart Card not only handles public welfare benefits but also includes family immunization history. The HPP Card is being tested in Nevada, Wyoming and North Dakota with over 30,000 readers deployed. The results show that the holders of the cards value them.

Since MEDLINE (http://www.nlm.nih.gov/databases/freemedl.html) has been made available on the Internet, the number of searches has increased from 7M to 75M and is expected to reach 140M in FY99. The public usage has increased from less than 1 percent to an anticipated 34 percent. Building on this public usage, NLM's MEDLINE Plus is a new service that caters to patients, families and the public. This service brings together the results of pre-defined searches of MEDLINE on topics of interest, such as diabetes and AIDS. Diabetes is currently the most requested topic. It also includes links to related web sites. However, this project has raised an interesting question on how to deal with medical sites with advertising.

In connection with the MEDLINE Plus service (http://www.nlm.nih.gov/medlineplus/), NLM and its Regional Medical Library Network are investigating the degree to which public libraries are involved in disseminating health information. The question is, "does the public bring health concerns to public libraries?" If so, what questions do they bring, what type of answers do they get, and how can NLM help? The study is being conducted at 39 libraries and 219 sites in the New York City, Houston and Baltimore regions. The public libraries have been teamed with medical libraries in the same geographic region, and a Medical Library Public Advisors Group is being formed. The public libraries serve as "listening posts". On recent visits to public library sites in New York City, Dr. Lindberg and Mr. Smith were encouraged by the dedication of the public library staffs and the quality of the reference services provided. A report will be made available to CENDI at a later date.

NLM continues to be involved in telemedicine projects that address technical issues such as signal processing, information issues related to decision support, and policy issues such as arranging to practice medicine remotely (http://www.nlm.nih.gov/research/telemedinit.html/). In addition to the more elaborate satellite communications, NLM has been involved in cable TV and telephone/modem applications. One of these is the Home Health Care visiting nurse application. By having video and audio connections into people's homes, the visiting nurse is able to "visit" many more patients than if he or she were driving to physically visit their homes. Dr. Lindberg also showed examples of diagnostic support provided by medical schools based on information received via telecommunications. In one example, a digital camera was used to photograph a dermatological problem which had gone undiagnosed for some time. When the image was evaluated by a large medical school, the diagnosis was made very quickly.

Dr. Siegel and Ms. Royall then described in more detail the NLM's involvement in the Multi-lateral Initiative on Malaria (MIM) as chair of the Communications Working Group. The goal in this project is to provide the ability for researchers and doctors at 100 malaria research sites in Africa to talk with each other and with colleagues and research institutes internationally. It not only provides for discussions with colleagues, but access to database information and ultimately to the literature.

Dr. Harold Varmus, Director of NIH, participated in a meeting on this topic about two years ago. He was so taken by the problem of insecticides and medicines that no longer work and the death rate of 1M-3M people per year that this has been promoted as a priority concern. The economic impact is significant with 35 percent of Mali's budget being spent in one way or another on malaria.

The issues related to medical communication in Africa vary from country to country. Some countries lack the funds, while others lack the telecommunications infrastructure. In others, it is the impediment of inertia and bureaucratic politics. It is also the case that the basis for communications is different in most of these countries. In the U.S., communication infrastructure is considered an entitlement. However, it is difficult to get this across in other countries where the telecommunications infrastructure is handled in completely different ways. It is important to get researchers and institutes to consider the ongoing communication costs when a project is funded. In NLM's work with MIM, NLM is funding equipment installation and training costs initially in several malaria research sites in Africa. The funder of each site agrees to pay ongoing operational costs. A "champion" on the ground to continue to move the project forward locally is also required before a country is brought into the project.

Ms. Royall indicated that Internet connectivity is increasing in this region. Forty-nine of the 54 African countries have Internet access in the capital cities. Five countries have no local Internet access. Fourteen countries have local ISPs or POPs in some secondary towns, and twelve have local dial-up Internet access nationwide. Three have no Internet access at all.

The technologies that are being used to support the 100 malaria research sites vary from country to country. They include dial-up connections to local ISPs, leased lines with data connections, wireless connections, high frequency radio, and VSAT (Very Small Aperture Terminal) satellites. In some countries, there are regulatory issues to be overcome. Key people in the countries are involved in setting up telecenters throughout the sub-Saharan region and lobbying PTTs for changes in regulations and economics. Mike Jensen provides a wealth of information on this topic (http://www3.sn.apc.org/africa/).

Surprisingly, language has not been a significant barrier in this project. Because this is an area of such deep concern to those involved, they have transcended the parochial issues about language.

Several lessons have been learned to date. If you connect the people, they will get hooked. The decision makers and politicians must be involved as early in the process as possible. It is important to start small and grow, but not to frustrate people along the way. This includes going with the technology that is available at any given point, without skimping on equipment. Ensuring sustainability in terms of staffing and funding is critical. The cost benefits of providing such a telecommunications infrastructure are tangible, but it is important not to forget to price information. Above all, in this project, it is important that the voice of Africa be heard.