USGS Data Accessibility, Standards, and Quality Control for Imagery on CD-ROM

Carl Abston

Good morning, ladies and gentlemen. I am Carl Abston from the USGS in Denver, Colorado. This morning I want to discuss photo archives. I will show a few high-resolution photos, which require a high-resolution projection system. I enticed a vendor, Network Spectrum (their booth is located with the other vendor exhibits), to set up a $14,000 system for this presentation.

The USGS has two primary photo archive libraries, one in Denver and the other in Hawaii. The Denver library has 400,000 photos and the Hawaii photo library holds another 250,000. These photos date back to the 1850s, which is before the establishment of the USGS. In fact, three surveys of the western U.S. were collecting photographic information prior to Dr. John Wesley Powell's historical survey. There are photo pairs showing glaciers and river valleys 100 years ago and the same glaciers and river systems today. The libraries have photos of earthquake damage in the 1800s as well as of all the twentieth-century earthquake disasters, and of other disasters both national and international. In fact, they have photos of every described geological feature throughout the world, and so on.

Although it may sound like it, this is not meant to be a commercial for our library system, but rather to acquaint you with what we have today and what we will lose tomorrow. When people say that a picture is worth a thousand words, well, that is an understatement; a photo from the 1800s can be worth its weight in gold. To lose these data is to lose part of our scientific and historical heritage. These data are simply irreplaceable.

There are two fundamental problems:

1. ACCESS
2. PRESERVATION

ACCESS
_______

The Denver Photo Library has 300,000 photos indexed on 3x5 inch cards, and the other 100,000 are not indexed at all. There is no automated index.
To access these photos and select something of interest, it is necessary to come to the Denver Photo Library and spend a considerable amount of time browsing through 3x5 inch cards and thousands of photos.

PRESERVATION
____________

Although the photos are stored in reasonably good environmental conditions, there are fundamental problems. For example, 50,000 photos are still stored on nitrate film, and the majority of these have not been duplicated because of a lack of resources. It is clear that they will self-destruct; it is not clear exactly when that will occur. The majority of the photos are stored in one location without any duplicate copies. There was a fire in the 1970s which destroyed many of these one-of-a-kind photos, many of which had never been seen by the public, by historians, or by scientists. Those data are lost forever.

It is interesting that everyone understands that photos deteriorate with age, and yet we totally ignore this fact. We build photo libraries, provide scant resources, and sleep well at night in the belief that all is well.

Photographs deteriorate with age. You can store them under ideal or hostile conditions. You can speed up or slow down the aging process, but you cannot stop it. Black and white photos have been remarkably stable; however, the newer color imagery is much less stable. For example, Kodachrome shot during the 1960s, unless stored under ideal conditions, is generally washed out and hardly useful today.

In order to preserve these collections as long as possible, the photo libraries continually make duplicates from originals, and then duplicates from duplicates, and so on. With each duplicate generation, quality and content are lost. In fact, after several generations the quality becomes so poor that the photo is no longer of value. In other words, it is physically impossible to preserve photos for centuries and maintain quality and content using our present analog methods.
WHAT CAN BE DONE
________________

Several years ago I began investigating these problems and some possible solutions. If we can't preserve photos in the traditional way (paper, film) over a long time period, could we preserve them digitally? Well, we know the answer is yes, but can we preserve them in a useful quality at a reasonable price, and would the public accept them?

In 1992, we introduced a CD-ROM, DDS-8 (Digital Data Series), containing 550 photos from the Denver photo library. This was the first such CD-ROM produced by the USGS.

In order to produce such a CD, it was necessary to acquire some hardware. We chose a Hewlett Packard 2C flatbed scanner ($1600 at the time), and ran color slides through a Canon color copier for scanning. We also had to develop a software engine that could access and display both images and captions.

Our decision was not only to store the digital representation of each photo in high resolution, but also to store thumbnails. This would allow the user a quick way of browsing hundreds of photos. We stored the caption text in ASCII along with a search and display engine. All photos were stored in a standard, internationally accepted image format (PCX) which was usable by all commercial image viewing and processing packages. We used no compression. Using PCX image files and ASCII text permitted access from platforms other than the PC.

What did we find out from this first exercise? The quality of the black and white photos in digital form was good, good enough for archiving, distribution, and future generations. But the color slides, which were reproduced onto paper through a Canon copier and then scanned, were of such poor quality that they were unacceptable for any serious use.

The CD master was produced for less than $1000 and each replica for less than $2. For the first time, it was possible to look at photos without coming to the Denver library.
It was possible to decide to order certain photos in high quality without coming to Denver. It was possible to reproduce 550 photos for less than $4. It was also possible to use these images as clip art, or use them directly in digital publications, or use them in schools (universities and high schools). As we discovered, the possible applications were enormous and they were fully accepted. Although we were learning, and the color quality was poor on this first CD, it was nevertheless one of the most popular CDs produced by the USGS.

In the next four years we perfected this technique. By September 1995, we had introduced DDS-8, DDS-12, DDS-21, DDS-23, and DDS-24, all of which contained photographic data. DDS-21 contained 1650 high-quality photos of geological features in general. DDS-24 contained 470 photos of volcanic activity at Kilauea from 1983 to 1993. The photos from DDS-24 are the best photography I have seen. Indeed, I would encourage you to purchase DDS-21, DDS-23, and DDS-24 as examples of what can be accomplished. Call 303-202-4700. DDS-21 sells for $62 + $3.50 handling, DDS-23 sells for $32 + $3.50 handling, and DDS-24 sells for $32 + $3.50 handling.

During this four-year learning curve, we acquired a Kodak slide scanner for color slides and a Microtek scanner for 8x10 inch film transparencies of large-scale geologic maps. We also learned the art of image processing and optimization. Here we used both the Adobe Photoshop and Aldus PhotoStyler packages.

The first image CD, DDS-8, was an excellent tool for selecting photos to be ordered from the library, whereas the last image CDs, DDS-21 and DDS-24, contain photos which are as good as, and frequently much better than, the originals. All photos on these last CDs were digitally enhanced. The poorer the quality, the greater the enhancement. Indeed, original photos shot in the 1800s show yellowing, cracking, grain, poor contrast, and focus problems. Yet these enhanced digital images look like modern pictures.
And the time to optimize a digital image is 3 minutes or less.

CONCLUSION
__________

We proved we can convert photos of all types and sizes into digital form in a quality satisfactory to both today's users and future generations.

We proved that digital images are as useful in teaching and publishing as conventional photos.

We proved that this conversion can be done economically.

We proved we can duplicate data without loss of quality and distribute digital photographic data economically, worldwide.

The industry has proved that CD-ROM is the longest lasting and most standard medium known to man.

And we proved that CD-ROM is the only archive and distribution medium capable of achieving these results.

OPPORTUNITIES
_____________

This exercise, which was interesting for its own sake, has given us a rare vantage point from which to see the opportunities for the future, opportunities for the 21st century.

For example, the entire Denver photographic library could be scanned into digital form and stored on CD-ROMs for less than half a million dollars (16 staff years). The entire collection would fit onto 800 of today's CDs without compression. The new DVD-type CDs, which will hold 18 GB and will be available within the next year or two, would reduce that number to 31. The thumbnail images would occupy 40 of today's CDs or 2 DVD-type CDs. The 16 staff years necessary to convert the entire collection could be spread over 16 years or completed in 1 year, depending on the number of people working on the project concurrently.

In digital form, the aging stops. The CDs can be duplicated at less than $8,000 per set. Every USGS library could have a set. Indeed, every university library could have a set. A thumbnail set could be duplicated for less than $400. These could be distributed worldwide to every library which has $400.
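
The disc counts quoted above can be checked with a few lines of arithmetic. The Python sketch below is only an illustration. The numbers of photos, CDs, staff years, and dollars are the figures given in this talk. The CD capacity of 680 MB, the decimal reading of 18 GB, and all of the names (CD_MB, discs_needed, estimate) are assumptions introduced here, not part of the original project. Under those assumptions the sketch arrives at the same 31 and 2 DVD-type discs.

```python
import math

# Figures quoted in the talk
PHOTOS = 400_000        # photos in the Denver library
CDS_FULL = 800          # CDs for the full-resolution collection
CDS_THUMBS = 40         # CDs for the thumbnail set
DVD_GB = 18             # stated capacity of a DVD-type disc
STAFF_YEARS = 16        # effort to scan the whole collection
BUDGET_USD = 500_000    # "less than half a million dollars" (upper bound)

# Assumption, not stated in the talk: usable capacity of one CD in MB
CD_MB = 680


def discs_needed(total_mb, disc_mb):
    """Whole discs required to hold total_mb megabytes."""
    return math.ceil(total_mb / disc_mb)


def estimate():
    """Derive per-photo and per-disc figures from the quoted totals."""
    full_mb = CDS_FULL * CD_MB        # implied size of the full collection
    thumbs_mb = CDS_THUMBS * CD_MB    # implied size of the thumbnail set
    dvd_mb = DVD_GB * 1000            # assumption: 1 GB = 1000 MB
    return {
        "mb_per_photo": full_mb / PHOTOS,
        "dvds_full": discs_needed(full_mb, dvd_mb),
        "dvds_thumbs": discs_needed(thumbs_mb, dvd_mb),
        "photos_per_staff_year": PHOTOS / STAFF_YEARS,
        "usd_per_photo": BUDGET_USD / PHOTOS,
    }


if __name__ == "__main__":
    for name, value in estimate().items():
        print(f"{name}: {value}")
```

The result is sensitive to the capacity assumed: with 650 MB per CD the same arithmetic gives 29 DVD-type discs rather than 31.
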
Finally, 150 years of geological photographic history and scientific data could be saved for the future but made available today for less than $8,000 to everyone, including scientists, publishers, broadcasters, and yes, even you or me. With the availability of the new DVD CD-ROM technology, we can expect cost reductions of 10 to 1.

THE FINAL QUESTION
__________________

The final question really is DO WE WISH TO PRESERVE OUR SCIENTIFIC AND HISTORICAL HERITAGE, rather than can we preserve our scientific and historical heritage.

Thank you, ladies and gentlemen.