ATTACHMENT 1
THE MICROFILM COLLECTIONS

The information in this section is intended to provide an overview of the microfilm which is to be scanned and some problems which the films may present. The camera master negatives and printing masters (or duplicate negatives) which form the Library of Congress master microform collection are held in the Library's microform vaults. They are currently the custodial responsibility of the LC Photoduplication Service. Service copies (positive copies) are available to researchers in the public reading rooms.

1.1 Terminology for Collection Formats

The following terminology is used in this solicitation to describe the collections formats of Library microfilm which will be scanned. The typical level of physical and bibliographic access available for the original materials, which may also appear on the microfilm, is also noted.

Manuscripts

Unique documents like letters or typed reports, typically cataloged by broad collection title, and typically organized by series, subseries, containers and folders. The original documents are physically housed in file folders which are then placed into containers or boxes. Access to these various levels and the individual documents on the microfilm is through printed or machine finding aids or indexes which are often filmed with the materials.

Monographs

Books and pamphlets, typically cataloged as separate entities. Access is through unique, separate monograph records, or sometimes through collections-level records. When available, catalog records often appear on the microfilm.

Serials

Periodicals and journals, including magazines, typically cataloged by title but not by issue or article. When microfilmed, detailed descriptions of the serials, often to the issue level, may be described in collation records or guides to contents of the microfilm. Typically, these appear on the microfilm. Of course, the microfilms may also include cumulative indexes prepared by the publishers of the serials.

1.2 Content of LC Microfilm

Capturing the text and other content of documents contained on the microfilm is the major objective of the NDL Program. However, there is considerable material filmed at the beginning, within, and at the end of a reel of microfilm which serves to inform the viewer of the contents of a reel or a collection, as well as any anomalies or irregularities in the original material which are also reproduced on the microfilm. The Library will provide guidelines regarding what explanatory material and targets are to be scanned for each

Explanatory material - - Preservation microfilm contains explanatory material at the beginning of a reel (termed head-of-reel information) and also at the end of a reel (termed end-of-reel information). The explanatory information at the end of the reel often repeats some of the frames which appear at the head of the reel. These materials include targets (termed eye-legible when they can be read without magnification), overviews, guides to the contents of collections, narrative descriptions, bibliographic information, catalog records or cards, finding aids, copyright information, and other associated information which serves to inform the reader of the microfilm about the content, extent and sequencing of the entire collection or of the material contained on that reel. Generally speaking, head-of-reel and end-of-reel explanatory information will not be scanned.

Technical targets - - The frames at the head-of-the-reel also include technical targets or resolution targets which are filmed to provide a method to measure the resolution or the line pattern resolved on the film. Current practice requires that a resolution target also appear at the end of the reel. However, a review of some of the film being considered for scanning under the NDL Program shows that resolution targets were not routinely filmed.

NOTE: Technical targets may be important when the film receives a pre-scanning analysis, but scanned images of technical targets will not be a part of the final digital collection.

Irregularities targets - - Anomalies and irregularities in the original material which was filmed are noted by filming targets which identify the problem, for example, targets noting that material is missing, that the original is in poor or deteriorated condition, or that there are defects in the original.

1.3 Categories of Microfilm and Filenaming Structure

This solicitation presents five different categories or materials formats which are expected to cover the microfilm materials selected for scanning. These categories are identified not only by format but most importantly by the filenaming systems which have been devised for the resulting digital images. Section C.4. provides detailed information about the filenaming procedures and directory structure which the contractor is required to provide with the images.

A sample frame sequence for each of the five categories which includes head-of-reel and end-of-reel information is included on pages J-7 - J-11 of this attachment.

The categories are:

  1. Manuscripts - Numbered document structure
  2. Manuscripts - Unnumbered document/file folder structure
  3. Monographs - Bibliographic record/print-page number structure
  4. Serials - Serial structure
  5. Copyright and technical document collections - Copyright registration and technical document number structure

1.4 Special Problems--Variations in Filming Practice

As noted above, there is considerable variation in LC Photoduplication Service filming practices because procedures were revised or enhanced over the years and they also differ according to the collections format converted to film (monographs, serials, manuscripts). Examples of complex microfilm frames are illustrated on pages J-15 and J-16. Explanatory information appearing on the film and the technical specifications used in filming can vary from collection-to-collection and from reel-to-reel. However, there is also much consistency in the microfilm, if the original material was also consistent in size and content and when established filming practices were used.

Due to variance in the film, it is likely that adjustments will be required routinely at scanning so that the information in the microfilm frames can be successfully captured and the Library's filenaming requirements are accurately completed. It shall be essential for contractor staff to fully evaluate the selected microfilm and recognize the features and characteristics which are discussed in Section C.4.3. Some of the variations in filming practice, which need to be considered when scanning, include the following:

  1. Materials have been filmed in all four film positions - - 1A, 2A, 1B, and 2B (see page J-13 for illustrative chart). However, film position rarely changes within a reel.
  2. The orientation (landscape and portrait) of the original material and positioning of items or pages in the frame can vary within a reel and from reel-to-reel based on the dimensions of the original and whether maintaining consistency in reduction ratio within a reel was required.
  3. Film reduction ratios vary - - from collection to collection, from reel to reel, and also sometimes within a reel. However, they are typically in a range from 10:1 to 14:1.
  4. Indication of reduction ratio - - The reduction ratio used for a reel is often not indicated in a resolution target or alternatively able to be interpreted from a ruler or scale filmed in the resolution target alongside a document. As noted earlier, a resolution target or any other technical target or information, may not have been filmed anywhere on the reel. Also, changes in reduction ratio within a reel are rarely indicated.
  5. Ruler or scale - - For most manuscript material, the reduction ratio used in filming is shown by including, in the same frame and at the same reduction ratio, a section of an inch and millimeter scale at least 3 inches (76.2mm) long which appears beside the first manuscript. When the material's size requires a change in the reduction ratio, at a minimum another ruler should have been filmed.
  6. In some cases, materials have been photographed against a light background (copyboard), in others against a black background. This may cause some difficulty in determining the edges of a document, or may present multiple "edges". Also, the film frames can move abruptly from a white to a black background within reels.
  7. A small percentage of reels have sprocket holes or perforations, sometimes in the camera negative and sometimes in the positive print. They appear on the positive film most often because they are present in the camera negative and are transferred when the positive is printed. In no case is the position of the perforations a reliable guide to the location of the edge of the "next" frame. However, documents may have been filmed so that their informational content extends through the sprockets. In such cases, the sprockets will appear in the digital image.
  8. Duplicate exposures - - When the original material was filmed in 2A or 2B position (two pages or items per frame), text and color, continuous-tone, or half-tone black and white illustrations are sometimes present in the same frame. That particular frame was then sometimes filmed more than once, using different exposures, so that each part of the double-page image could be effectively captured. Frames with extreme variation in lightness/darkness of background may also have been filmed more than once.
  9. The size of the documents on the film can vary greatly, reflecting the high variance in document size in the original paper collections or changes in reduction ratio within a reel. For example, for one reel of the Lincoln Papers microfilm, the film document image size ranges from 5mm high x 8mm wide to 31mm high x 25mm wide.
  10. The front and back covers and end sheets for monographs, serials and books contained in manuscript collections, may have been filmed if the practice at that time was to provide a facsimile reproduction of the book or journal.
  11. Some documents (maps, charts, illustrations) are segmented on the film. The size of the originals changed and in order to maintain the same reduction ratio, the document was filmed in segments on successive frames. A chart or target indicating the correct sequence of the segments in the successive frames as they relate to the original document may appear on the microfilm, but often it has not been provided. (See page J-14, Segmented Material Targets)
  12. Although frames on the microfilm do not overlap, the spacing between exposures is usually not entirely consistent throughout a reel.
  13. In the manuscript collections which have been filmed, frame counters with numbers can sometimes be found in the lower right hand corner of the frames or along the bottom of the frames. The numbers which appear in these frames are provided by an automatic counter mounted in a camera bed. The counter is set back to 000000 at the beginning of each reel. The container target is always 000000, the first folder of the manuscript is 000001, the first document is 000002 etc. The frame counters are not used consistently for all of the manuscript collections film produced over the last decades, and the numbers often cannot be identified or easily read. A decision regarding scanning this information will be made on a job-by-job basis.
  14. Blank pages may have been filmed and often absolutely no information appears.
  15. Irregularities targets: Anomalies or irregularities can be noted in explanatory material appearing at the head of the reel, but irregularities which refer to specific pages/frames will appear in place of the material and before the first frame of material which follows.
  16. Image tonal range: The Library's original source materials and the microfilms that reproduce them vary in terms of tonal range. Although microfilm is a high contrast medium, the Library's films, like those produced by many libraries and archives, do preserve some tonal values. Therefore, the most successful approaches to digital imaging from microfilm may be ones that exploit microfilm tonality at capture time.
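
As an illustration of item 5 above, when a ruler or scale has been filmed beside a document, the reduction ratio can be estimated by comparing the known length of the ruler segment with its measured length on the film. The Python sketch below is illustrative only: the function name and the sample film measurement are assumptions and are not part of the Library's procedures.

```python
MM_PER_INCH = 25.4

def estimate_reduction_ratio(actual_length_mm, length_on_film_mm):
    """Return N for a reduction ratio of N:1 implied by a filmed ruler."""
    if actual_length_mm <= 0 or length_on_film_mm <= 0:
        raise ValueError("lengths must be positive")
    return actual_length_mm / length_on_film_mm

# Hypothetical measurement: a 3-inch ruler segment spans 6.35 mm on the film.
ruler_mm = 3 * MM_PER_INCH
ratio = estimate_reduction_ratio(ruler_mm, 6.35)
print(f"Estimated reduction ratio: {ratio:.1f}:1")
```

Because changes in reduction ratio within a reel are rarely indicated, an estimate of this kind would apply only to the frames filmed with the measured ruler.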





LIST OF SELECTED MICROFILM TARGETS THAT CAN APPEAR ON LC FILM
START

END
End of Reel/
Please Rewind

FILMED AS BOUND
Some Pages in the Original Contain Flaws and
Other Defects Which Appear on the Film

Blank Pages Not Filmed

REEL NO:

Volume(s) Missing
Page(s) Missing

Issue(s) Missing

Continued on next reel

Series No.

Title
Container

Best Copy Available
Material listed as missing, if located at a later time
may be added to the end of the reel.

There were in the original file some
pages containing mutilations and other
defects. These unavoidably constitute part
of the filmed file.




ATTACHMENT 2
RESEARCH USE OF LIBRARY OF CONGRESS IMAGE COLLECTIONS
2.1 Display-screen Viewing and Printed Output

The students and researchers who use Library of Congress collections online desire the ability to view the images on their computer display screens and to print copies, typically on a laser printer. Most students and researchers use current-generation color-capable display systems with resolutions of 1024 x 768 or 1280 x 1024 pixels; their printers are likely to be capable of printing at settings of 300 or 600 dpi.

For the foreseeable future, access to Library of Congress collections will be provided using software associated with the World Wide Web protocols for the Internet.

Informal experiments by the Library of Congress suggest that the image type that works best for display may not be the type that works best for printing. Display systems often produce the greatest legibility (and thus the best results) with a grayscale image. But printers often do best with a bitonal image. When a grayscale image is printed it must be "halftoned" and this tends to break up small features like fine print.

Generally speaking, students and researchers using Library of Congress text-based (as opposed to pictorial) collections place greater importance on the printed output than on screen display. They do not always view a document page as an end in itself, but typically will use the information that they find in the documents when they write their own articles or reports. Although some researchers may "carry away the document" on a floppy disk, most will prefer to print it and carry away a sheet of paper for later reference.

In past paper-scanning projects, the preference for printed output over screen display has led the Library to favor bitonal images. More recent explorations, however, have shown that a laser printer's representation of a grayscale image can be very good. In one informal experiment, for example, some manuscript pages were scanned from the original paper at 150 and 300 dpi. Using graphic-arts software, the laser printer was set for 600 dpi output (which affected the way in which the halftoning occurred) and the resulting paper copy was very legible.

For this procurement, the Library seeks proposals to create images that will display and print successfully for researchers working in contexts like those described above, with the greatest emphasis placed on successful printing.

2.2 Scaling at Output Time and Capture Resolution for High-detail Content

The researchers who access Library of Congress collections via the Internet employ a variety of software packages, ranging from modest freeware associated with World Wide Web browsers to sophisticated graphic arts software for image handling. With varying degrees of effectiveness, this software scales (changes the sizes of) the images at display and print time. As noted above, the Library's findings thus far suggest that, for documents (as distinct from pictorial matter), printed output is of greater importance to users than screen display. Typically, a researcher's personal computer will have a laser printer as a peripheral device; the Library's digital images must be conveniently printed within such a system.

The Library recognizes the emergent state of software associated with the World Wide Web and is well aware of the shortage of available tools for certain image types, particularly viewing and printing software appropriate for bitonal images, especially bitonal images with TIFF headers and CCITT group 4 compression. In fact, the Library is planning to make a special arrangement to offer viewing and printing software for this purpose to researchers who wish to use Library collections via the World Wide Web.

Although some researchers wishing to print document images may be limited to software like that described in the preceding paragraph, many others will have additional graphic arts or other software (not intended for use "within" the World Wide Web environment) capable of handling raster-scanned images.

ATTACHMENT 3
IMAGE RESOLUTION AND IMAGE QUALITY
3.1 The Analysis of Spatial Resolution

Various methods may be used to determine the appropriate levels of spatial resolution for digital images. A starting point, of course, will be a determination of the actual "delivered" resolution of the film itself.

The determination of the film's delivered resolution may result from an examination of resolution targets appearing on the film or other features that permit actual measurement. Since not many Library of Congress films produced during the period under discussion include images of resolution targets, in many cases the contractor's analysis will have to be based on the creation and comparison of images produced in different ways and/or at different levels of scanning resolution. In such tests, for example, lines of representative text (that is, lines of printed or written characters) may be scanned and examined to determine the actual level of film spatial resolution.

For reference, the Library offers this brief summary of informal tests carried out with a handful of Library microfilms during 1995. The films were scanned with a device that was reported to apply resolutions ranging from 1200 to 4000 dpi to the film. Both grayscale and bitonal images were produced. In an informal review of the resulting images, significant differences were observed when the 1200 and 2700 dpi images were compared; the differences between the 2700 and 4000 dpi images were insignificant or negligible. This informal finding suggests that many Library films may not contain recoverable data beyond the level of about 3000 dpi. For materials at a reduction ratio of 12:1, this suggests that the recoverable resolution in terms of the original document may be on the order of 250 dpi.
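
The arithmetic behind the 250 dpi figure is a division of the resolution applied to the film by the reduction ratio. The Python sketch below works this out for the 12:1 case and, for comparison, for the ends of the 10:1 to 14:1 range cited in Attachment 1. The function name is an assumption made for illustration, and the 3000 dpi value is the informal finding reported above, not a specification.

```python
def resolution_at_original(film_dpi, reduction_ratio):
    """Convert a resolution measured on the film to dpi relative to the original."""
    if reduction_ratio <= 0:
        raise ValueError("reduction ratio must be positive")
    return film_dpi / reduction_ratio

# About 3000 dpi on the film, at three reduction ratios.
for ratio in (10, 12, 14):
    print(f"{ratio}:1 -> {resolution_at_original(3000, ratio):.0f} dpi")
# 10:1 -> 300 dpi
# 12:1 -> 250 dpi
# 14:1 -> 214 dpi
```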

Based on the analysis performed on the sample films prepared for any given job under terms of this procurement, the contractor will recommend a course of action for the job to the Library. This course of action, of course, will take into account such factors as whether the delivered images will be grayscale (for which matching the spatial resolution of the film may be recommended) or bitonal (for which higher resolution may be recommended to compensate for the reduction of scanned data to one bit-per-pixel).

The analysis and recommendations will be reviewed by the Library's project leader and work will proceed after the leader gives his or her approval of the proposal.

3.2 Genuine, Interpolated, and Nominal Spatial Resolution

The reduction ratios for Library microfilms are not always known nor is it always possible to state the original dimensions of the documents on the microfilms. For this reason it will often be difficult to state the spatial resolution of the digital images in reference to the original documents.

It will, however, be possible to state the resolution in terms of the film image itself, as suggested by the preceding section. The numerical value of this film-reference resolution should not be recorded in image file headers or their equivalent if such recordation will cause printers to output "postage stamp" or other deviant-sized hard copy. The resolution in terms of the film shall be provided in the analytic documents created before a job begins and in the documentation that accompanies the digital images, e.g., in or in association with the scanning log.

When this film-reference resolution is given, it shall be stated in genuine terms, i.e., the actual optically achieved spatial resolution of the image. The numerical value shall not be based upon interpolation, i.e., the achievement of high levels of resolution by the use of computer algorithms that "fill in" missing pixels.

The resolution stated in the delivered file headers or their equivalent shall be a nominal resolution that represents an approximation of the resolution as referenced to the original document. When the film's reduction ratio or the original document size is known, then the nominal resolution shall be as accurately rendered as is practical. When the reduction ratio and original document size are unknown, a reasonable estimate shall be provided. In every case, the nominal resolution shall be agreed upon during the analysis of the film that precedes a given job.

As noted in Section C.3.1.2, the digital image headers or their equivalent shall be such as to permit easy printing with a standard-type laser printer. If this means that it is inappropriate to place the nominal resolution value in such a header or equivalent, then the contractor shall report the nominal resolution in the scanning log or other report and place in the header the resolution value that will yield the desired outcome when printing the image.

3.3 Suppressing Print Through

To the degree possible, the Library desires images in which legibility of front-of-sheet writing is enhanced by the suppression of printing or other marks that may show through from the back of the sheet. This print-through is often of a lighter tone than the ink or data on the facing side of the sheet and, on the film, the human eye can "tune it out." However, in scanning--especially to produce a bitonal image--there is a severe risk that the threshold setting will render the lighter-tone print-through as black, i.e., at the same tonal value as the desired text. The resulting mix of desired characters and undesired "noise" degrades the legibility of the page.

The capture of legible images of handwritten documents may also be made difficult by show-through, presenting the same risk as described for printed matter. Handwritten documents also present the challenge of tonal information: some marks (e.g., ink) may be darker than others (e.g., pencil). If a bitonal image is produced, sophisticated thresholding is required to capture both light and dark marks. It is understood that a grayscale digital image will retain the tonal characteristics of the microfilm.
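
The threshold risk described in this section can be illustrated with a simple global threshold applied to a few grayscale values. The Python sketch below is a simplified illustration, not a recommended method. The sample pixel values and the function name are assumptions, and on real film the tonal ranges of pencil marks and print-through may overlap, so that no single threshold separates them.

```python
def to_bitonal(gray_values, threshold):
    """Map 8-bit grayscale values (0 = black, 255 = white) to 1 (black) or 0 (white)."""
    return [1 if value < threshold else 0 for value in gray_values]

# Hypothetical sample values: dark ink, lighter pencil, print-through, paper.
samples = {"ink": 40, "pencil": 120, "print_through": 150, "paper": 220}
names = list(samples)
values = list(samples.values())

low = dict(zip(names, to_bitonal(values, 100)))    # pencil is lost
mid = dict(zip(names, to_bitonal(values, 135)))    # ink and pencil kept, print-through dropped
high = dict(zip(names, to_bitonal(values, 180)))   # print-through rendered as black

for label, result in (("low", low), ("mid", mid), ("high", high)):
    print(label, result)
```

With these assumed values, a threshold set too low drops the pencil marks, while one set too high renders the print-through at the same tonal value as the desired text.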

3.4 Suppressing Moire Pattern Interference for Printed-Halftones

The Library desires "clean/clear" reproductions of illustrative matter on the microfilm. It is understood that high-quality capture of illustrative material can be difficult when the source microfilms do not reproduce illustrations very well.

The production of clean digital images when the microfilm reproduces printed halftones in a book can be especially difficult. The digital images may be marred by moire patterns, caused when the "frequency" of the original printed-halftone (resolution in lines per inch) encounters the implicit grid of the scanning device, with its own frequency (resolution in dots per inch).

Approaches that exist to solve this problem include at least the following: the use of a dithering algorithm at scan time to "randomize" the implicit grid produced by the scanner; the use of grayscale imaging (although this may only defer moire problems to the point of image printing); and the use of a de-screening and rescreening algorithm such as that employed by certain Xerox scanners.

ATTACHMENT 4
IMAGE FILENAMES AND DELIVERY DIRECTORIES
4.1 Naming Files and Directories

The contractor shall assign a digital-image filename to each image captured as part of the initial image-capture process, and deliver these files to the Library in a certain arrangement of delivery directories and subdirectories, each containing no more than 300 files. Directory names and filenames shall conform to DOS naming conventions and, when alphabet letters are used, these shall be lower case.

The Library will specify what is called an identifier for the name of a delivery directory. An identifier is the prefix or left-side (right-truncated) portion of a name that may contain as many as eight characters. (See section C.4.1)
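
The naming requirements in this section lend themselves to an automated check before delivery. The Python sketch below is a minimal illustration under stated assumptions: it treats a conforming name as one to eight lowercase letters or digits with an optional extension of one to three characters, which is a simplification of DOS naming conventions, and the function names are hypothetical.

```python
import re

# Simplified assumption: 1-8 lowercase letters or digits, then an optional
# extension of 1-3 lowercase letters or digits. DOS permits some additional
# characters that this sketch does not accept.
DOS_NAME = re.compile(r"[a-z0-9]{1,8}(\.[a-z0-9]{1,3})?")
MAX_FILES_PER_DIRECTORY = 300

def is_valid_name(name):
    """Return True if the name fits the simplified lowercase 8.3 pattern."""
    return DOS_NAME.fullmatch(name) is not None

def check_directory(dir_name, filenames):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if not is_valid_name(dir_name):
        problems.append(f"bad directory name: {dir_name}")
    if len(filenames) > MAX_FILES_PER_DIRECTORY:
        problems.append(f"too many files: {len(filenames)}")
    problems.extend(f"bad filename: {f}" for f in filenames if not is_valid_name(f))
    return problems

print(check_directory("lp000100", ["000100a.tif", "000100b.tif"]))
print(check_directory("lp000100", ["000100A.TIF"]))
```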

4.2 Naming files and directories: Five Structures

Different collections will require different structures for assigning filenames and naming directories. The Library identifies the five structures listed below for this contract.

  1. Numbered document structure
  2. Unnumbered documents in folder structure
  3. Bibliographic record/print-page number structure
    a. When printed page numbers are tracked
    b. When printed page numbers are not tracked
  4. Serials structure
    a. When printed page numbers are tracked
    b. When printed page numbers are not tracked
    c. For collation records and/or cumulative indexes
  5. Copyright-registration-number and technical-document structure

4.3 FILENAME/DIRECTORY STRUCTURE 1: NUMBERED DOCUMENT STRUCTURE

Generally speaking, the numbered document structure applies to certain manuscript collections, e.g., the presidential papers collections. Every leaf (an individual sheet of paper) received a sequential number when the collections were processed in the 1930s, 1940s, and 1950s. The number was stamped on the leaf with a rubber stamp and called a leaf number. In some cases, the documents were then mounted on larger sheets in bound volumes and the leaf number (still on the document proper) is then called a mounting number.

Many leaves have writing on both sides; a leaf may thus "contain" two pages. On the microfilms of manuscript collections, the back side has been filmed if it contains a marking of any kind. The back side of a leaf, of course, always appears on the film following the front side. The rubber-stamped leaf number, however, does not appear on the back side.

The contractor shall assign filenames based upon the leaf or mounting number. Depending upon the collection, this number may reach six digits, e.g., 140862. The filename consists of the leaf or mounting number, with leading zeros added as needed to create a six-digit expression. In addition, the letter a is added to the six-digit expression to indicate that this image reproduces the front or numbered side of the leaf. For example, the image of the front of leaf 435 shall be assigned the filename 000435a.jpg (or 000435a.tif); the image of the front of leaf 140862 shall be assigned 140862a.jpg (or 140862a.tif).

Digital images shall be created for all back sides that appear on the film. These shall receive the same number as the front, with the substitution of the letter b for the letter a as the seventh character in the filename. For example, if the microfilm contains images for the front and back of leaf number 435, the two images shall be assigned the filenames 000435a.jpg (or 000435a.tif) and 000435b.jpg (or 000435b.tif).
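
The filename rule for the numbered document structure can be expressed compactly. The Python sketch below is illustrative only: the function name and its parameters are assumptions, while the six-digit zero padding and the a/b suffixes follow the rule stated above.

```python
def leaf_filename(leaf_number, side="a", extension="tif"):
    """Build a filename such as 000435a.tif from a leaf or mounting number.

    side is "a" for the front (numbered) side and "b" for the back side.
    """
    if not 0 <= leaf_number <= 999999:
        raise ValueError("leaf number must fit in six digits")
    if side not in ("a", "b"):
        raise ValueError('side must be "a" or "b"')
    return f"{leaf_number:06d}{side}.{extension}"

print(leaf_filename(435))                      # 000435a.tif
print(leaf_filename(435, side="b"))            # 000435b.tif
print(leaf_filename(140862, extension="jpg"))  # 140862a.jpg
```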

At the time that a job for a particular numbered-document collection is assigned, the Library will provide written instructions, a copy of the finding aid (in print and/or in machine-readable form), and will also mark sample reels to show typical patterns for head-of-reel information and similar features.

Manuscript collections with leaf numbers are typically numbered consecutively throughout. Sets of pages (images) not to exceed 200 (100 leaves) shall be grouped in directories for delivery. Thus the name assigned to each directory will indicate the leaf numbers included within. The quantity limit has been set to facilitate ease of handling at the Library. The table that follows outlines a directory structure for the Lincoln Papers, to be created by the contractor when the images are delivered.

Directory name assigned by contractor    Images
lp000000                                 All images for leaves through number 99 (e.g., 000001.tif through 000099.tif)
lp000100                                 Leaves 100-199
lp000200                                 Leaves 200-299
(continues as needed)
lp045000                                 Leaves 45,000-45,099

Note: "lp" stands for "Lincoln Papers"

In this structure, missing, repeated, and unscannable pages or documents shall be recorded in the scanning log.
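
The directory names in the table above follow from the leaf number by rounding down to the nearest hundred. The Python sketch below shows one way this could be computed. The function name and parameters are assumptions; the "lp" prefix is the Lincoln Papers identifier used in the table, and the grouping of 100 leaves per directory reflects the limit of 200 images (100 leaves) stated above.

```python
def delivery_directory(leaf_number, prefix="lp", leaves_per_directory=100):
    """Return the delivery directory name for a given leaf number."""
    if leaf_number < 0:
        raise ValueError("leaf number must not be negative")
    first_leaf = (leaf_number // leaves_per_directory) * leaves_per_directory
    return f"{prefix}{first_leaf:06d}"

print(delivery_directory(42))     # lp000000
print(delivery_directory(150))    # lp000100
print(delivery_directory(45010))  # lp045000
```

A two-character prefix followed by six digits yields an eight-character name, which fits the identifier length described in section 4.1.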

4.4 FILENAME/DIRECTORY STRUCTURE 2: UNNUMBERED DOCUMENTS IN FOLDER STRUCTURE

Generally speaking, this structure applies to certain manuscript collections, e.g., the Booker T. Washington Papers. The documents in these collections have been placed in separate file folders within certain logical elements: series and subseries. Each folder, series, and subseries represents units that cohere intellectually. In addition, the folders are stored in containers (boxes), in sequence. Each collection's organization, including a list of series, containers, and folders, is found in the collection's printed finding aid. The following table illustrates this form of organization:

Collection