MONO91 NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report Indexes Compiled by Machine chapter Mary Elizabeth Stevens National Bureau of Standards

In machine-compiled indexes, no items or entries are eliminated by the machine, whereas in even the most rudimentary of machine-generated indexes, such as KWIC, various reductive or extractive operations are automatically applied as a part of the machine procedure. We shall be concerned in this section with brief discussions of machine-compiled indexes and related devices, specifically concordances, mechanically prepared card or book catalogs, citation indexes, and special indexes such as Tabledex. The use of machines to compile, sort, duplicate and list index entries can be considered mechanized indexing only in a relatively trivial sense. We shall consider, therefore, only a few representative examples, emphasizing early work and some of the pioneering instances.

2.1 Concordances and Complete Text Processing

When, as early as 1856, Crestadoro proposed the use of permutations of the words in titles as a subject-content index, the only "machines" available for the processing operations were people acting in a strictly clerical way. Precisely such clerical operations have been used for centuries in a process that is, in the special sense of full representation of document contents, an index-producing operation--the making of concordances. 1/ The task of listing each separate word in a book in all the contexts in which it appears is incredibly time-consuming and tedious when carried out by manual means. There are those who have spent the major part of their lifetimes at this task. For example: "It took James Strong thirty years to compile his exhaustive Concordance of the Bible..." 2/
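The clerical operation described above, listing each separate word in all the contexts in which it appears, can be sketched in a few lines of modern code. This is only an illustrative sketch, not a reconstruction of any system discussed in this report; the function name `build_concordance`, the `width` parameter, and the sample sentence are assumptions introduced for the example.

```python
import re
from collections import defaultdict

def build_concordance(text, width=3):
    """Map each distinct word to every context in which it occurs.

    `width` (a hypothetical parameter) is the number of neighboring
    words kept on each side of the indexed word.
    """
    words = re.findall(r"[A-Za-z']+", text)
    concordance = defaultdict(list)
    for i, word in enumerate(words):
        left = words[max(0, i - width):i]
        right = words[i + 1:i + 1 + width]
        # The indexed word is upper-cased so it stands out in its context.
        context = " ".join(left + [word.upper()] + right)
        concordance[word.lower()].append((i, context))
    return dict(concordance)

sample = "In the beginning God created the heaven and the earth"
for entry, contexts in sorted(build_concordance(sample).items()):
    for position, context in contexts:
        print(f"{entry:<10} {position:>3}  {context}")
```

Every occurrence of every word yields an entry; nothing is dropped. This is the sense in which a concordance is compiled rather than generated. A KWIC-style index would add a reductive step, such as a stop list, which this sketch deliberately omits.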
The use of machines capable of processing signals which represent and preserve information offered a potentially revolutionary change, and with the advent of the electronic computer even more radical possibilities of very high speed processing were opened up. As early as 1949, J. W. Mauchly (the co-inventor of ENIAC and UNIVAC) envisioned the use of computers for documentation and library science activities. He suggested that the full information contents of the Library of Congress collections could be recorded in machine language, stored in this form on magnetic tape, and searched by machine in a procedure which would match words or other selection indicia occurring in the recorded information to the specified words or selection criteria of a query or search prescription. Specifically, he estimated that the entire collection, then amounting to 10,000,000 books, could, when transcribed to binary-code representation, 3/ be serially searched in 20 hours. 4/

1/ See, for example, Black, 1962 [65], p. 314: "The oldest book in the world has had such an index for many years--the concordance to the Bible;" Markus, 1962 [394], p. 19: "The ultimate in permutation for indexing is a published concordance;" Linder, 1960 [363], p. 99: "We know of a concordance prepared in the 13th Century;" Simmons and McConlogue, 1962 [555], p. 3: "Complete indexing has been used of course for centuries in the preparation of concordances."

2/ Carlson, 1963 [101], p. 211.

3/ That is, markings which have one of two values (thus, binary digits or "bits") can be used to distinguish among N different other symbols, such as alphabetic characters, by using log2 N of such markings, rounded up to a whole number. A binary code for the 26 letters of the English alphabet thus requires a five-bit representation for each letter. If the ten numeric digit characters are also recorded (26 + 10 = 36 symbols), a six-bit code representation is required.

4/ Mauchly, 1949 [406], p. 295. See also "Report to the Secretary of Commerce on the application of machines..." 1954 [620], p. 67.
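The arithmetic behind the bit-code note and the 20-hour search estimate can be checked with a short sketch. The helper name `code_width` is an assumption introduced here. The search rate is derived only from the two figures quoted in the text (10,000,000 books, 20 hours) and says nothing about tape speeds or book lengths, which the text does not give.

```python
def code_width(symbol_count):
    """Fewest two-valued markings (bits) that can distinguish symbol_count symbols."""
    if symbol_count < 1:
        raise ValueError("need at least one symbol")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without floating point.
    return (symbol_count - 1).bit_length()

print(code_width(26))       # letters only -> 5
print(code_width(26 + 10))  # letters and digits -> 6

# Serial search rate implied by the 20-hour estimate for 10,000,000 books.
books = 10_000_000
seconds = 20 * 60 * 60
print(round(books / seconds))  # -> 139
```

On these figures, the estimate implies scanning roughly 139 books per second in a single serial pass.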