MONO91 NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report Automatic Indexing chapter Mary Elizabeth Stevens National Bureau of Standards

manpower requirements and time necessary to produce indexes, but also problems of glut in terms of the man-hours necessary for the individual scientist to maintain awareness of what is going on in his field. There are major problems created by newly emerging fields of effort, new interdisciplinary areas of interest, and dynamically evolving terminology. Increasing specialization, on the other hand, brings additional difficulties in finding what has been done elsewhere that might be applicable to one's own work and in avoiding wasteful duplication of effort, with their own attendant problems of terminology. All these problems are aggravated by the increasingly critical urgency of making all useful information available to those who need it as promptly and as selectively as possible. Recognition of this urgency and of the inadequacies of present solutions has therefore prompted consideration of the feasibility of using machines to assist in the indexing process.

The term "mechanized indexing" signifies the accomplishment of some or all of the indexing operations by mechanized means. The term includes the use of machines to prepare and compile indexes, and to sort, assemble, duplicate, and interfile catalog cards carrying index entries. In this report, however, we shall be concerned primarily with the area of automatic indexing, that is, the use of machines to extract or assign index terms without human intervention once programs or procedural rules have been established. This term is chosen in preference to "auto-indexing" as originally suggested by Luhn (196? [373]) for the reasons set forth by Bar-Hillel, 1/ and to "machine indexing" 2/ due to possible confusion with machine tool operations.
"Automatic indexing" has been used by such workers in the field as Gardin (1963 [209]), Kennedy (1962 [310]), Maron (1961 [395]), Swanson (1962 [584]), and Wyllys (1963 [653]). For obvious reasons, we also subsume under this term any specifically "clerical" (Fairthorne, 1956 [188], 1956 [189], 1961 [190]) and hence machinable operations that can similarly be substituted for human intellectual effort. There is nothing that machines can do which people cannot do except for limitations of time, cost, or availability of appropriate resources. Thus, we shall consider "machine-like indexing by people" (O'Connor, 1961 [447]; Montgomery and Swanson, 1962 [421]) as falling properly within the scope of automatic indexing, especially in the sense of ". . . deciding in a mechanical way to which category (subject or field of knowledge) a given document belongs . . . deciding automatically what a given document is 'about'." 3/

The principle of indexing, that is, of using subject-content clues and item surrogates as substitutes for searches based on perusal of the full contents, has a history of several millennia. In ancient Sumer and Babylon, clay tablets were sometimes covered with a thin clay envelope or sheath that was inscribed with brief descriptions of the contents of the tablet itself (Carlson, 1963 [101]; Hessel, 1955 [268]; Lalley, 1962 [343]; Olney, 1963 [458]; Schullian, 1960 [525]). The first known instance of an index list is apparently that of Callimachus in the third century B.C., which was a guide to the contents of some 130,000 papyrus rolls (Olney, 1963 [458]; Parsons, 1952 [469]).

1/ Bar-Hillel, 1962 [35], p. 417.
2/ Bohnert, 1962 [69]; Edmundson, 1959 [176]; and others.
3/ Maron, 1961 [395], p. 404.