MONO91 NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report Appendix B: Progress and Prospects in Mechanized Indexing appendix Mary Elizabeth Stevens National Bureau of Standards
manual methods is unrealizable. There is also a pressing need to extend the coverage of a myriad of unpublished working papers. Hence, there is an utter necessity for automatic indexing, abstracting, and summarization by electronic data processors."
Secondly, little confidence can be attached to routine, manual operations to produce subject-content selection indicia for subsequent selection and retrieval of stored documentary items for the following reasons:
1. Wide variations of intra- and inter-analyst consistencies occur in the assignment of content-indicia, even with respect to well-established client-interests and index term vocabularies.
2. Potential clients may or may not be inclined to use the system, regardless of whether or not it provides efficient content-indicator-clue and selection criteria mechanisms.
3. Future queries cannot, in general, be effectively predicted in advance, except for the cases of specific author or title retrieval requests.
The problem of intra-indexer and inter-indexer inconsistency is of special interest because the degree of inconsistency will seriously affect search and retrieval effectiveness and because serious questions are raised with respect to the evaluation of any indexing system in terms of prior or independent human indexing.
With respect to the effect of indexer inconsistency upon subsequent search effectiveness, O'Connor 12/ considers the possibilities of overassignment (i.e., the assignment of indexing terms to an item that a subsequent searcher would not consider pertinent to that item) in the case where a search is specified by index terms A, B and C, each term is over-assigned with ratio 1.0, and assignments and overassignments by the recognition rules are statistically independent: "Then only one eighth of the papers selected by the conjunction of A, B and C would correctly have all three terms." The complementary disadvantage is that of missing relevant references on search because the indexer failed to supply all the indexing terms that a searcher would have considered relevant to a particular document. For a three-term query, assuming independence of term assignments and a consistency level of 50 percent, only 12.5 percent of the documents that the searcher would consider relevant would be retrieved if someone else had indexed these items.
We have previously reported 13/ on the results of 700 simulated 3-term searches based upon both manual and machine indexing of approximately 20 items with respect to a fixed vocabulary of less than 100 allowed descriptors. These results show that if indexer A assigns to a given document the term "A" as indicative of subject content, then his subsequent chances of retrieving that document with a query for term "A" are 58.4 percent if the item had been indexed by someone other than himself, and 55.8 percent if indexed by an automatic indexing procedure developed at NBS, called "SADSACT" (Self-Assigned Descriptors from Self And Cited Titles) 14/. For three-term searches, any one searcher would be able to retrieve 26.4 percent of the items he would consider relevant to his query if they had been indexed by any of the other user-indexers, and 24.7 percent if the items had been indexed by the machine technique.
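The one-eighth and 12.5 percent figures above follow from simple multiplication under the stated independence assumption. The sketch below reproduces that arithmetic. The function names are illustrative and do not come from the source, and reading "over-assigned with ratio 1.0" as one overassignment for every correct assignment (so that half of each term's assignments are correct) is an interpretation, not something the text states explicitly.

```python
def correct_fraction(overassignment_ratio: float, n_terms: int) -> float:
    """Fraction of items selected by an n-term conjunction that correctly
    carry all n terms, assuming independent assignments.

    Assumption: overassignment_ratio = overassigned / correctly assigned,
    so a ratio of 1.0 means half of each term's assignments are correct.
    """
    per_term_correct = 1.0 / (1.0 + overassignment_ratio)
    return per_term_correct ** n_terms


def retrieved_fraction(per_term_consistency: float, n_terms: int) -> float:
    """Fraction of relevant items retrieved by an n-term conjunction when
    each term is independently present with the given probability."""
    return per_term_consistency ** n_terms


# Overassignment case for terms A, B and C: one eighth.
print(correct_fraction(1.0, 3))               # 0.125

# 50 percent consistency, three-term query: 12.5 percent.
print(retrieved_fraction(0.5, 3))             # 0.125

# Independence estimates from the reported single-term figures.
print(round(retrieved_fraction(0.584, 3), 3)) # 0.199
print(round(retrieved_fraction(0.558, 3), 3)) # 0.174
```

Under strict independence, the single-term figures of 58.4 and 55.8 percent would give roughly 19.9 and 17.4 percent for three-term searches. The reported 26.4 and 24.7 percent are somewhat higher, which would be consistent with term assignments not being fully independent in that test; this comparison is an observation from the arithmetic only, not a conclusion drawn in the source.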
Tinker 15/ provides evidence on the relationships between inter-indexer inconsistency and retrieval efficiency, assuming that a given indexer is a potential querist, with average chances of retrieval ranging from 6.5 to 36 percent. Additional evidence on the generally unsatisfactory state of manual indexing consistency has been reported as follows: