NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report

Table of Contents

Section | Page
Abstract | 1
1. Introduction | 1
1.1 Definitions and background | 2
1.2 Scope of this study | 10
1.3 Derivative vs. assignment indexing | 13
2. Indexes compiled by machine | 14
2.1 Concordances and complete text processing | 15
2.2 Card catalogs, book catalogs, bibliographies and subject index listings prepared by machine | 19
2.3 Tabledex and other special purpose indexes | 25
2.4 Citation indexes | 27
2.5 Machine conversion from one index set to another | 38
3. Indexes generated by machine - automatic derivative indexing | 40
3.1 KWIC indexes | 40
3.1.1 Applications of KWIC indexing techniques | 41
3.1.2 Advantages, disadvantages and operational problems of KWIC indexing | 55
3.2 Modified derivative indexing | 68
3.2.1 Title augmentation | 68
3.2.2 Book indexing by computer | 71
3.2.3 Modified derivative indexing - Baxendale's experiments | 73
3.3 Derivative indexing from automatic abstracting techniques | 75
3.3.1 Auto-condensation and auto-encoding techniques of H. P. Luhn | 75
3.3.2 Frequencies of word n-tuples - Oswald and others | 79
3.3.3 Relative frequency techniques - Edmundson and Wyllys, and others | 81
3.3.4 Significant word distances | 83
3.3.5 Uses of special clues for selection | 84
3.3.6 Recent examples of mixed systems experimentation | 86
3.4 Quality of modified derivative indexing by machine | 89
4. Automatic assignment indexing techniques | 91
4.1 Swanson and later work at Thompson Ramo Wooldridge | 91
4.2 Maron's automatic indexing experiments | 93
4.3 Automatic indexing investigations of Borko and Bernick | 94
4.4 Williams' discriminant analysis method | 97
4.5 SADSACT | 98
4.6 Assignment indexing from citation data | 99
4.7 Similarities and distinctions among assignment indexing experiments | 100
4.8 Other assignment indexing proposals | 105
5. Automatic classification and categorization | 106
5.1 Factor analysis | 108
5.2 The theory of clumps | 110
5.3 Latent class analysis | 113
5.4 Examples of other proposed classificatory techniques | 113
6. Other potentially related research | 114
6.1 Thesaurus construction, use and up-dating | 114
6.2 Statistical association techniques | 118
6.2.1 Devices to display associations: EDIAC | 119
6.2.2 Statistical association factors - Stiles | 119
6.2.3 The association map - Doyle and related work at SDC | 122
6.2.4 Work of Giuliano and associates, the ACORN devices | 124
6.2.5 Spiegel and others at Mitre Corporation | 126
6.3 Clues to index-term selection from automatic syntactic analysis | 127
6.4 Probabilistic indexing and natural language text searching | 132
6.4.1 Probabilistic indexing - Maron, Kuhns and Ray | 133
6.4.2 Natural language text searching - Swanson | 134
6.4.3 Full text searching - legal literature | 135
6.5 Other examples of related research in linguistic data processing | 136
6.6 Machine assistance in translations of subject content indications to special search and retrieval language | 140
6.7 Example of a proposed indexing system utilizing related research techniques | 142
7. Problems of evaluation | 143
7.1 Core problems | 145
7.2 Bases and criteria for evaluation of automatic indexing procedures | 149
7.2.1 The Cranfield project | 150
7.2.2 O'Connor investigations | 151
7.2.3 Questions of comparative costs | 153
7.2.4 Summary: potential advantages as bases for evaluation | 156
7.3 Findings with respect to inter-indexer and intra-indexer consistency | 157
7.4 Special factors and other suggested bases for evaluation | 160
8. Operational considerations | 164
8.1 Questions of input | 164
8.2 Examples of processing considerations | 168
8.3 Output considerations | 171
9. Conclusion: Appraisal of the state of the art in automatic indexing | 173
Acknowledgements | 182
Appendix A: List of references cited and selected bibliography | 183
Appendix B: Progress and prospects in mechanized indexing | 223
Appendix C: Selective bibliography of additional references | 237
Date updated: Tuesday, 16-Jan-01 11:25:22
Date created: Monday, 18-Sept-00