ISR11
Scientific Report No. ISR-11 Information Storage and Retrieval
Design Criteria for Automatic Information Systems
chapter
M. E. Lesk
G. Salton
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
V-35
1. Term Weights:
Weighted Word Stems >> Logical Stems
Weighted Synonym Classes >> Logical Synonym Classes
2. Document Length:
Full Summaries (2000 words) > Abstracts (150 words)
Abstracts (150 words) >> Titles Only
3. Synonym Recognition:
Abstracts with Thesaurus >> Abstracts Null
Summaries with Thesaurus > Summaries Null
4. Phrase Recognition:
Synonym and Phrase Recognition > Synonym Recognition (Thesaurus) only
5. Syntactic Analysis:
Syntactic Analysis >> Word Stem Match
Syntactic Analysis > Synonym Recognition
Syntactic Analysis - Statistical Phrase Recognition
6. Term-Term Associations:
Stem-Stem Associations > Simple Word Stems
Concept-Concept (Thesaurus Class) Associations - Synonym Recognition
7. Manual Indexing:
Abstract Stem Matching Index Term Match
Index Term with Thesaurus > Abstracts with Thesaurus
>> "much greater than"
> "greater than"
- "about equal to"
Overall Evaluation Results
(based on experiments with 4 collections in 3 topic areas)
Fig. 16
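
The first comparison in Fig. 16 (weighted versus logical stems) can be illustrated with a minimal Python sketch. It assumes a cosine correlation between query and document vectors and uses raw stem frequencies as weights. The function names and the toy stem lists are hypothetical and are not taken from the report.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine correlation between two sparse term vectors (dicts)."""
    dot = sum(w * v.get(term, 0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def weighted_vector(stems):
    """Weighted stems: each stem carries its frequency of occurrence (assumed weighting)."""
    return dict(Counter(stems))


def logical_vector(stems):
    """Logical stems: each stem is either present (1) or absent."""
    return {stem: 1 for stem in set(stems)}


# Hypothetical toy data: lists of word stems, not drawn from the test collections.
query = ["retriev", "retriev", "index"]
doc_a = ["retriev", "retriev", "retriev", "index"]
doc_b = ["retriev", "index", "index", "index"]

for name, build in (("weighted", weighted_vector), ("logical", logical_vector)):
    score_a = cosine(build(query), build(doc_a))
    score_b = cosine(build(query), build(doc_b))
    print(f"{name}: doc_a={score_a:.3f} doc_b={score_b:.3f}")
```

With weighted vectors, doc_a scores higher than doc_b because its stem frequencies follow the emphasis of the query. With logical vectors the two documents receive the same score, since only the presence of a stem is recorded.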