NIST Special Publication 500-215: The Second Text REtrieval Conference (TREC-2)
Latent Semantic Indexing (LSI) and TREC-2
S. Dumais
National Institute of Standards and Technology
D. K. Harman
(documents not in the LSI scaling were mislabeled), so the lsial run results are incomplete and misleading. We have corrected this translation problem, and the correct results are labeled lsial*. These results are summarized in Table 2. We have not yet completed the comparison against the 9 separate subspaces from TREC-1.
Table 2
               lsiasm    lsial      lsial*
                         (error)    (correct)
Rel_ret          7869      4756       6987
Avg prec        .3018     .1307      .2505
Pr at 100       .4306     .2664      .3922
Pr at 10        .5020     .3340      .5100
R-prec          .3580     .1937      .3069
Q >= Median    37 (2)    16 (1)     25 (1)
Q <  Median    13 (0)    34 (7)     25 (0)

Table 2: LSI Adhoc Results. Comparison of standard vector method with LSI (corrected version, but missing relevance judgements).
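As a consistency check, the cutoff-precision figures in Table 2 follow directly from the raw counts in Table 3 below; for lsiasm, with 50 topics and 2153 relevant documents among the judged top 100,

    Pr at 100 = 2153 / (50 x 100) = .4306,

which matches the table entry above.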
In terms of absolute levels of performance, both lsiasm and lsial* are about average. The SMART results (lsiasm) are somewhat worse than the TREC-2 SMART results reported by Buckley et al., Fuhr et al., or Voorhees, but this is because we used slightly different pre-processing options and did not include phrases. Although it is generally difficult to compare across systems, the SMART (lsiasm) and LSI (lsial*) runs can meaningfully be compared since both use the same pre-processing. The starting term-document matrix was the same in both cases. Much to our disappointment, the reduced-dimension LSI performance appears to be somewhat worse than the comparable SMART vector method. However, it is important to realize that many of the documents returned by lsial* were not judged for relevance because they were not submitted as an official run. Table 3 shows the number of documents for which there are no judgements. Consider the results for just the top 100 documents for each query (i.e., the documents judged by the NIST assessors). For lsiasm, all 5000 documents were judged since this was an official run, and 2153 were relevant. For lsial*, only 4073 documents were judged and almost as many, 2122, were relevant. Thus, if only 31 of the 927 unjudged lsial* documents are relevant, LSI performance would be comparable to SMART performance, and if more than 31 were relevant, LSI performance would be somewhat better. Similarly, for the top 1000 documents, lsial* had more than 4000 more documents without relevance judgements than did lsiasm.
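The break-even figure of 31 is simply the gap between the two top-100 relevant counts in Table 3:

    2153 - 2122 = 31,

so lsial* needs 31 of its 927 unjudged documents to be relevant to match lsiasm's total of 2153 relevant documents in the top 100.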
Table 3
               lsiasm   lsial*   lsiasm    lsial*
               top100   top100   top1000   top1000
relevant         2153     2122     7869      6987
not-relevant     2847     1961    15559     12230
not-judged          0      927    26572     30694

Table 3: Summary of missing relevance judgements for standard vector method and LSI.
Because the missing relevance judgements make
direct comparisons between SMART and LSI difficult,
we decided to look at performance for just the
documents for which we had relevance judgements.
That is, we looked at performance considering just the
38175 unique documents for which we have adhoc
relevance judgements. These results are shown in
Table 4.
Table 4
              lsiasm   lsial*
Docs           38175    38175
Rel_ret         9493     9596
Avg prec       .3700    .3789
Pr at 100      .4306    .4466
Pr at 10       .5020    .5220
R-prec         .3977    .3995

Table 4: LSI Adhoc Results. Comparison of standard vector method with LSI using only documents for which relevance judgements were available.
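The difference between the official scoring (Table 2) and the judged-only scoring (Table 4) amounts to whether unjudged documents are counted as not relevant or are removed from the ranking before the measures are computed. The following is a minimal sketch of the two conditions, assuming qrels held as Python sets; the function names are illustrative, not the actual evaluation code used for the official results.

def precision_at_k(ranking, relevant, k):
    # Fraction of the top-k retrieved documents that are relevant;
    # documents not in `relevant` (including unjudged ones) count as
    # not relevant.
    return sum(1 for doc in ranking[:k] if doc in relevant) / k

def average_precision(ranking, relevant):
    # Mean of the precision values at the rank of each relevant document.
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def evaluate(ranking, judged, relevant, judged_only=False):
    # Official scoring (Table 2): unjudged documents simply count as
    # not relevant. Judged-only scoring (Table 4): drop unjudged
    # documents from the ranking before computing any measure.
    if judged_only:
        ranking = [doc for doc in ranking if doc in judged]
    return precision_at_k(ranking, relevant, 100), average_precision(ranking, relevant)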
The most striking aspect of these results is the higher overall levels of performance. This is to be expected since we are only considering the 38175 documents for which we have relevance judgements, and there are 700k fewer documents than in the official results. Considering only this subset of documents, there is a small advantage for LSI compared to the SMART vector method. Taken together with the results for just the top 100 documents, these results suggest that LSI can outperform a straightforward vector method.
We were somewhat disappointed at the relatively