The major procedures used for evaluation in the SMART system are described elsewhere. [3,4] They are the recall-precision curve and four global measures: rank recall, log precision, normalized recall, and normalized precision. The global measures vary from 0 to 1, with 0 representing the worst possible performance and 1 representing perfect performance. All four reflect both recall and precision, and require both perfect recall and perfect precision to produce a value of 1. Rank recall and normalized recall, however, reflect recall more than precision, while log precision and normalized precision reflect precision more strongly than recall. The "quasi-Cleverdon" recall-precision curves shown here are recall-precision curves averaged over the set of 42 requests.

3. Results

Table 1 shows the distribution of association pairs as a function of word frequency, using a cosine correlation with a cutoff of .6. The largest number of correlations occurs for words of very low frequency, namely frequencies 1 and 2. With the correlation measure used, it is very easy for low-frequency words to co-occur significantly, since two words of frequency 1 that occur in the same document will always have a correlation of 1.0. With a collection size of 200 documents, in which 1179 words occur only once, one may expect over 7000 correlations above the cutoff between words of frequency 1 purely on a random basis. If the words of frequency 2 are also considered, the total number of random correlations above .6 would be expected to be about 12000. It is clear therefore that the 18000 correlations observed do not actually