ranks. This effect was due to the positive coefficient of lognumterms, which gave indexing weights larger than 1 for the terms in these documents, thus yielding a high retrieval status value for these documents w.r.t. most queries. Obviously a linear indexing function is not well suited for coping with such extreme distributions of the elements of the relevance description. Besides the effect on the few very long documents, other documents are also affected by this distribution, for the following reason: many terms in the long documents will be assigned indexing weights larger than 1. The computation of the squared error that is to be minimized considers the quadratic difference (y - u(x))² between the indexing weight u(x) and the actual relevance decision y (0 or 1) in the learning sample. Even if y = 1, there is still the difference (u(x) - 1)² for u(x) > 1. Reducing this error, and thus the overall error, comes at the price of a larger error for other indexing weights.

There are three possible strategies for coping with this problem:

* Some experiments performed with other material have shown that overall indexing quality can be improved by excluding outliers from the learning sample ([Pfeifer 91]). Not only those pairs with weights lying on the "correct side" of the interval [0,1] (i.e. either y = 0 and u(x) < 0 or y = 1 and u(x) > 1) should be regarded as outliers; excluding those pairs with weights lying on the "wrong side" (i.e. either y = 0 and u(x) > 1 or y = 1 and u(x) < 0) also yields better results (see the sketch at the end of this section).

* Using logistic functions instead of linear functions overcomes the problem of indexing weights outside the interval [0,1]. However, we have some experimental evidence ([Pfeifer 90]) that even in this case the removal of outliers from the learning sample (where the outliers are defined w.r.t. a linear indexing function) improves the indexing quality.

* For the experiments described here, we switched from linear to polynomial functions, which also gave fairly good results.

The function finally used for indexing with single terms only is

    u(x) = 0.00042293
         + 0.00150083 · tf · logidf · imaxtf
         - 0.00150665 · tf · imaxtf
         + 0.00010465 · logidf
         - 0.00122627 · lognumterms · imaxtf

For indexing with phrases, we first had to derive the set of phrases to be considered. We took all pairs of adjacent non-stopwords that occurred at least 25 times in the D1 (training) document set. Then an indexing function common to single words and phrases was developed by introducing additional binary factors is_single (is_phrase) having the value 1 if the term is a single word (a phrase) and 0 otherwise. The parameters tf and logidf were defined for phrases in the same way as for single words, and in the computation of imaxtf and lognumterms for a document both phrases and single words were considered. This procedure produced the indexing function

    u(x) = 0.00034104
         + 0.00141097 · is_single · tf · logidf · imaxtf
         - 0.00119826 · is_single · tf · imaxtf
         + 0.00014122 · is_single · logidf
         - 0.00120413 · lognumterms · imaxtf
         + 0.00820515 · is_phrase · tf · logidf · imaxtf
         - 0.04188466 · is_phrase · tf · imaxtf
         + 0.00114585 · is_phrase · logidf
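To make the first strategy concrete, the following sketch fits a linear indexing function by least squares and then refits after removing outliers from the learning sample. This is a minimal illustration, not the system used in the experiments: the numpy formulation, the function name, and the assumption that the constant term is represented by a leading column of ones in the feature matrix are ours.

    import numpy as np

    def fit_with_outlier_removal(X, y):
        """Least-squares fit of a linear indexing function, refitted
        after removing outliers from the learning sample.

        X: one relevance description per row, with a leading column
           of ones for the constant term (an assumption here).
        y: the corresponding relevance decisions (0 or 1).
        """
        # First pass: ordinary least squares over the full sample.
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        u = X @ coef
        # Outliers are pairs whose weight falls outside [0, 1]: the
        # "correct side" (y = 0, u < 0 or y = 1, u > 1) together with
        # the "wrong side" (y = 0, u > 1 or y = 1, u < 0).  Combined,
        # these are simply all pairs with u < 0 or u > 1.
        keep = (u >= 0.0) & (u <= 1.0)
        # Second pass: refit on the reduced learning sample.
        coef, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        return coef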
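The single-term function above translates directly into code. The sketch below assumes the four elements of the relevance description (tf, logidf, imaxtf, lognumterms) have already been computed for a term-document pair as defined earlier in the paper; the function name is ours.

    def u_single(tf, logidf, imaxtf, lognumterms):
        """Polynomial indexing function for single terms, using the
        coefficients quoted above."""
        return (0.00042293
                + 0.00150083 * tf * logidf * imaxtf
                - 0.00150665 * tf * imaxtf
                + 0.00010465 * logidf
                - 0.00122627 * lognumterms * imaxtf)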
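The phrase vocabulary can be derived in a single counting pass. The sketch below treats each document as a list of tokens and pairs immediately adjacent tokens; whether stopwords are skipped over or break adjacency is not spelled out above, so the strict-adjacency reading here is an assumption, as are all of the names.

    from collections import Counter

    def candidate_phrases(documents, stopwords, min_count=25):
        """All pairs of adjacent non-stopwords occurring at least
        min_count times in the training (D1) document set."""
        counts = Counter()
        for tokens in documents:
            for a, b in zip(tokens, tokens[1:]):
                if a not in stopwords and b not in stopwords:
                    counts[a, b] += 1
        return {pair for pair, n in counts.items() if n >= min_count}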
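Finally, the common function for single words and phrases differs from the single-term case only in the two binary factors. Again a sketch with our own naming; is_phrase is taken to be the complement of is_single, as the definition above implies.

    def u_combined(is_single, tf, logidf, imaxtf, lognumterms):
        """Common polynomial indexing function for single words
        (is_single = 1) and phrases (is_single = 0), using the
        coefficients quoted above."""
        is_phrase = 1 - is_single
        return (0.00034104
                + 0.00141097 * is_single * tf * logidf * imaxtf
                - 0.00119826 * is_single * tf * imaxtf
                + 0.00014122 * is_single * logidf
                - 0.00120413 * lognumterms * imaxtf
                + 0.00820515 * is_phrase * tf * logidf * imaxtf
                - 0.04188466 * is_phrase * tf * imaxtf
                + 0.00114585 * is_phrase * logidf)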