lations, and the searchers' responses to our questionnaire.
 Table 1 shows the distribution of numbers of words and op-
 erators per query, and also of time required to construct a
 query. Table 2 shows the distribution of searchers' attitudes
 to the topics, each indicated on a scale of one to five, from
 least to most.

                  Mean    StdDev.  Min.   Max.       N
 Operators        9.94    5.72     1.00   44.00      375
 Words            19.40   14.63    1.00   145.00     375
 Time (minutes)   11.31   7.48     1.00   40.00      367

 Table 1. Characteristics of queries for ad hoc and routing
 topics.

                       Mean    StdDev.  Min.   Max.       N
 Familiarity           1.81    1.15     1.00   5.00       388
 Ease of construction  2.82    1.11     1.00   5.00       372
 Enough information    3.20    1.11     1.00   5.00       322

 Table 2. Characterization of topics by searchers, for routing
 and ad hoc topics.

      Our ad hoc questionnaire also included a question on
 how many years of experience each searcher had in online
 searching.  The mean response was 6.8 years.  Unfortu-
 nately, we do not have these data for the routing searchers.
      We wished to consider whether there were any relation-
 ships between the various characteristics of queries and top-
 ics and the performance of the queries themselves. For this
 purpose, we constructed a table in which each separate query
 formulation (75x5=375) is associated with performance
 measures, the characteristics enumerated in Tables 1 and 2,
 and the three topic categories of broadness, hardness and re-
 striction defined by Harman (this volume).  For perfor-
 mance, we considered using one or more of three measures:
 average of 11-point precision; precision at 100 documents;
 and R-precision (defined by Harman, this volume). Factor
 analysis of these three measures showed that a single factor
 accounts for more than 90% of the variance among them,
 so that they represent, in effect, a single aspect or factor of
 performance. The average precision was chosen as represen-
 tative of this factor, and we have used it both in evaluation
 of our retrieval results, and in attempting to determine the
 effect of the other variables we have considered, on retrieval
 performance. Since this variate does not exhibit a normal
 distribution, logarithmic and logistic transforms were
 explored. The logistic transform yields the most nearly normal
 distribution of the transformed score, but we still cannot say
 that the transformed variable follows a normal distribution.
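
      As an illustration only, the logistic (log-odds) transform
 referred to above can be sketched as follows. The clipping
 constant eps and the sample values are assumptions introduced
 for this sketch; they are not taken from our data.

```python
import math


def logit(p, eps=1e-6):
    """Logistic (log-odds) transform of a score p in [0, 1].

    eps is a hypothetical clipping constant that keeps the result
    finite when p is exactly 0 or 1.
    """
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


# Hypothetical average-precision values, for illustration only.
avg_precision = [0.05, 0.20, 0.50, 0.80]
transformed = [logit(p) for p in avg_precision]
```

 The transform maps the bounded interval (0, 1) onto the whole
 real line, which is why it is a natural candidate when a
 bounded score is to be treated as approximately normal.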
      The results of applying ANOVA to seek a predictor of
 p are shown in Table 3. No significant relations appear.
 Because of the range of values assumed by the variables
 Operators, Words and Time, the relation was sought using
 regression analysis. Once again, no significant relations
 were found, and the scatter plots (not included here) make it
 clear that there is no trend to be found. By contrast, both
 hardness and broadness are significantly related to
 performance. The former is expected, since hardness is
 determined by median average precision; the latter is less
 obvious.
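
      For readers unfamiliar with the procedure, the F statistic
 underlying a one-way analysis of variance can be sketched as
 below. This is a minimal sketch, not the software we used; the
 function name and the sample groups are hypothetical, with each
 group standing for the transformed scores of queries sharing one
 questionnaire rating.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score groups."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical scores grouped by rating, for illustration only.
example_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_stat, df_b, df_w = one_way_anova_f(example_groups)
```

 The significance level is then read from the F distribution with
 (df_between, df_within) degrees of freedom; a statistics package
 such as SciPy (scipy.stats.f_oneway) reports it directly.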


   Analysis of variance for log(p/(1-p))
   Independent variable       Significance
   Familiarity                0.149
   Easiness                   0.169
   Information                0.907

Table 3. Significance levels of F-tests using ANOVA to
seek dependence of the logistically transformed average
precision on the searchers' assessments of their query
formulations.

   The search for relations between average precision and
characteristics of the query formulation, whether provided
by the searcher or determined from the query text itself, was
motivated by the results, discussed below, which show that
it is desirable to weight formulations in proportion to their
average precision. Thus, if we could find a surrogate for
average precision which can be known without evaluating
the retrieved documents, it would be possible to approxi-
mate the effective combination on the first pass of a re-
trieval operation. This hope is frustrated at this time.
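
   To make the idea of weighting formulations concrete, the
following is a minimal sketch under stated assumptions. It
combines per-document scores by a weighted mean, which is a
simplification chosen for illustration and not necessarily the
combination operator used in our system. The function name, the
document identifiers, the scores and the weights are all
hypothetical.

```python
def combine_weighted(runs, weights):
    """Rank documents by a weighted mean of their scores.

    runs    -- list of dicts mapping doc_id -> score, one per formulation
    weights -- one non-negative weight per formulation, e.g. its
               average precision (assumed known here)
    A document missing from a run contributes 0 for that run
    (an assumption of this sketch).
    """
    total = sum(weights)
    combined = {}
    for run, w in zip(runs, weights):
        for doc, score in run.items():
            combined[doc] = combined.get(doc, 0.0) + (w / total) * score
    # Highest combined score first.
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Two hypothetical formulations with hypothetical average precisions.
runs = [{"d1": 0.9, "d2": 0.4}, {"d2": 0.8, "d3": 0.6}]
weights = [0.1, 0.3]
ranking = combine_weighted(runs, weights)
```

Because the weights here depend on average precision, they
can only be computed after relevance judgments are available,
which is why a surrogate known in advance would have been
useful.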

3.3  Query Combination and Data Fusion Results: Ad hoc Topics

   The official results reported to TREC-2 were for the
overall performance of each of two treatments for the ad hoc
topics, and of one treatment for the routing topics. For
those results, we refer the reader to the relevant section of
this volume. Here we report on our further investigations
on the effect of combination of queries, and of data fusion,
on performance.
   Our first investigation in query combination was to see
if combining query formulations has a regular, beneficial ef-
fect, as hypothesized. To do this, we generated the five dif-
ferent search groups for the ad hoc topics, as described in
section 2.2, and did experimental runs on all single query
groups, all 2-way combinations of queries, all 3-way com-
binations of queries, all 4-way combinations of queries, and
the combination of all 5 query formulations. The results
are presented in Table 4, where it is evident that the average
performance increases monotonically as more evidence is
added. The increase is strict and significant, as shown in
Table 4a, where we display the number of times that each
combination level performed better than each other level.
We note that the data fusion results are not significantly
better than any but 1-way combination (that is, average per-
formance for single queries), but also that its performance is
not significantly different from unweighted 5-way combina-
tion