The table below illustrates the problem of weighting phrasal terms, using topic 101 and a relevant document (WSJ870226-0091).

Topic 101 matches WSJ870226-0091
(duplicate terms not shown)

TERM                    TF.IDF   NEW WEIGHT
sdi                       1750         1750
eris                      3175         3175
star                      1072         1072
wars                      1670         1670
laser                     1456         1456
weapon                    1639         1639
missile                    872          872
space+base                2641         2105
interceptor               2075         2075
exoatmospheric            1879         3480
system+defense            2846         2219
reentry+vehicle           1879         3480
initiative+defense        1646         2032
system+interceptor        2526         3118
DOC RANK                    30           10

Changing the weighting scheme for compound terms, along with other minor improvements (such as expanding the stopword list for topics, or correcting a few parsing bugs), has led to an overall increase in precision of nearly 20% over our official TREC-2 ad-hoc results. Table 4 summarizes these new runs for queries 101-150 against the WSJ database. Similar improvements have been obtained for queries 51-100.

Run                 nyuir1   nyuir1a  nyuir2   nyuir2a
Name                ad-hoc   ad-hoc   ad-hoc   ad-hoc
Queries             50       50       50       50

Total number of docs over all queries
  Ret               49884    50000    49876    50000
  Rel               3929     3929     3929     3929
  RelRet            2983     3108     3274     3401

Recall (interp) Precision Averages
  0.00              0.7013   0.7201   0.7528   0.8063
  0.10              0.4874   0.5239   0.5567   0.6198
  0.20              0.4326   0.4751   0.4721   0.5566
  0.30              0.3531   0.4122   0.4060   0.4786
  0.40              0.3076   0.3541   0.3617   0.4257
  0.50              0.2637   0.3126   0.3135   0.3828
  0.60              0.2175   0.2752   0.2703   0.3380
  0.70              0.1617   0.2142   0.2231   0.2817
  0.80              0.1176   0.1605   0.1667   0.2164
  0.90              0.0684   0.1014   0.0915   0.1471
  1.00              0.0102   0.0194   0.0154   0.0474

Average precision over all rel docs
  Avg               0.2649   0.3070   0.3111   0.3759

Precision at
  5 docs            0.4920   0.5200   0.5360   0.6040
  10 docs           0.4420   0.4900   0.4880   0.5580
  15 docs           0.4240   0.4653   0.4693   0.5253
  20 docs           0.4050   0.4420   0.4390   0.4980
  30 docs           0.3640   0.3993   0.4067   0.4607
  100 docs          0.2720   0.2914   0.3094   0.3346
  200 docs          0.1886   0.2064   0.2139   0.2325
  500 docs          0.1026   0.1103   0.1137   0.1229
  1000 docs         0.0597   0.0622   0.0655   0.0680

R-Precision (after Rel)
  Exact             0.3003   0.3332   0.3320   0.3950

Table 4. Automatic ad-hoc run statistics for queries 101-150 against the WSJ database: (1) nyuir1 - TREC-2 official run using two topic fields only; (2) nyuir1a - revised term weighting run; (3) nyuir2 - official TREC-2 run using three topic fields only; and (4) nyuir2a - revised weighting run.

The results of the routing runs against the SJMN database are somewhat more troubling. Applying the new weighting scheme, we did see the average precision increase by some 5 to 12% (see column 4 in Table 3), but the results remain far below those for the ad-hoc runs. Direct runs of queries 51-100 against the SJMN database produce results that are about the same as in the routing runs (which may indicate that our routing scheme works fine); however, the same queries run against the WSJ database have retrieval precision some 25% above the SJMN runs. This may indicate some problems with the SJMN database or the relevance judgements for it.

'HOT SPOT' RETRIEVAL

Another difficulty with frequency-based term weighting arises when a long document needs to be retrieved on the basis of a few short relevant passages. If the bulk of the document is not directly relevant to the query, then there is a strong possibility that the document will score low in the final ranking, despite some strongly relevant material in it. This problem can be dealt with by subdividing long documents at paragraph breaks, or into approximately equal-length fragments, and indexing the database with respect to these (e.g., Kwok 1993).
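To make the fragment-based alternative concrete, the sketch below splits a long document into approximately equal-length fragments and ranks the document by its best-scoring fragment. This is only an illustrative sketch: the fragment size, the helper names, and the plain tf.idf scoring used here are assumptions for exposition, not the weighting scheme of this paper or of Kwok (1993).

import math
from collections import Counter

def split_into_fragments(tokens, size=200):
    # Split a token list into approximately equal-length fragments.
    return [tokens[i:i + size] for i in range(0, len(tokens), size)] or [tokens]

def tfidf_score(query_terms, fragment_tokens, doc_freq, num_docs):
    # Score one fragment: sum of tf * idf over query terms found in it.
    # (Plain tf.idf is an assumption standing in for the paper's weighting.)
    tf = Counter(fragment_tokens)
    score = 0.0
    for term in query_terms:
        if tf[term] and doc_freq.get(term):
            idf = math.log(num_docs / doc_freq[term])
            score += tf[term] * idf
    return score

def hot_spot_score(query_terms, doc_tokens, doc_freq, num_docs, size=200):
    # Rank a document by its best fragment rather than by the whole text.
    return max(tfidf_score(query_terms, frag, doc_freq, num_docs)
               for frag in split_into_fragments(doc_tokens, size))

Under such a scheme, a very long document with one strongly relevant passage competes on the strength of that passage alone, rather than being dragged down by the bulk of non-relevant surrounding text.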
While such approaches are effective, they also tend to be costly because of increased index size and more complicated