NIST Special Publication 500-215: The Second Text REtrieval Conference (TREC-2)
Recent Developments in Natural Language Text Retrieval
chapter
T. Strzalkowski
J. Carballo
National Institute of Standards and Technology
D. K. Harman
with a different combination of fields from the topics
used to create search queries. A typical topic is shown
below:
<top>
<head> Tipster Topic Description
<num> Number: 107
<dom> Domain: International Economics
<title> Topic: Japanese Regulation of Insider Trading
<desc> Description:
Document will inform on Japan's regulation of insider
trading.
<narr> Narrative:
A relevant document will provide data on Japanese laws,
regulations, and/or practices which help the foreigner
understand how Japan controls, or does not control,
stock market practices which could be labeled as insider
trading.
<con> Concept(s):
1. insider trading
2. Japan
3. Ministry of Finance, Securities and Exchange Council,
Osaka Securities Exchange, Tokyo Stock Exchange
4. Securities and Exchange Law, Article 58, law,
legislation, guidelines, self-regulation
5. Nikko Securities, Yamaichi Securities, Nomura Securities,
Daiwa Securities, Big Four brokerage firms
<fac> Factor(s):
<nat> Nationality: Japan
</fac>
</top>
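The SGML-style fields of a topic like the one above can be pulled out with a short sketch along the following lines (the parsing code is our own illustration, not part of the system described in this paper):

```python
import re

def parse_topic(text):
    """Return a dict mapping each field tag to its raw text.

    A field runs from its tag (e.g. <desc>) to the next tag or to
    the closing </top>; the regex below encodes exactly that.
    """
    fields = {}
    for m in re.finditer(r"<(\w+)>(.*?)(?=<\w+>|</top>|$)", text, re.S):
        tag, body = m.group(1), m.group(2).strip()
        if tag != "top":          # skip the enclosing <top> wrapper
            fields[tag] = body
    return fields

# Abbreviated version of topic 107 from the text above.
topic = """<top>
<num> Number: 107
<desc> Description:
Document will inform on Japan's regulation of insider trading.
<narr> Narrative:
A relevant document will provide data on Japanese laws.
</top>"""

fields = parse_topic(topic)
```

Once the fields are isolated, runs can be built from any combination of them (e.g., `<desc>` plus `<narr>` for text-only queries).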
This topic actually consists of two different statements of
the same query: the natural language specification con-
sisting of <desc> and <narr> fields, and an expert-
selected list of key terms which are often far more infor-
mative than the narrative part (in some cases these terms
were selected via feedback in actual retrieval attempts).
The table below shows the search query obtained from
fields <desc> and <narr> of topic 107, after an expansion
with similar terms and deletion of low-content terms.
Query 107
standard+trade     idf=16.81  weight=0.38
standard+trade     idf=16.81  weight=0.38
regulate+japanese  idf=15.40  weight=1.00
standard+japanese  idf=14.08  weight=0.38
regulate+trade     idf=12.84  weight=1.00
regulate+trade     idf=12.84  weight=1.00
controls           idf=9.97   weight=1.00
labele             idf=9.20   weight=1.00
trade+inside       idf=8.62   weight=1.00
trade+inside       idf=8.62   weight=1.00
inside             idf=7.49   weight=1.00
inside             idf=7.49   weight=1.00
inside             idf=7.49   weight=1.00
regulate           idf=5.66   weight=1.00
regulate           idf=5.66   weight=1.00
regulate           idf=5.66   weight=1.00
practice           idf=5.46   weight=1.00
practice           idf=5.46   weight=1.00
data               idf=4.91   weight=1.00
data               idf=4.91   weight=0.51
data               idf=4.91   weight=0.26
japanese           idf=4.84   weight=1.00
japanese           idf=4.84   weight=1.00
standard           idf=4.81   weight=0.38
standard           idf=4.81   weight=0.38
standard           idf=4.81   weight=0.38
inform             idf=4.71   weight=1.00
inform             idf=4.71   weight=0.26
inform             idf=4.71   weight=0.51
protect            idf=4.69   weight=0.41
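An idf-based term weighting of the kind shown in the table can be sketched as below. The collection size, the document frequencies, and the base-2 logarithm are all assumptions made for this illustration; the paper does not state its exact idf formula or statistics:

```python
import math

N = 500_000  # assumed collection size (illustrative only)

# Assumed document frequencies for a few query terms; rarer
# terms (and especially term pairs) get higher idf scores.
df = {
    "regulate+trade": 3,
    "inside": 2750,
    "data": 17000,
}

def idf(term):
    """Inverse document frequency: log2(N / df)."""
    return math.log2(N / df[term])

# Build an idf-weighted query representation.
query = {t: round(idf(t), 2) for t in df}
```

The qualitative effect matches the table above: syntactic pairs such as `regulate+trade` are far rarer than their single-word components, so they receive much higher idf values and dominate the ranking.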
Note that many `function' words have been removed
from the query, e.g., provide, understand, as well as
other `common words' such as document and relevant
(this is in addition to our regular list of `stopwords').
Some still remain, however, e.g., data and inform,
because these could not be uniformly considered as
`common' across all queries.
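The pruning described above can be sketched as a two-stage filter: a conventional stopword list, plus an idf cutoff that removes query-specific low-content terms. The cutoff value and the idf figures below are invented for the example; only the overall scheme follows the text:

```python
STOPWORDS = {"the", "a", "of", "on", "will"}
IDF_CUTOFF = 3.0  # assumed threshold separating low-content terms

def prune(terms, idf):
    """Drop stopwords and terms whose idf falls below the cutoff."""
    kept = []
    for t in terms:
        if t in STOPWORDS:
            continue                      # regular stopword list
        if idf.get(t, 0.0) < IDF_CUTOFF:
            continue                      # too frequent: low-content
        kept.append(t)
    return kept

idf = {"document": 1.2, "provide": 2.1, "data": 4.91,
       "inform": 4.71, "regulate": 5.66}
terms = ["document", "will", "provide", "data", "regulate", "inform"]
# "document" and "provide" fall below the cutoff; "will" is a stopword;
# "data" and "inform" survive because their idf is not uniformly low.
```

This mirrors why `data` and `inform` remain in query 107: their frequency is not high enough across all queries to treat them as common words globally.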
Results obtained for queries using text fields only
and those based primarily on keyword fields are reported
separately. The purpose of this distinction was to demon-
strate that (or determine whether) intensive natural language
processing can turn an imprecise and frequently convo-
luted narrative into a better query than an expert would
create.14
The ad-hoc category runs were done as follows
(these are the official TREC-2 results):
(1) nyuirl: An automatic run of topics 101-150
against the WSJ database with the following
fields used: <title>, <desc>, and <narr> only.
Both syntactic phrases and term similarities were
included.
(2) nyuir2: An automatic run of topics 101-150
against the WSJ database with the following
fields used: <title>, <desc>, <con> and <fac>
only. Both syntactic phrases and term similarities
were included.
14 Some results on the impact of different fields in TREC topics
on the final recall/precision results were reported by Broglio and Croft
(1993) at the ARPA HLT workshop, although text-only runs were not
included. One of the most striking observations they have made is that
the narrative field is entirely disposable, and moreover that its inclusion
in the query actually hurts the system's performance. Croft (personal
communication, 1992) has suggested that excluding all expert-made
fields (i.e., <con> and <fac>) would make the queries quite ineffective.
Broglio (personal communication, 1993) confirms this, showing that
text-only retrieval (i.e., with <desc> and <narr>) yields an average pre-
cision more than 30% below that of <con>-based retrieval.