2. H. Topic Statement Processing for Conceptual Graph Generation

 The processing of topic statements for CG generation does not make use of the output of the Natural Language
 Query Constructor; instead, the current system first applies the same RCD and CG generator modules to produce
 topic statement (TS) CGs. Several TS-specific processing requirements have been identified, some of which have
 been implemented as post-processing routines, while others are under development.

 - Elimination of concept and relation nodes corresponding to contentless meta-phrases (e.g. "Relevant
   document must identify ..."). If both of the concept nodes in a concept-relation-concept (CRC) triple belong
   to a meta-phrase, the CRC is ignored. When only one of them is a meta-phrase concept, the triple is
   not removed unless the other concept occurs in another triple.

 - Handling of negated parts of topic statements. The weights are adjusted in such a way that an occurrence of the
   negated concept in a document contributes negative evidence regarding the relevance of the document. In
   effect, the two weights for the concept are switched.

 - Automatic assignment of weights to concept and relation nodes. There are several factors we consider: the
   conventional way of determining the importance of terms using inverse document frequency (IDF) and total
   frequency; the location of terms occurring in topic statements; the part of speech information for each term; and
   indications in the topic statement sublanguage (e.g. the document MUST contain...). Although we have
   implemented a program that tags individual words with the degree of importance based on the sublanguage
   patterns, for the evaluation we assigned concept weights based on the IDF values of terms in the collection,
   due to time constraints.

 - Merging common concepts appearing in different sections of topic statements. Although it is not safe
   in general to assume that two concepts sharing the same concept name actually refer to the same concept
   instantiation and merge them blindly, we have observed that this risk does not arise in the topic statements.
   In fact, we believe that it is desirable to merge CG fragments using common concept nodes. This is an important
   process that eliminates undesirable effects on scoring. Without this, a document containing a concept occurring
   repeatedly in <desc>, <narr>, and <con> fields would be ranked unnecessarily high (or low if it is negated)
   because each occurrence of the concept would make an independent contribution to the overall score.
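
 The meta-phrase, negation, and merging rules above can be illustrated with a minimal sketch. It is not the
 DR-LINK implementation: the Triple and Weights types, the relation names, and the example concepts are all
 hypothetical. We also assume that "occurs in another triple" means another triple free of meta-phrase
 concepts, that the two weights of a concept are a positive and a negative evidence weight, and that merging
 keeps the first weights seen for a concept name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """A concept-relation-concept (CRC) triple; field names are illustrative."""
    left: str
    relation: str
    right: str


@dataclass(frozen=True)
class Weights:
    """Assumed pair of weights: evidence for and against relevance."""
    positive: float
    negative: float


def drop_meta_phrase_triples(triples, meta_concepts):
    """Remove triples that carry only meta-phrase material."""
    # Concepts appearing in at least one triple free of meta-phrase concepts.
    safe = set()
    for t in triples:
        if t.left not in meta_concepts and t.right not in meta_concepts:
            safe.update((t.left, t.right))

    kept = []
    for t in triples:
        left_meta = t.left in meta_concepts
        right_meta = t.right in meta_concepts
        if left_meta and right_meta:
            continue  # both concepts belong to a meta-phrase: ignore
        if left_meta or right_meta:
            other = t.right if left_meta else t.left
            if other in safe:
                continue  # the content concept survives in another triple
        kept.append(t)
    return kept


def negate(weights):
    """Switch the two weights of a negated concept."""
    return Weights(positive=weights.negative, negative=weights.positive)


def merge_concepts(nodes_by_field):
    """Merge same-named concepts from different fields into one node each."""
    merged = {}
    for nodes in nodes_by_field.values():
        for name, weights in nodes:
            # Assumption: the first weights seen for a name are kept.
            merged.setdefault(name, weights)
    return merged


# Hypothetical triples for "Relevant document must identify ...".
triples = [
    Triple("document", "AGENT", "identify"),   # pure meta-phrase
    Triple("identify", "OBJECT", "merger"),    # mixed; "merger" occurs elsewhere
    Triple("merger", "LOCATION", "Japan"),     # content only
    Triple("identify", "OBJECT", "lawsuit"),   # mixed; "lawsuit" occurs only here
]
kept = drop_meta_phrase_triples(triples, {"document", "identify"})

fields = {
    "desc": [("merger", Weights(0.8, 0.1))],
    "narr": [("merger", Weights(0.8, 0.1)), ("Japan", Weights(0.5, 0.2))],
}
merged = merge_concepts(fields)
```

 In this sketch the last triple is retained because dropping it would lose the only occurrence of "lawsuit",
 and "merger" ends up as a single node, so it contributes to the score once rather than once per field.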

 Since an integrated automatic topic processing module was not available, the mechanical aspects of the process were
 hand-simulated, with some parts done automatically and others done manually.
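
 As an illustration of a step that lends itself to automation, the IDF-based concept weighting used for the
 evaluation might be sketched as follows. The formula is the textbook log(N / df) form, which is an assumption
 here since the exact variant is not specified; the function name and the toy collection are hypothetical.

```python
import math


def idf_weights(documents):
    """Return log(N / df) for each term; documents is a list of token lists."""
    n_docs = len(documents)
    doc_freq = {}
    for doc in documents:
        for term in set(doc):  # count each term once per document
            doc_freq[term] = doc_freq.get(term, 0) + 1
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


# Toy collection: "merger" is in every document, "japan" in only one.
collection = [
    ["merger", "japan"],
    ["merger", "bank"],
    ["merger"],
]
weights = idf_weights(collection)
```

 A term found in every document receives weight zero, while rarer terms receive larger weights.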

 2.1. Relation-Concept Detector (RCD)

 The Complex Nominal Phraser and the Proper Noun Interpreter modules described above provide
 concept-relation-concept triples directly to the Relation-Concept Detector (RCD) module. In addition, the following
 RCD handlers are operative.

 One of the more distinctive aspects of the DR-LINK system is its capability of extracting relations and using them
 in the final CG representations of documents and topic statements. This module provides building
 blocks for the CG representation by generating concept-relation-concept triples based on the domain-independent
 knowledge bases we have been constructing with machine-readable resources and corpus statistics. In this module,
 there are several handlers that are activated selectively depending on the input sentence.
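
 The selective activation of handlers can be pictured with a small sketch. Everything in it is hypothetical:
 the Handler type, the applies/extract interface, and the toy "of" handler with its placeholder relation name
 are assumptions made for illustration and do not describe the actual RCD handlers.

```python
from typing import Callable, List, NamedTuple, Tuple

CRC = Tuple[str, str, str]  # (concept, relation, concept)


class Handler(NamedTuple):
    """Hypothetical handler: a trigger test plus a triple extractor."""
    name: str
    applies: Callable[[List[str]], bool]
    extract: Callable[[List[str]], List[CRC]]


def of_applies(tokens: List[str]) -> bool:
    # Fire only when "of" has a word on each side.
    return "of" in tokens[1:-1]


def of_extract(tokens: List[str]) -> List[CRC]:
    i = tokens.index("of", 1)
    # "TOY_REL" is a placeholder, not a relation from the source.
    return [(tokens[i - 1], "TOY_REL", tokens[i + 1])]


HANDLERS = [Handler("toy-of", of_applies, of_extract)]


def run_handlers(tokens: List[str], handlers: List[Handler]) -> List[CRC]:
    """Run only the handlers whose trigger matches the sentence."""
    triples: List[CRC] = []
    for handler in handlers:
        if handler.applies(tokens):
            triples.extend(handler.extract(tokens))
    return triples


result = run_handlers("acquisition of company".split(), HANDLERS)
```

 Separating the trigger test from the extraction step keeps each handler independent, so a sentence that
 matches no trigger simply yields no triples.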

 2.1.1. Case Frame (CF) Handler

 The main function of the CF Handler is to generate concept-relation-concept triples where one of the concepts
 typically comes from a verb. It identifies a verb in a sentence and connects it to other constituents surrounding
 the verb. Since the relations included in our representation (about 50 currently) originate from the theories of
 linguistic case roles (Somers, 1987, and Cook, 1989) and are all semantic in nature, this module consults the

