a) The Phrase Dictionaries

Both the regular and the null thesauruses are based on entries corresponding either to single words or to single word stems. In attempting to perform a subject analysis of written text, it is possible, however, to go further by trying to locate "phrases" consisting of sets of words which are judged to be important in a given subject area. For example, in the field of computer science, the concepts of "program" and "language" may mean many things to many people. On the other hand, the phrase concept which results from a combination of these individual words, that is, "programming language", has a much more specific connotation. Such phrases can be used for subject identification by building phrase dictionaries to be used in locating combinations of concepts, rather than individual concepts alone. Such phrase dictionaries would then normally include pairs, triples, or quadruples of words or concepts, corresponding in written texts to the more likely noun and prepositional phrases which may be expected to be indicative of subject content in a given topic area.

Many different strategies can be used in the construction of phrase dictionaries. For example, it is possible to base phrase dictionaries on combinations of high-frequency words or word stems occurring in documents and search requests; alternatively, one may want to use a thesaurus before appeal is made to a phrase dictionary. Under those circumstances, the phrase dictionary would then be based on combinations of concept categories included in the thesaurus, rather than on combinations of words.

Furthermore, given the availability of a phrase dictionary, one can recognize the presence of phrases in a given text under a variety of circumstances: for example, the existence of a phrase may be recognized whenever the phrase components are present within a given document, regard-