
The SPECIALIST Lexicon

A major component of the Lexical Systems Project is the SPECIALIST lexicon, a large syntactic lexicon of medical terminology. Lexical items are collected into unit records containing morphological, syntactic, and spelling information about each item. The lexicon will contain over 125,000 lexical items in 119,000 records in the 1999 UMLS release. Morphological information includes full inflectional information about each item. A database of derivational relationships is also maintained with the lexicon. Morphological and spelling information is important for matching lexical items and forms a major part of the capability of the Lexical tools. Syntactic information includes complement patterns for verbs and sequencing information for adjectives. Verb complement information is central to the analysis of sentence structure.
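To make the idea of a unit record more concrete, the following is a minimal sketch in Python, not the actual SPECIALIST record format. The class name, field names, and the sample entry are all assumptions chosen for illustration. It shows how spelling variants and inflected forms stored in a record could be used to match a surface form back to its lexical item.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class LexicalRecord:
    """Hypothetical unit record; field names are illustrative only."""
    base: str                                   # base (citation) form
    category: str                               # syntactic category, e.g. "verb"
    spelling_variants: list[str] = field(default_factory=list)
    inflections: list[str] = field(default_factory=list)
    complements: list[str] = field(default_factory=list)  # verb complement patterns

    def forms(self) -> set[str]:
        """Return every lowercased surface form that should match this record."""
        all_forms = [self.base, *self.spelling_variants, *self.inflections]
        return {form.lower() for form in all_forms}


def build_index(records: list[LexicalRecord]) -> dict[str, list[LexicalRecord]]:
    """Map each known surface form to the records that list it."""
    index: dict[str, list[LexicalRecord]] = {}
    for record in records:
        for form in record.forms():
            index.setdefault(form, []).append(record)
    return index


# Sample entry invented for this sketch; it is not taken from the lexicon.
records = [
    LexicalRecord(
        base="anesthetize",
        category="verb",
        spelling_variants=["anaesthetize"],
        inflections=["anesthetizes", "anesthetized", "anesthetizing"],
        complements=["noun-phrase object"],
    ),
]

index = build_index(records)

# A spelling variant resolves to the same record as the base form.
matches = index.get("Anaesthetize".lower(), [])
print([record.base for record in matches])
```

Indexing every form up front keeps lookup to a single dictionary access, which is one simple way to let spelling and inflectional information drive matching.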

The SPECIALIST lexicon has been built and maintained by Mr. Allen Browne along with a variety of expert consultants in several lexicon building projects. Lexicon building tools have been used to facilitate lexicon building and maintain the consistency of the lexicon. Ms. Van Nguyen wrote and maintains the Web-based lexicon building tool, LexBuild. This tool enforces a complete and consistent form for lexical entries, and provides users with a menu-based approach to entering lexical information. A "lexicon grammar" specifying the correct form of lexical records ensures that lexical records created by the tools are correctly formed. The first large-scale thrust in lexicon building was in 1990, when five expert consultants worked off-site using government-supplied computer equipment and software (a PC port of Lextool) to expand the lexicon. This effort brought the lexicon to over 40,000 lexical records. In 1994, when the lexicon was first released as one of the UMLS Knowledge Sources, it contained approximately 64,000 lexical records. Another highly successful lexicon building project in 1995 increased the size of the lexicon to over 80,000 items.

An ongoing lexicon building project with consultants Dr. Susan Hoyle and Dr. Lynn McCreedy has brought the lexicon to its current size. The lexicon consultants are working off-site with LexBuild. New lexical records are downloaded to NLM via the internet bi-weekly. This project is scheduled to continue until April 1999. Each of the lexicon building projects has emphasized quality control and correction of errors in the existing lexicon as well as growth. The current effort has included a review of spelling variation within the lexicon and a review of the morphological databases. Ms. Amy Markey, a summer student, also contributed to quality control of the lexicon by reviewing the lexicon's treatment of irregular verbs. Dr. McCreedy and Dr. Hoyle are actively engaged in adding new terms to the lexicon as well as discovering and correcting lexicon errors. Both consultants are professional linguists with broad knowledge of the issues involved in lexicon coding and experience as consultants in previous SPECIALIST lexicon building projects.

A word list derived in part from the SPECIALIST lexicon is now used in the library's system for semi-automatic entry of journal article data into MEDLINE. Experiments leading to this choice of word lists are described in a paper entitled "Lexicon Assistance Reduces Manual Verification of OCR Output", presented at the Eleventh IEEE Symposium on Computer-Based Medical Systems.

Last updated: Tuesday, 20-Nov-2001 14:40:23 EST
Please send comments/corrections to the document author.
This page: http://lhncbc.nlm.nih.gov/cgsb/research/nls/lex/specialist/

