PPT Slide

Notes:

A part-of-speech (POS) tagger is employed to remove ambiguity when a lexical element contains several senses for a term. The tagger chooses the proper sense for the term given its context in the sentence. For example, if a term can be both a noun and a verb but is preceded by a determiner, the tagger will judge that this lexical element is a noun.
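The determiner rule described above can be sketched in a few lines. This is only an illustration of the idea, not the tagger used in this project: the `LEXICON` table, the tag names, and the `tag` function are all hypothetical, and a real tagger applies many more contextual rules or a statistical model.

```python
# Hypothetical lexicon: each term maps to the parts of speech it may take.
LEXICON = {
    "the": {"DET"},
    "a": {"DET"},
    "record": {"NOUN", "VERB"},
    "drink": {"NOUN", "VERB"},
}


def tag(tokens):
    """Assign one tag per token, using the previous tag as context."""
    tags = []
    for token in tokens:
        candidates = LEXICON.get(token.lower(), set())
        previous = tags[-1] if tags else None
        if not candidates:
            # Term is not in the illustrative lexicon.
            choice = "UNK"
        elif len(candidates) == 1:
            # Unambiguous term: nothing to resolve.
            choice = next(iter(candidates))
        elif previous == "DET" and "NOUN" in candidates:
            # The rule from the notes: after a determiner, prefer the noun sense.
            choice = "NOUN"
        else:
            # No rule applies, so the ambiguity is reported rather than guessed.
            choice = "|".join(sorted(candidates))
        tags.append(choice)
    return tags


print(tag(["the", "record"]))
print(tag(["record", "the", "drink"]))
```

In the second call the first "record" has no preceding determiner, so this sketch leaves it marked as ambiguous, while "drink" is resolved to a noun.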

Logically, the part-of-speech tagger is inserted at this point in the process. However, because we employ an independent tagger that works only on the original text, and because many of the contextual clues the tagger uses are also sentence boundary clues, the tagger can also be employed in the tokenization process. For efficiency, when we do employ the tagger, we will take advantage of its tokenization and sentence boundary identification.
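As a rough sketch of how tagger output could double as tokenization and sentence boundary identification: the code below assumes the tagger returns a list of (token, tag) pairs and marks sentence-final punctuation with the tag ".", as in the Penn Treebank tag set. The output format and the `split_sentences` helper are assumptions made for illustration; the actual tagger's interface is not described in these notes.

```python
def split_sentences(tagged):
    """Group (token, tag) pairs into sentences, ending each at a '.' tag."""
    sentences = []
    current = []
    for token, pos in tagged:
        current.append(token)
        if pos == ".":
            # The tagger has already decided this token ends a sentence.
            sentences.append(current)
            current = []
    if current:
        # Keep any trailing tokens that lack sentence-final punctuation.
        sentences.append(current)
    return sentences


# Hypothetical tagger output: "Dr." is tagged as a proper noun, so its
# period is not treated as a sentence boundary.
tagged = [
    ("Dr.", "NNP"), ("Smith", "NNP"), ("left", "VBD"), (".", "."),
    ("He", "PRP"), ("returned", "VBD"), (".", "."),
]
sentences = split_sentences(tagged)
print(sentences)
```

Because the boundaries come from the tags rather than from the characters, no separate abbreviation handling is needed in this step.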

A part-of-speech tagger client will be employed for this purpose, relying on a tagger server defined and developed outside this project.