******************************************************************************

ITL Hosts Large Vocabulary Conversational Speech Recognition Workshop

ITL's Information Access Division (IAD) hosted the 2001 Workshop on Large Vocabulary Conversational Speech Recognition (LVCSR) on May 3-4 in Linthicum, Maryland. Workshop participants reviewed the results of this year's evaluation, which IAD conducted in cooperation with DoD sponsors. Participating sites developed systems that automatically generate word-level transcriptions of recorded telephone conversations; these transcriptions were scored against official reference transcriptions produced by IAD linguist Bruce Lund. Eight research groups from the United States and Europe participated in parts of the evaluation and discussed their work at the workshop. This year's test set was the largest to date, comprising 60 five-minute telephone conversations drawn from three different corpora. One of these was the new Switchboard-Cellular Corpus, marking the first time an evaluation has included cellular telephone data. Mark Przybocki and Alvin Martin of IAD presented an analysis of the evaluation results and compared them with those of previous evaluations. The best systems achieved the lowest word error rates ever recorded on the Switchboard-2 Corpus. As in recent years, the system with the lowest word error rate on each corpus was the one developed by the HTK group of the Engineering Department at Cambridge University.

As in the 2000 evaluation, a separate non-competitive evaluation was conducted on a subset of the test data, with sites asked to generate transcriptions at both the word and phonetic levels. Researchers at the International Computer Science Institute (ICSI) in Berkeley, California, analyzed these results and presented their findings at the workshop.

For more details, see the Web site: http://www.nist.gov/speech/tests/ctr/h5_2001/index.htm

 

CONTACT: Alvin Martin, ext. 3169