2aSC1 Speechreading aids based on automatic speech recognition: Prospects

ASA 129th Meeting - Washington, DC - 1995 May 30 .. Jun 06

2aSC1. Speechreading aids based on automatic speech recognition: Prospects for the automatic generation of cued speech.

L. D. Braida

R. M. Uchanski

L. A. Delhorne

Res. Lab. of Electron., MIT, Cambridge, MA 02139

Current progress in the development of automatic speech recognition (ASR) systems may soon permit discrete symbolic speechreading supplements to be derived from the speech signal. Such supplements could be similar to those used in manual cued speech, in which the talker uses discrete hand positions and shapes to provide distinctions between consonants and vowels that are often confused in speechreading. Highly trained receivers of manual cued speech can achieve nearly perfect reception of everyday connected speech materials at normal speaking rates through the visual sense alone. To understand the accuracy that might be achieved with automatically generated cues, we measured how well trained spectrogram readers and an automatic speech recognizer could assign cues for various cue systems. A model of audiovisual integration was then applied to these measurements and to published data on human recognition of consonant and vowel segments via speechreading. This analysis suggests that with cues derived from current recognizers, consonant and vowel segments can be received with accuracies in excess of 80%, roughly equivalent to the segment reception accuracy required to account for observed levels of manual cued speech reception. To provide guidance for the development of automatic cueing systems, we describe techniques for determining optimum cue groups for a given recognizer and speechreader, and estimate the cueing performance that might be achieved if the performance of current recognizers were improved.
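
The principle behind a cue system, that segments which look alike on the lips should fall into different cue groups, can be illustrated with a minimal sketch. The viseme classes, cue labels, and the function name `unresolved_pairs` below are hypothetical and chosen only for illustration; they are not the cue systems, data, or optimization technique of the abstract.

```python
from itertools import combinations

# Hypothetical viseme classes: consonants assumed to look alike on the lips.
# Illustrative only; not taken from the abstract.
VISEMES = [
    {"p", "b", "m"},
    {"f", "v"},
    {"t", "d", "n"},
]


def unresolved_pairs(visemes, cue_of):
    """Return consonant pairs that share both a viseme class and a cue group.

    Such pairs remain ambiguous to the speechreader even with perfect cueing.
    """
    pairs = []
    for group in visemes:
        for a, b in combinations(sorted(group), 2):
            if cue_of[a] == cue_of[b]:
                pairs.append((a, b))
    return pairs


# Hypothetical cue assignments: one cue label per consonant.
good_cues = {"p": 1, "b": 2, "m": 3, "f": 1, "v": 2, "t": 3, "d": 1, "n": 2}
bad_cues = dict(good_cues, b=1)  # "p" and "b" now share a cue

print(unresolved_pairs(VISEMES, good_cues))
print(unresolved_pairs(VISEMES, bad_cues))
```

A grouping with no unresolved pairs disambiguates every segment only when cues are assigned without error. With a real recognizer the cues are sometimes wrong, which is why the choice of cue groups depends on the confusion patterns of both the recognizer and the speechreader.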