ASA 128th Meeting - Austin, Texas - 1994 Nov 28 .. Dec 02

2aSP5. Robust hands-free speech recognition.

Qiguang Lin

Chi Wei Che

James Flanagan

CAIP Ctr., Rutgers Univ., Piscataway, NJ 08855-1390

As speech recognition technology moves from the laboratory to real-world applications, the need for robustness increases. This paper describes a system of microphone arrays and neural networks (MANN) for robust hands-free speech recognition. The advantage of MANN is that existing speech recognition systems can be deployed directly in adverse practical environments where distant-talking sound pickup is required. Neither retraining nor modification of the recognizers is necessary. MANN consists of two synergistic components: (1) signal enhancement by microphone arrays and (2) feature adaptation by neural network computing. High-quality sound capture by the microphone array enables the neural network to adapt features successfully and so mitigate environmental interference. The neural network computation approximates a matched training and testing condition, which typically improves speech recognition performance. Both computer-simulated and real-room speech input are used to evaluate the capability of MANN. Measurements of isolated-word recognition in noisy, reverberant, and distant-talking conditions show that MANN yields a word recognition accuracy within 4% to 6% of that obtained under a close-talking condition in quiet.
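
The signal-enhancement component can be illustrated with a minimal sketch. The abstract does not say which array-processing method MANN uses, so the code below assumes a plain delay-and-sum beamformer as a stand-in. The function names (`delay_and_sum`, `simulate_array`, `snr_db`), the integer sample delays, the sine-wave source, and the noise level are all hypothetical choices made for illustration. Reverberation is not modeled.

```python
import math
import random


def delay_and_sum(channels, delays):
    """Align each channel by its known integer delay, then average.

    channels: list of equal-length sample lists, one per microphone
    delays:   arrival delay of the source at each microphone, in samples
    """
    num_mics = len(channels)
    out_len = len(channels[0]) - max(delays)
    return [
        sum(channels[m][delays[m] + n] for m in range(num_mics)) / num_mics
        for n in range(out_len)
    ]


def simulate_array(source, delays, noise_std, rng):
    """Assumed toy model: each microphone hears the source shifted by its
    delay, plus independent Gaussian noise (no reverberation)."""
    total = len(source) + max(delays)
    channels = []
    for d in delays:
        ch = [rng.gauss(0.0, noise_std) for _ in range(total)]
        for n, s in enumerate(source):
            ch[d + n] += s
        channels.append(ch)
    return channels


def snr_db(clean, noisy):
    """Signal-to-noise ratio in dB of `noisy` relative to `clean`."""
    signal_power = sum(c * c for c in clean)
    noise_power = sum((x - c) ** 2 for c, x in zip(clean, noisy))
    return 10.0 * math.log10(signal_power / noise_power)


rng = random.Random(0)
source = [math.sin(2 * math.pi * 5 * n / 400) for n in range(4000)]
delays = [0, 3, 6, 9, 12, 15, 18, 21]  # hypothetical 8-microphone array
channels = simulate_array(source, delays, noise_std=0.5, rng=rng)

single = channels[0][:len(source)]  # microphone 0 has zero delay
enhanced = delay_and_sum(channels, delays)
print(round(snr_db(source, single), 1), round(snr_db(source, enhanced), 1))
```

With independent noise on each microphone, averaging M aligned channels reduces the noise power by roughly a factor of M, so the enhanced signal should show a clearly higher SNR than any single microphone. A cleaner input of this kind is what the abstract credits with making the neural network's feature adaptation succeed.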