ASA 129th Meeting - Washington, DC - 1995 May 30 .. Jun 06

1pSC4. Automatic speech recognition using signal processing based on auditory physiology and perception.

Richard Stern

Dept. of Elec. and Comput. Eng., School of Comput. Sci., and Biomed. Eng. Program, Carnegie Mellon Univ., Pittsburgh, PA 15213-3890

Signal processing for automatic speech recognition has traditionally been inspired more by models of speech production than by models of auditory perception. While some aspects of human auditory processing have been implicit in traditional signal analysis for speech recognition, there is growing interest in the development of more computationally demanding signal processing strategies that are directly motivated by knowledge of auditory physiology and perception. The use of physiologically motivated signal processing has been shown to improve the accuracy of some automatic speech recognition systems, particularly in difficult acoustical environments. This talk will review and discuss the role that knowledge of human hearing has played in the design of speech recognition systems. The features common to auditory models that have emerged from various laboratories, as well as their differences, will be discussed. The recognition accuracy obtained using auditory models will be compared with the accuracy obtained using conventional signal analysis techniques, as well as the accuracy obtained using other approaches to robust recognition that are not physiologically based. Finally, an attempt will be made to identify aspects of monaural and binaural auditory processing that appear to be most helpful for robust recognition. [Work supported by ARPA.]