ASA 125th Meeting, Ottawa, May 1993

2pSP15. Speech recognition in noise: Comparing an auditory model with human performance.

Georg F. Meyer
William A. Ainsworth

Dept. of Comput. Sci. and Dept. of Commun. and Neurosci., Keele Univ., Keele, Staffordshire ST5 5BG, UK

A potential application for models of the auditory system is as the front end to speech recognition systems. The model discussed here consists of a physiologically plausible model of the cochlear nerve and of the major neuron types in the cochlear nucleus, the next stage of information processing in the auditory pathway. The recognition task for the machine and human observers was the identification of 100-ms-long plosive-vowel combinations in three noise conditions: clean speech, and S/N ratios of 3 dB and 0 dB. Human listeners were asked to identify the combination under a multiple forced-choice regime, while the auditory model was used as a front end to a hidden Markov model. Two types of noise were used: continuous noise and gated noise coincident with the utterance. Physiological experiments show that continuous noise causes a substantial threshold shift, while short-duration coincident noise has no effect. Psychophysical experiments show that human recognition performance is also a function of the noise type. The performance of the auditory model with plausible threshold shifts is compared with human performance for both conditions.
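The two noise conditions can be illustrated with a minimal sketch of how a stimulus might be mixed with noise at a given S/N ratio. This is not the procedure used in the study: the white Gaussian noise, the RMS-based definition of S/N ratio measured over the utterance interval, and the names `rms`, `noise_gain_for_snr`, and `mix` are all assumptions made for illustration.

```python
import math
import random


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def noise_gain_for_snr(speech, noise, snr_db):
    """Gain that scales `noise` so that 20*log10(rms(speech)/rms(noise)) == snr_db."""
    return rms(speech) / (rms(noise) * 10 ** (snr_db / 20.0))


def mix(speech, snr_db, lead_in=0, gated=True, seed=0):
    """Return speech plus white Gaussian noise at the requested S/N ratio.

    gated=True  : noise is present only while the utterance is present.
    gated=False : noise also fills the `lead_in` samples before the utterance,
                  standing in for continuous background noise.
    """
    rng = random.Random(seed)
    total = lead_in + len(speech)
    noise = [rng.gauss(0.0, 1.0) for _ in range(total)]
    # Assumption: the S/N ratio is defined over the utterance interval only.
    gain = noise_gain_for_snr(speech, noise[lead_in:], snr_db)
    padded = [0.0] * lead_in + list(speech)
    out = []
    for i in range(total):
        noise_on = (not gated) or i >= lead_in
        out.append(padded[i] + (gain * noise[i] if noise_on else 0.0))
    return out
```

As a usage example under the same assumptions, a 100-ms stimulus at a hypothetical 16-kHz sampling rate is 1600 samples long; `mix(stimulus, 3.0, lead_in=800, gated=True)` and `mix(stimulus, 0.0, lead_in=800, gated=False)` would then give a gated 3-dB and a continuous 0-dB version, respectively.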