ASA 124th Meeting New Orleans 1992 October

4aSP6. A model of concurrent vowel segregation using an implementation of the subtraction strategy.

Andrew P. Lea

ATR Human Inform. Process. Res. Labs., 2-2 Hikaridai, Seika-cho, Kyoto 619-02, Japan

The perceptual experiments performed by Lea ["Auditory Modeling of Vowel Perception," unpublished doctoral thesis, University of Nottingham (1992)] required listeners to identify the members of pairs of steady-state synthetic vowels presented monaurally, called concurrent vowels. Identification was more accurate when one vowel was voiced and the other whispered than in control conditions in which both were whispered or both were voiced with the same fundamental frequency (f0). Surprisingly, the improvement in accuracy was restricted to the whispered member of the voiced/whispered pair. This outcome is compatible with the idea that one strategy listeners use to recover a target voice from a mixture of voices is to cancel an interfering voice by a process of spectrotemporal subtraction. Here, a computational model is described that implements this strategy. The model is based on a bank of bandpass filters and an array of autocorrelators. The period of the dominant pitch in a stimulus is estimated from the largest peak in the normalized sum of the autocorrelation functions. This estimate guides the synthesis of an array of autocorrelation functions that is subtracted from the array generated by the stimulus, thereby canceling an interfering voice. Following Meddis and Hewitt [J. Acoust. Soc. Am. 91, 233-245 (1992)], vowel identification is performed by matching the short-time part of the resulting summed autocorrelation functions to a set of stored templates. The model makes good qualitative predictions of the accuracy of listeners' identification responses to the stimuli used in the perceptual experiments. In addition, it makes good quantitative predictions of the effects of introducing a difference in f0 between concurrent vowels reported previously by Assmann and Summerfield [J. Acoust. Soc. Am. 88, 680-697 (1990)]. [Work supported by the MRC of the UK.]
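
The pitch-estimation step described above (taking the largest peak in the normalized sum of the per-channel autocorrelation functions) can be illustrated with a minimal sketch. This is not the author's implementation. The function names, the 8 kHz sampling rate, the 100 Hz test signal, the lag search range, and the use of harmonic groups as stand-ins for bandpass-filter outputs are all assumptions made for illustration. The subtraction and template-matching stages are not shown, because the abstract does not specify them in enough detail.

```python
import math

def autocorrelation(x, max_lag, window):
    """Short-time autocorrelation of x for lags 0..max_lag over a fixed window.

    Requires len(x) >= window + max_lag.
    """
    return [sum(x[n] * x[n + lag] for n in range(window))
            for lag in range(max_lag + 1)]

def summed_acf(channels, max_lag, window):
    """Sum the autocorrelation functions of all channels, then normalize
    so that the value at lag 0 is 1 (assumes the input is not silent)."""
    total = [0.0] * (max_lag + 1)
    for x in channels:
        for lag, value in enumerate(autocorrelation(x, max_lag, window)):
            total[lag] += value
    return [value / total[0] for value in total]

def dominant_period(sacf, min_lag, max_lag):
    """Return the lag (in samples) of the largest value in the search range."""
    return max(range(min_lag, max_lag + 1), key=lambda lag: sacf[lag])

# Hypothetical test stimulus: a 100 Hz harmonic complex sampled at 8 kHz.
# Each "channel" holds two adjacent harmonics, a crude stand-in for the
# output of one bandpass filter (an assumption, not the model's filterbank).
fs = 8000
f0 = 100.0
max_lag = 100          # longest period considered: 100 samples (80 Hz)
min_lag = 20           # shortest period considered: 20 samples (400 Hz)
window = 320           # analysis window in samples (four periods of f0)
n_samples = window + max_lag

channels = [
    [sum(math.sin(2 * math.pi * h * f0 * n / fs) for h in group)
     for n in range(n_samples)]
    for group in [(1, 2), (3, 4), (5, 6)]
]

sacf = summed_acf(channels, max_lag, window)
period = dominant_period(sacf, min_lag, max_lag)
estimated_f0 = fs / period
print(period, estimated_f0)
```

In this sketch the search range excludes lag 0, where every autocorrelation function has its trivial maximum, so the reported peak corresponds to a pitch period rather than to signal energy.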