ASA 130th Meeting - St. Louis, MO - 1995 Nov 27 .. Dec 01

4pSC16. Dynamic information for vowel identity is formant-based, while steady-state information is based on spectral shape.

Fred Cummins

Depts. of Linguist. and Cognitive Sci., Indiana Univ., Bloomington, IN 47405

Recent vowel research has attempted to identify a canonical set of acoustic parameters that best supports vowel categorization. Other work has argued that the speaker-independent information specifying vowel identity is time-varying rather than static. The present study examines the possibility that these two research issues are related in complex ways. Recurrent neural networks were trained to identify vowels based on one of two types of time series: either spectral-shape (PLP) representations or formant peak (F1, F2, and fundamental frequency) representations. Networks were trained using inputs that reproduced the dynamics of vowels excised from continuous speech. The trained networks were then tested with both static and time-varying vowel tokens. When tested on time-varying tokens, networks trained on formant information outperformed those trained on spectral-shape information. However, when tested on stimuli lacking dynamic information, the formant-based networks suffered more from the absence of time-varying information than did the PLP-based networks. This suggests that time-varying and steady-state information for vowel identity may not share a single "best" representation. [Work supported by ONR.]
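The contrast between time-varying and static test tokens can be illustrated with a minimal sketch. The abstract does not say how the static tokens were constructed, so the approach here (holding the temporal midpoint frame constant for the full token duration) is an assumption, as are the function names `make_static_token` and `total_frame_change` and the frame values, which are invented for illustration and are not data from the study.

```python
from typing import List

Frame = List[float]  # one time step of features, e.g. [F1, F2, F0] or PLP coefficients


def make_static_token(token: List[Frame]) -> List[Frame]:
    """Return a token of the same length in which every frame is a copy
    of the temporal midpoint frame (assumed construction, not the study's)."""
    if not token:
        raise ValueError("token must contain at least one frame")
    mid = token[len(token) // 2]
    return [list(mid) for _ in token]


def total_frame_change(token: List[Frame]) -> float:
    """Sum of absolute frame-to-frame feature differences.
    This is 0.0 when the token carries no dynamic information."""
    return sum(
        abs(b - a)
        for prev, nxt in zip(token, token[1:])
        for a, b in zip(prev, nxt)
    )


# Hypothetical 5-frame token; each frame is [F1 (Hz), F2 (Hz), F0 (Hz)].
dynamic = [
    [300.0, 2200.0, 120.0],
    [350.0, 2100.0, 119.0],
    [400.0, 2000.0, 118.0],
    [450.0, 1900.0, 117.0],
    [500.0, 1800.0, 116.0],
]
static = make_static_token(dynamic)
```

Under this assumed construction, a network tested on `static` sees the same duration as for `dynamic` but receives no trajectory information, which is the condition in which the formant-based networks are reported to degrade more than the PLP-based ones.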