L. Deng and D. Braam
Dept. of Elec. and Comput. Eng., Univ. of Waterloo, ON N2L 3G1, Canada
In the work by Sussman et al. [J. Acoust. Soc. Am. 90, 1309--1325 (1991)], locus equations, which describe linear relationships between the onset and steady-state formant values in consonant--vowel syllables, were tested experimentally on a large quantity of acoustic data and proposed as a source of relational invariance for stop place categorization. This presentation describes a statistical model that uses the locus equations as a basis for parametric modeling of phonetic contexts (place of articulation) and of their acoustic consequences (formant transitions). The model is based on a hidden Markov model representation of formant-transition microsegments of speech. A generalized expectation-maximization algorithm is developed for automatic estimation of the model parameters. The proposed model can generalize consonant characteristics from a small training data set in which the contextual information is only sparsely represented, and is hence applicable to very large vocabulary speech recognition problems. Results from vowel classification experiments (TIMIT database), comparing the performance of this locus-based model with that of the conventional hidden Markov model, will be presented.
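As an illustration of the linear relationship that a locus equation describes, the following minimal sketch fits onset formant values against steady-state values by ordinary least squares. It is not the hidden Markov model or the estimation algorithm of this abstract. The function names `fit_locus_equation` and `predict_onset` are hypothetical, and the formant values are synthetic numbers generated from an assumed slope and intercept, not measurements.

```python
def fit_locus_equation(f2_vowel, f2_onset):
    """Least-squares fit of f2_onset = slope * f2_vowel + intercept.

    Both arguments are sequences of formant frequencies in Hz, one pair
    per consonant-vowel token sharing the same stop place (assumed).
    """
    n = len(f2_vowel)
    if n < 2 or n != len(f2_onset):
        raise ValueError("need two or more paired measurements")
    mean_x = sum(f2_vowel) / n
    mean_y = sum(f2_onset) / n
    sxx = sum((x - mean_x) ** 2 for x in f2_vowel)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(f2_vowel, f2_onset))
    if sxx == 0:
        raise ValueError("steady-state values must not all be equal")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_onset(slope, intercept, f2_vowel_hz):
    """Predict the onset formant for an unseen vowel context."""
    return slope * f2_vowel_hz + intercept


# Synthetic data (assumption): onsets generated from slope 0.5, intercept 900 Hz.
vowel = [800.0, 1200.0, 1600.0, 2000.0, 2400.0]
onset = [0.5 * v + 900.0 for v in vowel]
slope, intercept = fit_locus_equation(vowel, onset)
```

Because two parameters summarize a whole family of vowel contexts for one place of articulation, a fitted line of this kind can predict the onset for a vowel that never occurred in training, which is the sense in which a locus-based parameterization helps when contextual information is sparse.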