ASA 125th Meeting Ottawa 1993 May

3pSP1. Evaluation of objective vowel classifiers.

J. D. Miller

F. E. Kramer

D. J. Meyer

S. Lee

R. M. Uchanski

Central Inst. for the Deaf, 818 S. Euclid, St. Louis, MO 63110

Four classifiers were examined for their ability to identify nine monophthongal American English vowels. The classifiers, (1) Bayesian, (2) a standard back-propagation neural net with one hidden layer, (3) a modified ellipse method, and (4) an automatic region-drawing method, operated on two-dimensional vowel representations. Additionally, three different types of two-dimensional data were evaluated: (a) (log F1, log F2), (b) (norm log F1, norm log F2) normalized by a sensory reference [Miller, J. Acoust. Soc. Am. 85, 2114--2134 (1989)], and (c) (x',y') of the auditory-perceptual space [Miller, J. Acoust. Soc. Am. 85, 2114--2134 (1989)]. Our corpus of 2304 vowels spoken in a CVC context by male and female talkers includes stress and speaking-rate variations [Fourakis, J. Acoust. Soc. Am. 90, 1816--1827 (1991)]. For each vowel utterance and each data type, a single two-dimensional data point is computed from the vocalic segment selected by Fourakis. Separate training (75% of the data) and testing (25%) subsets of the corpus were used in a jackknife procedure. In general, all the classifiers except the ellipse method performed similarly, and obtained the highest scores using (x',y') data. However, unless training and testing of the classifiers are restricted to vowels that are relatively steady-state, identification scores are less than ideal (<90%).  [Work supported by NIDCD.]
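The simplest of the four classifiers, the Bayesian method on two-dimensional data, can be sketched as follows. This is an illustrative reconstruction, not the authors' code: it fits a bivariate Gaussian to each vowel class's training tokens, splits each class 75%/25% as in the abstract's procedure, and labels each held-out point by maximum likelihood. The vowel labels and cluster parameters below are hypothetical stand-ins for (log F1, log F2) data.

```python
import math
import random

def fit_gaussian(points):
    """Mean and full 2x2 covariance of one vowel class's training points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    return (mx, my), (sxx, sxy, syy)

def log_likelihood(pt, mean, cov):
    """Log bivariate Gaussian density (dropping the constant term)."""
    (mx, my), (sxx, sxy, syy) = mean, cov
    det = sxx * syy - sxy * sxy
    dx, dy = pt[0] - mx, pt[1] - my
    # Mahalanobis distance via the closed-form 2x2 inverse covariance.
    maha = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return -0.5 * (maha + math.log(det))

def bayes_accuracy(data, train_frac=0.75, seed=0):
    """Split each class 75/25, fit a Gaussian on the training part,
    and score the held-out 25% with a maximum-likelihood decision."""
    rng = random.Random(seed)
    models, test = {}, []
    for vowel, pts in data.items():
        pts = pts[:]
        rng.shuffle(pts)
        cut = int(len(pts) * train_frac)
        models[vowel] = fit_gaussian(pts[:cut])
        test += [(vowel, p) for p in pts[cut:]]
    correct = sum(
        1 for vowel, p in test
        if max(models, key=lambda v: log_likelihood(p, *models[v])) == vowel
    )
    return correct / len(test)

# Hypothetical, well-separated clusters standing in for (log F1, log F2).
rng = random.Random(1)
data = {
    "i": [(rng.gauss(2.45, 0.03), rng.gauss(3.35, 0.03)) for _ in range(60)],
    "a": [(rng.gauss(2.85, 0.03), rng.gauss(3.05, 0.03)) for _ in range(60)],
    "u": [(rng.gauss(2.50, 0.03), rng.gauss(2.95, 0.03)) for _ in range(60)],
}
print(bayes_accuracy(data))
```

A full jackknife would repeat the split with several random partitions and average the scores; a single split is shown here for brevity.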