5aSC13. Identification of vowels based on visual cues within raw complex speech waveforms.

Session: Friday Morning, May 17

Author: Michael A. Stokes
Location: Indiana State Dept. of Health, P. O. Box 241153, Indianapolis, IN 46224


Testing was performed to demonstrate that subjects can identify vowels from visual displays of raw complex waveforms. Two Midwestern males produced nine vowels [Peterson and Barney, J. Acoust. Soc. Am. 24, 175--184 (1952)] for two identification trials. In each trial, an experimenter presented the nine vowels in random order. Subject MS correctly identified five out of nine vowels in both trials. Identification was accomplished by recognizing temporal interactions between F0 and F1 that provide categorical boundaries across the vowel space. The presence or absence of F2 in the region of 2000 Hz further distinguishes the members of a categorical pair. When formant values and articulatory data are organized by the categories seen within speech waveforms, the vowel space resembles the pairings of the stop consonants /b/--/p/, /d/--/t/, and /g/--/k/. Tongue position (reflected in F1 values) categorizes a vowel just as place of articulation categorizes stop consonants. Lip position (reflected in F2 frequency) distinguishes vowels within a category just as voicing distinguishes the members of these consonant pairs. The details and potential success of a new model of vowel perception based on these findings will be discussed.

from ASA 131st Meeting, Indianapolis, May 1996