ASA 125th Meeting Ottawa 1993 May

2pSP11. Fourier descriptors for time domain labeling of multi-speaker voiced/unvoiced stop consonants.

Jack Green

Dept. of Comput. Sci., Western Michigan Univ., Kalamazoo, MI 49007

Ben Pinkowski

Dept. of Comput. Sci., Western Michigan Univ., Kalamazoo, MI 49007

Fourier descriptors (FDs), shape features commonly used in computer image analysis, were used to describe the time-domain waveform shapes of speech signals. The signals examined were initial stop consonants in continuous speech samples from five speakers in the TIMIT corpus. Each stop was extracted from a consonant/vowel (CV) combination, and information contained in the vowel portion of the utterance was not used. Data included with the corpus were used to segment the sounds. Sixty-four FDs were calculated for each of the 58 samples used in the study. These FDs were used to label each stop as either voiced or unvoiced. Preliminary results showed that 61% of the stops were correctly labeled. Because time-domain information requires less computational effort to obtain than frequency-domain information, a labeling could be obtained while the frequency-domain information is being calculated. The labeling could then be used to limit the search space in the frequency domain. Despite the limitations of working in the time domain, it is possible that the rate of correct labeling will approach 75%. Present work is examining the use of neural networks as a method of increasing correct classification.
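
The abstract does not state how the FDs were computed from the waveform, so the following is only an illustrative sketch of one common image-analysis formulation carried over to a one-dimensional signal. It assumes the segmented stop is treated as a curve in the complex plane (normalized time on the real axis, normalized amplitude on the imaginary axis), that the zeroth DFT coefficient is dropped for translation invariance, and that the remaining magnitudes are divided by the magnitude of the first harmonic for scale invariance. The function name `fourier_descriptors`, the normalization choices, and the synthetic test burst are all hypothetical and are not taken from the study.

```python
import cmath
import math


def fourier_descriptors(samples, n_desc=64):
    """Return n_desc normalized Fourier descriptor magnitudes for one segment.

    Assumed formulation (not confirmed by the abstract): the waveform is the
    curve z[k] = t_k + j*x_k, and descriptors are |Z[m]| / |Z[1]| for m >= 2.
    """
    n = len(samples)
    if n < n_desc + 2:
        raise ValueError("segment too short for the requested number of descriptors")

    # Remove the DC offset and normalize by peak amplitude so that the
    # descriptors do not depend on recording level.
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    peak = max(abs(s) for s in centered) or 1.0

    # Build the complex curve: real part is time scaled to [0, 1],
    # imaginary part is the normalized amplitude.
    curve = [complex(k / (n - 1), centered[k] / peak) for k in range(n)]

    def dft(m):
        # Direct evaluation of the m-th DFT coefficient of the curve.
        return sum(z * cmath.exp(-2j * math.pi * m * k / n)
                   for k, z in enumerate(curve))

    scale = abs(dft(1))
    if scale == 0.0:
        raise ValueError("degenerate segment: first harmonic is zero")

    # Skip m = 0 (translation) and m = 1 (used as the scale reference).
    return [abs(dft(m)) / scale for m in range(2, n_desc + 2)]


if __name__ == "__main__":
    # Synthetic decaying burst standing in for a segmented stop (hypothetical data).
    burst = [math.exp(-k / 40.0) * math.sin(2 * math.pi * k / 9.0)
             for k in range(256)]
    print(len(fourier_descriptors(burst)))
```

Under these assumptions, each of the 58 samples would yield a 64-element feature vector that a voiced/unvoiced labeler could consume. The direct DFT evaluation is used here only to keep the sketch dependency-free; an FFT routine would be the usual choice for longer segments.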