ASA 129th Meeting - Washington, DC - 1995 May 30 .. Jun 06

1aSC40. Use of TRACTTALK for adaptive voice mimic.

Qiguang Lin

Gael Richard

Jingyun Zou

Dan Sinder

James Flanagan

CAIP Ctr., Rutgers Univ., Piscataway, NJ 08855-1390

Many speech-processing technologies require a parametric representation of the speech waveform. Cepstrum coefficients (including their derivatives and variants) are to date commonly used in speech and speaker recognition. This paper seeks a more compact parametric description of speech information based on the adaptive voice mimic [Flanagan et al., 780--791 (1980)]. The mimic system uses an articulatory speech synthesizer to generate synthetic speech, which is adapted to arbitrary speech input. The perceptually weighted spectral difference between the input and the synthesized speech is minimized by optimizing the underlying articulatory parameters until the difference falls below a predetermined level. The resulting representation, adapted moment by moment, provides an efficient parametrization of the signal information, through which the problems of speech synthesis, speech recognition, and low bit-rate speech coding are brought together in a single compact framework. In this paper, an articulatory speech synthesizer, TRACTTALK, is first described. TRACTTALK simulates the vocal tract based on principles of linear acoustics and incorporates features that include interaction between the voice source and the vocal tract. Preliminary results of adaptive mimicking using TRACTTALK are then presented and discussed. [Work supported by ARPA Contract #DAST 63-93-C-0064.]
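The adaptation loop described in the abstract (synthesize, compare spectra, adjust parameters, stop once the difference is below a set level) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: toy_synth_spectrum is a hypothetical two-resonance stand-in and not TRACTTALK, the frequency weighting is an arbitrary choice and not the perceptual weighting used by the authors, and the coordinate search is a generic optimizer swapped in because the abstract does not name the optimization method.

```python
import math

FREQS = [100.0 * k for k in range(1, 41)]  # analysis grid, 100 Hz to 4 kHz


def toy_synth_spectrum(params, bandwidth=300.0):
    """Hypothetical stand-in for an articulatory synthesizer.

    Maps a list of resonance frequencies (Hz) to a log-magnitude
    spectrum (dB) on FREQS. This is NOT TRACTTALK.
    """
    spectrum = []
    for f in FREQS:
        level = sum(
            1.0 / (1.0 + ((f - fc) / (bandwidth / 2.0)) ** 2) for fc in params
        )
        spectrum.append(10.0 * math.log10(level + 1e-6))
    return spectrum


def weighted_spectral_distance(spec_a, spec_b):
    """Weighted mean squared difference between two dB spectra.

    The weights emphasize lower frequencies; this is an illustrative
    assumption, not the perceptual weighting of the original system.
    """
    weights = [1.0 / (1.0 + f / 1000.0) for f in FREQS]
    num = sum(w * (a - b) ** 2 for w, a, b in zip(weights, spec_a, spec_b))
    return num / sum(weights)


def mimic_frame(target_spec, init_params, threshold=0.01,
                step=200.0, min_step=1.0, max_iters=500):
    """Adapt synthesizer parameters to one frame of input.

    Coordinate search: try moving each parameter up or down by `step`,
    keep any move that lowers the error, and halve the step when no
    move helps. Stops when the error is at or below `threshold`, the
    step becomes too small, or the iteration budget is spent.
    """
    params = list(init_params)
    err = weighted_spectral_distance(toy_synth_spectrum(params), target_spec)
    iters = 0
    while err > threshold and step >= min_step and iters < max_iters:
        improved = False
        for i in range(len(params)):
            for delta in (step, -step):
                trial = list(params)
                trial[i] += delta
                trial_err = weighted_spectral_distance(
                    toy_synth_spectrum(trial), target_spec
                )
                if trial_err < err:
                    params, err, improved = trial, trial_err, True
        if not improved:
            step /= 2.0
        iters += 1
    return params, err, iters


if __name__ == "__main__":
    # The "input speech" frame here is itself synthetic, for demonstration.
    target = toy_synth_spectrum([500.0, 1800.0])
    params, err, iters = mimic_frame(target, [700.0, 1500.0])
    print(params, err, iters)
```

In a moment-by-moment setting, the parameters found for one frame would plausibly serve as the starting point for the next, so that only a few parameter values per frame need to be stored or transmitted; that usage is likewise an assumption about how such a loop would be deployed.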