ASA 129th Meeting - Washington, DC - 1995 May 30 .. Jun 06
1aSC40. Use of TRACTTALK for adaptive voice mimic.
Qiguang Lin
Gael Richard
Jingyun Zou
Dan Sinder
James Flanagan
CAIP Ctr., Rutgers Univ., Piscataway, NJ 08855-1390
Various speech-processing technologies necessitate parametrization of the
speech waveform. Cepstrum coefficients (including their derivatives and
variants) are to date commonly used in speech and speaker recognition. This
paper seeks a more compact parametric description of speech information based on
the adaptive voice mimic [Flanagan et al., 780--791 (1980)]. The mimic
system utilizes an articulatory-based speech synthesizer to generate synthetic
speech, which is adapted to arbitrary speech input. The perceptually weighted
spectral difference between the input and synthesized speech is next minimized
by optimizing the underlying articulatory parameters until the difference is
driven below a predetermined level. The resultant representation, adapted
moment by moment, provides an efficient parametrization of the signal information
by which the problems of speech synthesis, speech recognition, and low bit-rate
speech coding are coalesced into a compact framework. In this paper, an
articulatory speech synthesizer, TRACTTALK, is first described. TRACTTALK
simulates the vocal tract based on principles of linear acoustics and
incorporates features which include interaction between the voice source and
the vocal tract. Preliminary results of adaptive mimicking using TRACTTALK are
presented and discussed. [Work supported by ARPA Contract #DAST 63-93-C-0064.]
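
The adaptation loop described above (synthesize, compare spectra, adjust parameters, repeat until the error falls below a preset level) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: toy_synth is a hypothetical two-parameter stand-in and not TRACTTALK, the frequency weighting is an arbitrary choice rather than the perceptual weighting used in the paper, and the simple coordinate search is swapped in because the abstract does not name the optimizer.

```python
import math

# Analysis grid, 100 Hz .. 4 kHz (assumed for illustration).
FREQS = [100.0 * k for k in range(1, 41)]


def toy_synth(params):
    """Hypothetical stand-in for an articulatory synthesizer (NOT TRACTTALK).

    Maps two resonance-like parameters (Hz) to a log-magnitude envelope in dB.
    """
    envelope = []
    for f in FREQS:
        mag = 0.0
        for fc in params:
            bandwidth = 100.0  # assumed fixed bandwidth in Hz
            mag += 1.0 / (1.0 + ((f - fc) / bandwidth) ** 2)
        envelope.append(10.0 * math.log10(mag + 1e-6))
    return envelope


def weighted_distance(target, candidate):
    """Weighted mean squared log-spectral difference.

    The weight favors low frequencies; it is an assumed placeholder for a
    perceptual weighting.
    """
    total = 0.0
    weight_sum = 0.0
    for f, t, c in zip(FREQS, target, candidate):
        w = 1.0 / (1.0 + f / 1000.0)
        total += w * (t - c) ** 2
        weight_sum += w
    return total / weight_sum


def mimic(target, start, threshold=1e-3, step=200.0, min_step=0.01,
          max_iters=1000):
    """Adapt synthesizer parameters to a target spectrum by coordinate search.

    Stops when the error is at or below `threshold`, when the step size
    shrinks below `min_step`, or after `max_iters` passes.
    """
    params = list(start)
    err = weighted_distance(target, toy_synth(params))
    for _ in range(max_iters):
        if err <= threshold or step < min_step:
            break
        improved = False
        for i in range(len(params)):
            for delta in (step, -step):
                trial = list(params)
                trial[i] += delta
                trial_err = weighted_distance(target, toy_synth(trial))
                if trial_err < err:  # accept only strict improvements
                    params, err, improved = trial, trial_err, True
        if not improved:
            step /= 2.0  # refine the search when no move helps
    return params, err


# Usage: the "input speech" here is itself a toy spectrum, so the loop has a
# reachable target. Real input would come from analysis of a speech frame.
target_spectrum = toy_synth([700.0, 1200.0])
fitted_params, final_err = mimic(target_spectrum, [500.0, 1500.0])
print(fitted_params, final_err)
```

In this sketch the fitted parameter vector plays the role of the compact representation: one small set of numbers per frame, re-estimated moment by moment. A search of this kind can stall in a local minimum, so the returned error should be checked against the threshold rather than assumed to have reached it.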