4aSC26. HMM-based speech synthesis with various voice characteristics.

Session: Thursday Morning, December 5


Author: Takashi Masuko
Location: Precision and Intelligence Lab., Tokyo Inst. of Technol., 4259, Nagatsuta, Midori-ku, Yokohama, 226 Japan
Author: Keiichi Tokuda
Location: Nagoya Inst. of Tech., Gokiso-cho, Showa-ku, Nagoya, 466 Japan
Author: Takao Kobayashi
Location: Tokyo Inst. of Technol., Yokohama, 226 Japan
Author: Satoshi Imai
Location: Tokyo Inst. of Technol., Yokohama, 226 Japan


This paper presents a text-to-speech synthesis system based on continuous-density HMMs that can synthesize speech with various voice characteristics. An algorithm for speech parameter generation from HMMs has been proposed, and it is shown that, by using differential parameters as dynamic features, a smoothly varying speech parameter sequence can be generated according to the statistics of the static and dynamic features modeled by the HMMs. A framework for an HMM-based text-to-speech synthesis system using this algorithm is also shown. An approach to voice conversion for this system is then described. In this approach, voice conversion is achieved by changing the parameters of the HMMs that serve as speech units in the system. To transform the voice characteristics of the synthesized speech into those of a target speaker, a speaker adaptation technique is applied to the HMMs. Subjective experiments show that speech with various voice characteristics can be synthesized easily by transforming the HMM parameters. The relation between the resulting voice characteristics and the amount of adaptation data is also investigated.
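
The role of the dynamic features in parameter generation can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on assumptions not stated in the abstract: a single scalar parameter per frame, a state sequence that is already fixed (so each frame has a known Gaussian mean and variance), diagonal covariances, and a delta feature defined as the backward difference c[t] - c[t-1] with no delta constraint on the first frame. The function name `generate_trajectory` and all argument names are hypothetical. Under these assumptions the most likely static sequence is the solution of a weighted least-squares problem.

```python
import numpy as np

def generate_trajectory(mu_static, var_static, mu_delta, var_delta):
    """Most likely static sequence given per-frame Gaussian statistics
    of static and delta features (hypothetical helper, see assumptions)."""
    mu_static = np.asarray(mu_static, dtype=float)
    mu_delta = np.asarray(mu_delta, dtype=float)
    T = len(mu_static)

    # Delta window as a matrix: (D @ c)[t] = c[t] - c[t-1].
    # Assumption: the first frame has no predecessor, so its row is zero.
    D = np.eye(T) - np.eye(T, k=-1)
    D[0, :] = 0.0

    # W maps the static sequence c to the stacked [static; delta] observation.
    W = np.vstack([np.eye(T), D])
    mu = np.concatenate([mu_static, mu_delta])
    precision = 1.0 / np.concatenate([np.asarray(var_static, dtype=float),
                                      np.asarray(var_delta, dtype=float)])

    # Normal equations of the weighted least-squares problem:
    # (W^T P W) c = W^T P mu, with P the diagonal precision matrix.
    A = W.T @ (precision[:, None] * W)
    b = W.T @ (precision * mu)
    return np.linalg.solve(A, b)

# Piecewise-constant static means, as produced by two HMM states.
step_means = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
ones = np.ones(6)
zeros = np.zeros(6)

# Tight delta variance: the step is smoothed into a gradual transition.
smooth = generate_trajectory(step_means, ones, zeros, 0.1 * ones)

# Very loose delta variance: the output follows the static means.
loose = generate_trajectory(step_means, ones, zeros, 1e12 * ones)
print(smooth)
print(loose)
```

Without the delta term the generated sequence would simply repeat each state's mean and jump at state boundaries. The delta statistics penalize such jumps, which is the smoothing effect the abstract attributes to the dynamic features.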

ASA 132nd meeting - Hawaii, December 1996