ASA 127th Meeting M.I.T. 1994 June 6-10

4aSP2. Freedoms and constraints in computational prosody modeling from speech corpora.

Yoshinori Sagisaka

ATR Interpreting Telecommun. Labs., 2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto, 619-02 Japan

With the growing availability of large speech corpora, statistical models of prosody control have been studied intensively. These computational models have improved the naturalness of synthetic speech and are expected to provide additional suprasegmental information for speech recognition. Although they employ only conventional statistical methods such as linear regression, regression trees, or neural nets, their success stems from efforts to accommodate observed prosodic characteristics and qualitatively known control mechanisms. This talk reviews how observed prosodic characteristics have been modeled with these statistical tools and how new statistical models can be designed to overcome the insufficiencies of conventional ones. Investigating well-constrained models and their constraints is expected to lead to more efficient computational models and, through these modeling procedures, to a deeper understanding of prosody control mechanisms.