4aSC30. A method for controlling the speech rate in speech synthesis and its perceptual evaluation.

Session: Thursday Morning, December 5

Author: Sumio Ohno
Location: Dept. of Appl. Electron., Sci. Univ. of Tokyo, Noda, 278 Japan
Author: Masamichi Fukumiya
Location: Dept. of Appl. Electron., Sci. Univ. of Tokyo, Noda, 278 Japan
Author: Hiroya Fujisaki
Location: Dept. of Appl. Electron., Sci. Univ. of Tokyo, Noda, 278 Japan

Abstract:

It is well known that speech rate varies both globally and locally in natural discourse due to various factors such as contrastive stress, syntactic boundaries, emotion, etc. While the global speech rate can be clearly defined by the durations of utterances and pauses, the local speech rate has not been well defined. The present authors have proposed a rigorous and quantitative definition for the relative local speech rate and have presented an objective method for its measurement [S. Ohno and H. Fujisaki, Proc. EUROSPEECH'95, Vol. 1, pp. 421--424 (1995)]. Based on the analysis of changes in both global and local speech rates found in speech material consisting of readings of a story at various speech rates, the present paper proposes rules for controlling the global and local speech rates in order to produce a synthetic discourse that fits exactly into a specified time interval. The validity of the method has been tested and confirmed by perceptual experiments using synthetic discourse of various durations generated from a natural discourse by analysis--resynthesis.
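
The abstract does not state the control rules themselves, so the following is only an illustrative sketch of the general idea of fitting a discourse into a specified time interval. It assumes a hypothetical scheme in which each segment (utterance or pause) has a measured duration and an "elasticity" weight, and the required change in total duration is distributed in proportion to weight times duration. The function name, the weights, and the proportional rule are assumptions made for illustration and are not the authors' rules.

```python
def fit_to_interval(durations, elasticities, target):
    """Rescale segment durations so that they sum exactly to `target`.

    durations    -- original segment durations in seconds
    elasticities -- hypothetical per-segment weights; a larger weight means
                    the segment absorbs more of the required change
    target       -- desired total duration in seconds
    """
    if len(durations) != len(elasticities):
        raise ValueError("durations and elasticities must have equal length")

    # Total change that has to be distributed over all segments.
    delta = target - sum(durations)

    # Each segment's share is proportional to weight * duration.
    shares = [w * d for w, d in zip(elasticities, durations)]
    total_share = sum(shares)
    if total_share <= 0:
        raise ValueError("at least one segment must be elastic")

    new_durations = [d + delta * s / total_share
                     for d, s in zip(durations, shares)]

    # A segment cannot be compressed to a negative duration.
    if any(d < 0 for d in new_durations):
        raise ValueError("target too short for the given elasticities")
    return new_durations


# Example: two utterances separated and followed by pauses, with the pauses
# assumed to be three times as elastic as the speech segments.
original = [1.0, 0.5, 2.0, 0.5]   # utterance, pause, utterance, pause
weights = [1.0, 3.0, 1.0, 3.0]
fitted = fit_to_interval(original, weights, target=3.0)
```

In this sketch the global rate change is fixed by the ratio of the target duration to the original total, while the weights stand in for local rate variation: segments with larger weights are compressed or stretched more than the others.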


ASA 132nd meeting - Hawaii, December 1996