5pSP21 Analysis of quality factors in synthetic speech produced by rules.

ASA 128th Meeting - Austin, Texas - 1994 Nov 28 .. Dec 02

Eri Miyazawa
Hiromi Nagabuchi

NTT Telecommun. Networks Labs., Midori-cho, Musashino-city, Tokyo, 180 Japan

This paper investigates how various factors affect the quality of synthetic speech produced by rules. Rule-based speech synthesis will be an important technique for providing various telecommunication services in future intelligent networks. The quality of synthetic speech is generally measured by subjective evaluation of intelligibility, or by comparison with the quality of other types of synthetic speech. However, developing a practical speech synthesis method for use in telecommunication networks requires an overall quality evaluation that covers both intelligibility and naturalness, and the quality should be compared with that of natural telephone speech. To establish an overall quality evaluation method, the effects of several factors on the overall quality (expressed as a mean opinion score, MOS) of speech synthesized by several Japanese text-to-speech systems are quantitatively compared with their effects on natural speech degraded by additive speech-correlated white noise, which serves as the natural speech reference material. Experimental results show that such factors as subject, listening experience, average pitch frequency, and text affect synthetic speech more than natural speech. The quality evaluation characteristics attributable to these factors are discussed, and an overall quality evaluation method for synthetic speech is proposed.
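
As a rough illustration of the two quantities named in the abstract, the sketch below computes a MOS from listener ratings and degrades a signal with speech-correlated noise. It is not the authors' procedure: the 1 to 5 rating scale, the multiplicative noise model y = x(1 + 10^(-Q/20) n), and the names mean_opinion_score, add_speech_correlated_noise, and q_db are all assumptions made for this example.

```python
import random
from statistics import mean


def mean_opinion_score(ratings):
    """Return the average of listener ratings (assumed 1-5 opinion scale)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings are assumed to lie on a 1-5 scale")
    return mean(ratings)


def add_speech_correlated_noise(samples, q_db, rng=None):
    """Degrade a signal with noise whose amplitude follows the speech.

    Assumed model: y[i] = x[i] * (1 + 10**(-q_db / 20) * n[i]),
    where n[i] is unit-variance Gaussian white noise and q_db is the
    speech-to-noise ratio in dB. Larger q_db means less degradation.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is repeatable
    gain = 10 ** (-q_db / 20)
    return [x * (1 + gain * rng.gauss(0, 1)) for x in samples]


if __name__ == "__main__":
    # Hypothetical ratings from four listeners for one stimulus.
    print(mean_opinion_score([4, 5, 3, 4]))
    # Hypothetical short signal degraded at Q = 20 dB.
    print(add_speech_correlated_noise([0.0, 0.5, -0.5, 0.25], q_db=20))
```

Because the noise is multiplied by the signal, silent samples stay silent and the degradation tracks the speech level, which is what makes such material usable as a graded reference for natural speech.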