ASA 127th Meeting M.I.T. 1994 June 6-10

4aSP6. Annotation and prosodic control in the Eloquence text-to-speech system.

Kenneth J. deJong

Susan R. Hertz

Eloquent Technol., 24 Highgate Cir., Ithaca, NY 14850

Dept. of Modern Languages and Linguist., Cornell University, Ithaca, NY

A simple set of text annotations that enables users to produce sophisticated prosodic effects is being developed as part of the Eloquence text-to-speech system. The annotations relate to concepts such as "emphasis" and "level of excitement." The system's rules interpret the typically sparse annotations in the process of building up a rich, "multi-stream" phonological and phonetic representation from which the final values for synthesis are derived. This structure includes prosodic phrases with associated tones, words with associated pitch accents, syllables and their nuclei, fundamental frequency values, and durations. For example, marking a word for emphasis triggers several actions: the rules place an accent on the word; they associate tones appropriate to the level of emphasis and phrase type; they attract nuclear stress to the word and deaccent following words in the phrase; they increase the pitch range at the emphasized word; and they lengthen the accented syllable in accordance with our nucleus-based timing model. The sophistication of the underlying rule set gives even untrained users access to a broad range of phonological and phonetic resources. The annotations, along with their phonological and phonetic consequences, will be described, and their effects on the final synthetic speech demonstrated. [Work supported by NIH.]
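
The chain of actions triggered by an emphasis annotation can be illustrated with a minimal sketch. It is not the Eloquence rule set: the class names, the tone labels, the tone lookup table, and the scaling factors are all hypothetical placeholders chosen for illustration, and the real system operates on a richer multi-stream representation than the flat word list assumed here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Word:
    text: str
    accented: bool = True            # assumption: words start out accented
    tone: Optional[str] = None       # pitch-accent label, if any
    nuclear: bool = False            # carries the nuclear stress of the phrase
    pitch_range_scale: float = 1.0   # multiplier on the local pitch range
    nucleus_ms: float = 100.0        # duration of the accented syllable nucleus


@dataclass
class Phrase:
    words: List[Word]
    phrase_type: str = "declarative"


# Hypothetical lookup: (phrase type, emphasis level) -> tone label.
TONES = {
    ("declarative", 1): "H*",
    ("declarative", 2): "L+H*",
    ("interrogative", 1): "L*",
    ("interrogative", 2): "L*+H",
}


def apply_emphasis(phrase: Phrase, index: int, level: int = 1) -> None:
    """Apply the five actions listed in the abstract to phrase.words[index]."""
    target = phrase.words[index]

    # 1. Place an accent on the emphasized word.
    target.accented = True

    # 2. Associate a tone chosen by emphasis level and phrase type.
    target.tone = TONES[(phrase.phrase_type, level)]

    # 3. Attract nuclear stress to the word and deaccent following words.
    for word in phrase.words:
        word.nuclear = False
    target.nuclear = True
    for word in phrase.words[index + 1:]:
        word.accented = False
        word.tone = None

    # 4. Increase the pitch range at the emphasized word (illustrative factor).
    target.pitch_range_scale *= 1.0 + 0.25 * level

    # 5. Lengthen the nucleus of the accented syllable (illustrative factor).
    target.nucleus_ms *= 1.0 + 0.2 * level


phrase = Phrase([Word("I"), Word("said"), Word("blue"), Word("paint")])
apply_emphasis(phrase, index=2, level=2)
```

After the call, "blue" carries the nuclear accent with a widened pitch range and a longer nucleus, and "paint" is deaccented. A single sparse annotation thus fans out into changes on several streams of the representation.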