4aSP2 Text analysis for speech synthesis.

ASA 126th Meeting Denver 1993 October 4-8

Kenneth Ward Church

AT&T Bell Laboratories, 600 Mountain Ave., Murray Hill, NJ 07974

Text analysis is a catch-all term for a range of tasks such as tokenizing the input text into words and sentences, assigning parts of speech, identifying clitics, parsing phrasal verbs, identifying and expanding abbreviations, dates, fractions, and amounts of money, and so on. Text analysis is important for speech synthesis for two reasons. First, word pronunciation sometimes depends on usage: I can be a pronoun or a Roman numeral; wind can rhyme with ``bind'' or ``binned''; Dr. can be ``doctor'' or ``drive''; 2/3 can be ``two thirds,'' ``February third,'' or ``two slash three.'' A second, equally important reason for text analysis is that its results will be used to modulate the pitch, timing, and amplitude of the speech so as to present the text's message clearly. For example, the function word ``that'' should be cliticized (reduced) in certain usages (e.g., as a subordinating conjunction: ``It is a shame that [schwa] he is leaving'') but not in other usages (e.g., as a demonstrative pronoun: ``Did you see THAT [no schwa]?''). If a synthesizer could reliably make such distinctions, it might sound a little more like it knows what it is talking about (and that it cares).
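
The usage-dependent expansions mentioned above (Dr., 2/3) can be illustrated with a minimal Python sketch. It is not the method described in this abstract: the function names, the capitalization heuristic for Dr., and the context labels for 2/3 are all assumptions made for illustration, and a real system would need far more evidence (part-of-speech tags, sentence boundaries, and so on).

```python
import re

def expand_dr(text: str) -> str:
    """Expand 'Dr.' using a crude, hypothetical context heuristic."""
    # Assumption: 'Dr.' directly before a capitalized word is a title.
    text = re.sub(r"\bDr\.(?=\s+[A-Z])", "Doctor", text)
    # Assumption: any remaining 'Dr.' is a street suffix.
    return re.sub(r"\bDr\.", "Drive", text)

# Hypothetical context labels; deciding the context is the hard part
# and is not attempted here.
SLASH_READINGS = {
    "fraction": "two thirds",
    "date": "February third",
}

def read_two_slash_three(context: str) -> str:
    """Return a spoken form of '2/3' for the given context label."""
    # Fall back to a literal reading when the context is unknown.
    return SLASH_READINGS.get(context, "two slash three")

if __name__ == "__main__":
    print(expand_dr("Dr. Smith lives on Elm Dr."))
    print(read_two_slash_three("date"))
```

The heuristic fails when a street suffix is followed by a capitalized word (for example, at a sentence boundary), which is one reason tokenization into sentences belongs to the same analysis.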