Helen E. Karn
Dept. of Linguist., Georgetown Univ., Washington, DC 20057-1068
Determining phonological and intonational phrase boundaries is an important step in synthesizing natural-sounding prosodic contours in a text-to-speech (TTS) system. An algorithm is presented here that generates phrase boundaries in Spanish texts. The algorithm is based on Liberman and Church's function group (f-group) parser for English [M. Y. Liberman and K. W. Church, "Text Analysis and Word Pronunciation in Text-to-Speech Synthesis," in Advances in Speech Signal Processing, edited by S. Furui and M. M. Sondhi (Marcel Dekker, New York, 1992), pp. 791-831]. The Spanish f-group parser produced satisfactory results for shorter sentences and for sentences containing a fair amount of punctuation, but it tended to overgenerate phrase boundaries in longer sentences and in sentences with little punctuation. In addition, the parser generated sentence and paragraph transitions but did not distinguish among their different types. To refine the parser in these areas, selections from a Spanish text [E. Sabato, El túnel (Editorial Seix Barral, Barcelona, 1982)] were read by adult native Spanish speakers and analyzed for phrase boundaries, as evidenced by significant changes in fundamental frequency, tempo, and/or pauses. The acoustic-phonetic data and their role in the development of the phrase boundary algorithm are discussed.
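
As a rough illustration of the f-group idea, the following Python sketch groups a Spanish sentence into runs of function words followed by content words, and also breaks at punctuation. It is not the parser described in this abstract: the function-word list, the punctuation set, the name f_groups, and the example sentence are all hypothetical and chosen only for illustration, and the grouping rule is a simplified reading of the function-group approach.

```python
import re

# Hypothetical, deliberately tiny list of Spanish function words.
# A real system would need a far larger, carefully curated lexicon.
FUNCTION_WORDS = {
    "el", "la", "los", "las", "un", "una", "de", "del", "a", "al",
    "en", "con", "por", "para", "que", "y", "o", "pero", "no", "se",
}

# Assumption: these punctuation marks always close the current group.
BREAKING_PUNCT = set(",;:.!?")


def f_groups(sentence):
    """Split a sentence into simplified function groups.

    A group is a run of function words followed by a run of content
    words. A boundary is placed before a function word that follows a
    content word, and at any breaking punctuation mark.
    """
    tokens = re.findall(r"\w+|[^\w\s]", sentence.lower())
    groups, current = [], []
    seen_content = False
    for tok in tokens:
        if not tok[0].isalnum():
            # Punctuation: close the group if the mark is a breaking one.
            if tok in BREAKING_PUNCT and current:
                groups.append(current)
                current, seen_content = [], False
            continue
        is_function = tok in FUNCTION_WORDS
        if is_function and seen_content:
            # Function word after content word: start a new group.
            groups.append(current)
            current, seen_content = [], False
        current.append(tok)
        if not is_function:
            seen_content = True
    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    example = "El pintor miraba la ventana, pero no vio a la mujer."
    for group in f_groups(example):
        print(" ".join(group))
```

Because this simplified rule places a boundary before every function word that follows a content word, a long sentence with little punctuation yields many short groups, which is consistent with the overgeneration reported above for such sentences.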