Joan Bachenko and William A. Gale
AT&T Bell Labs., 600 Mountain Ave., Murray Hill, NJ 07974-0536
Studies of interstress intervals tend to be more suggestive than conclusive because they rely on relatively few speech samples. The study reported here is based on observations taken from 32 000 intervals in the read speech of 106 speakers. A phone recognizer was used to label the onset time of each phone; intervals were defined as the spans between successive stressed-vowel onsets, and each interval was classed according to its structure (the number of consonants and reduced vowels it contained) as well as its duration. The data showed strong regularities in the dependence of duration on classification. A model mixing duration, interval structure, and prior probabilities was then constructed and tested on phone lattices generated by the phone recognizer for speech from the resource management task. When durations were fixed but interval structure varied, prior probabilities pruned incorrect answers significantly better than chance; the mixed model's improvement was inconclusive. However, when both durations and interval structure were varied, the mixed model was clearly superior. The results indicate that interval duration and structure might be a link between speech and grammar in human and machine speech recognition.
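The interval extraction described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Phone` and `Interval` types, the `kind` categories, the phone labels, and the onset times are all hypothetical, and the sketch assumes the recognizer output has already been reduced to a time-ordered list of labeled phones. Each interval runs from one stressed-vowel onset to the next and is classed by the number of consonants and reduced vowels it contains.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Phone:
    label: str    # phone symbol (hypothetical labels, for illustration only)
    onset: float  # onset time in seconds
    kind: str     # assumed categories: "stressed_vowel", "reduced_vowel", "consonant"


@dataclass
class Interval:
    duration: float    # time between successive stressed-vowel onsets
    n_consonants: int  # consonants inside the interval
    n_reduced: int     # reduced vowels inside the interval


def interstress_intervals(phones: List[Phone]) -> List[Interval]:
    """Split a time-ordered phone sequence into interstress intervals."""
    intervals: List[Interval] = []
    start = None  # onset of the most recent stressed vowel, if any
    n_cons = n_red = 0
    for p in phones:
        if p.kind == "stressed_vowel":
            if start is not None:
                # Close the interval that began at the previous stressed vowel.
                intervals.append(Interval(p.onset - start, n_cons, n_red))
            start = p.onset
            n_cons = n_red = 0
        elif start is not None:
            # Phones before the first stressed vowel belong to no interval.
            if p.kind == "consonant":
                n_cons += 1
            elif p.kind == "reduced_vowel":
                n_red += 1
    return intervals


# Made-up example sequence; labels and times are not taken from the study.
example = [
    Phone("s", 0.00, "consonant"),
    Phone("EH1", 0.08, "stressed_vowel"),
    Phone("v", 0.20, "consonant"),
    Phone("AH0", 0.26, "reduced_vowel"),
    Phone("n", 0.32, "consonant"),
    Phone("t", 0.40, "consonant"),
    Phone("IY1", 0.46, "stressed_vowel"),
    Phone("n", 0.60, "consonant"),
]

for iv in interstress_intervals(example):
    print(f"{iv.duration:.2f} s, {iv.n_consonants} consonants, {iv.n_reduced} reduced vowels")
```

In this made-up sequence, the leading consonant and the trailing consonant fall outside any interval, since an interval needs a stressed-vowel onset at both ends. The pair (consonant count, reduced-vowel count) is one possible encoding of the interval structure; the abstract does not specify how the classes were represented.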