ASA 129th Meeting - Washington, DC - 1995 May 30 .. Jun 06

1aSC35. Generating gestural scores from articulatory data using temporal decomposition.

Michael J. Collins

Stanley C. Ahalt

Ashok K. Krishnamurthy

Dept. of Elec. Eng., The Ohio State Univ., Columbus, OH 43210

Through empirical investigations, the automatic generation of gestural scores for articulatory data corresponding to consonant--vowel--consonant (CVC) tokens was studied. The articulatory data consist of the movements of flesh points measured using an x-ray microbeam, and are first ``warped'' to resemble the vocal tract variables of constriction location and constriction degree. The multichannel warped data are then analyzed using temporal decomposition. The resulting target functions provide candidates for gestures, from which the best candidates are chosen statistically by examining the magnitudes of the elements of their associated reconstruction weights. Onset and durational analysis of the candidate target functions then yields the gestural score. Human and Elman Recurrent Neural Network recognition tests are performed to ascertain the accuracy of the generated gestural scores. Comparisons with ``correct'' gestural scores are also performed. The results of this work should provide a stepping stone for future acoustic- and articulatory-based recognizers employing the same strategy.
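
The selection and timing steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the target functions have already been estimated (full temporal decomposition estimates them jointly with the weights), fits the reconstruction weights by ordinary least squares, and uses hypothetical thresholds (`ratio`, `level`) in place of the statistical selection and the onset and durational analysis, whose details the abstract does not give. All function names are illustrative.

```python
import numpy as np


def reconstruction_weights(data, targets):
    """Least-squares weights A with data ~= A @ targets.

    data: (channels, frames); targets: (K, frames). Assumed shapes.
    """
    # Solve targets.T @ A.T ~= data.T for A.T, then transpose back.
    weights_t, *_ = np.linalg.lstsq(targets.T, data.T, rcond=None)
    return weights_t.T  # (channels, K)


def select_candidates(weights, ratio=0.5):
    """Keep targets whose largest |weight| is at least ratio * overall largest.

    The ratio rule is a stand-in assumption for the statistical selection.
    """
    peak = np.abs(weights).max(axis=0)
    return np.flatnonzero(peak >= ratio * peak.max())


def onset_and_duration(target, level=0.5):
    """Onset frame and duration (frames) of the span where target >= level * max."""
    active = np.flatnonzero(target >= level * target.max())
    return int(active[0]), int(active[-1] - active[0] + 1)


def gestural_score(data, targets, ratio=0.5, level=0.5):
    """List one entry per selected target: dominant channel, onset, duration."""
    weights = reconstruction_weights(data, targets)
    score = []
    for k in select_candidates(weights, ratio):
        onset, duration = onset_and_duration(targets[k], level)
        score.append({
            "target": int(k),
            "channel": int(np.argmax(np.abs(weights[:, k]))),
            "onset": onset,
            "duration": duration,
        })
    return score


# Synthetic example (made-up data, not from the study): three bell-shaped
# target functions and two channels; the third target has only small weights.
frames = np.arange(100)
centers = np.array([30.0, 70.0, 50.0])
demo_targets = np.exp(-0.5 * ((frames[None, :] - centers[:, None]) / 5.0) ** 2)
demo_weights = np.array([[1.0, 0.0, 0.05],
                         [0.0, 0.8, 0.02]])
demo_data = demo_weights @ demo_targets
demo_score = gestural_score(demo_data, demo_targets)
```

In this toy setting each selected target function is tied to the channel where its weight is largest, which loosely mirrors assigning a gesture to a vocal tract variable; a real analysis would need the actual decomposition and selection criteria.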