2pSC8. Detection of sentence accents in a speech recognition system.

Session: Tuesday Afternoon, May 14

Time: 3:00

Author: Per Sautermeister
Author: Bertil Lyberg
Location: Telia Res. AB, S-136 80 Haninge, Sweden


Speech recognition systems do not usually use prosodic information, i.e., information signaled by segmental duration and the fundamental frequency contour of the speech signal. In current statistical approaches to speech recognition, the acoustic manifestation of prosody is, more often than not, treated as a disturbance. Detecting and transferring sentence accent in, e.g., spoken language translation systems would allow stress on a particular word in one language to be mapped onto a suitable representation of the corresponding constituents in the other language, so that the same semantic goal is met. This study presents a system for automatic detection of sentence accents, intended for use in speech recognition systems. The fundamental frequency is extracted from the speech signal, and an estimated declination is subtracted from it to give a normalized representation of the variations. These fundamental frequency variations are expressed in musical intervals. Sentence accents are then interpreted from this normalized representation of the fundamental frequency. Both the system architecture and some preliminary results will be shown.
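
The normalization steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the declination is estimated or how accents are decided, so the least-squares line fit, the 100 Hz reference, the 2-semitone threshold, and all function names here are assumptions made for illustration only. The fundamental frequency track is assumed to be already extracted, with unvoiced frames marked as 0.

```python
import math


def hz_to_semitones(f0_hz, ref_hz):
    """Express F0 as a musical interval: semitones relative to ref_hz."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def normalized_f0(times, f0_hz, ref_hz=100.0):
    """Subtract an estimated declination from the F0 contour.

    Assumption: declination is modeled as a straight line in the
    semitone domain. Unvoiced frames (F0 of 0 or None) are skipped.
    Returns a list of (time, residual_in_semitones) pairs.
    """
    voiced = [(t, hz_to_semitones(f, ref_hz))
              for t, f in zip(times, f0_hz) if f and f > 0]
    ts = [t for t, _ in voiced]
    st = [s for _, s in voiced]
    slope, intercept = fit_line(ts, st)
    return [(t, s - (slope * t + intercept)) for t, s in voiced]


def accent_candidates(residuals, threshold_st=2.0):
    """Return times of local maxima whose residual exceeds the threshold.

    Assumption: a simple peak-picking rule stands in for the accent
    interpretation stage, which the abstract does not specify.
    """
    found = []
    for i in range(1, len(residuals) - 1):
        t, r = residuals[i]
        if (r > threshold_st
                and r >= residuals[i - 1][1]
                and r > residuals[i + 1][1]):
            found.append(t)
    return found
```

Working in semitones makes the residuals comparable across speakers with different pitch ranges, which is one plausible reason for reporting the variations in musical intervals. A real system would need a more robust declination estimate than a single line fitted through all voiced frames, since the accents themselves pull the fit upward.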

from ASA 131st Meeting, Indianapolis, May 1996