4aSC6. An application of Dempster and Shafer's probability theory to speech recognition.

Session: Thursday Morning, December 5


Author: Tetsunori Kobayashi
Location: Dept. of EECE, Waseda Univ., 3-4-1 Okubo, Shinjuku-ku, Tokyo, 169 Japan


The Dempster and Shafer probability theory (DS theory) is applied to the calculation of likelihood functions for speech recognition. The hidden Markov model (HMM) is now the major technique for speech recognition. It has some limitations, however, because it adopts a Bayes-based likelihood function. In Bayesian theory, the values of likelihood functions are valid only for comparison with one another; the values themselves are meaningless. Therefore, they are not applicable to spotting or branch pruning. Moreover, there is no guiding principle for merging likelihoods from different information sources. To solve these problems, the DS theory is adopted. The DS theory can combine likelihood functions from multiple sources. It can also represent the information "do not know" or "cannot decide," and the values themselves represent the certainty of the evidence. Here, it is applied at three stages. First, frame-level phonetic likelihood functions are calculated by merging the likelihood functions obtained from multiframe features. Second, segment-level phonetic likelihood functions are calculated by merging segmental and durational likelihood functions. Finally, the phonetic sequence is decided by using multi-segmental-level likelihood functions. All merging processes are performed with the DS theory. Thus a new speech recognition method is realized whose likelihood function is applicable to spotting and branch pruning.
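
The merging step can be illustrated with the standard Dempster rule of combination. The following is a minimal sketch only, not the author's implementation: the phoneme labels, the mass values, and the names `combine`, `THETA`, `m_frame1`, and `m_frame2` are all hypothetical and chosen purely for illustration. Mass assigned to the whole set of candidates stands for "do not know."

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function is a dict mapping a frozenset of hypotheses
    to its mass; the masses in each dict are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence supports the intersection of the two sets.
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Contradictory evidence is collected as conflict mass.
            conflict += wa * wb
    norm = 1.0 - conflict
    if norm <= 1e-12:
        raise ValueError("sources are in total conflict")
    # Renormalize so the combined masses sum to 1 again.
    return {s: w / norm for s, w in combined.items()}


# Hypothetical frame of discernment: three candidate phonemes.
THETA = frozenset({"a", "i", "u"})

# Hypothetical evidence from two frames. Mass on THETA means "do not know."
m_frame1 = {frozenset({"a"}): 0.6, THETA: 0.4}
m_frame2 = {frozenset({"a"}): 0.5, frozenset({"i"}): 0.3, THETA: 0.2}

m = combine(m_frame1, m_frame2)
for hypothesis, mass in sorted(m.items(), key=lambda kv: -kv[1]):
    print(sorted(hypothesis), round(mass, 4))
```

In this sketch the mass left on `THETA` after combination shows how much of the evidence remains undecided, which is the property that lets an absolute threshold be used for a decision such as pruning.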

ASA 132nd meeting - Hawaii, December 1996