Y. Chun Kuo
Janet C. Rutledge
Dept. of Elec. Eng. and Comput. Sci., Northwestern Univ., Evanston, IL 60208-3118
A connected-phoneme hidden Markov model (HMM) is proposed for automatic segmentation and labeling of speech. Individual phonetic models are first created, each as a left-to-right HMM. The large connected-phoneme HMM is then formed by grouping all of these phonetic models together, so that each state of the large HMM uniquely represents an English phoneme. The connected-phoneme HMM is not trained with the Viterbi algorithm, since the most probable state sequence does not necessarily yield the correct segmentation and labeling. Instead, learning vector quantization (LVQ2) is used to train the connected-phoneme HMM so that phoneme confusions are reduced. The proposed algorithm has two potential advantages over existing speech recognition schemes. (1) Because each state of the large HMM has a unique phonetic identity, more insight into speech characteristics can be gained, which is essential for improving speech recognizers. Errors caused by insertion, deletion, and substitution can be properly analyzed and adjusted. (2) The computational load of LVQ2 training is considerably less than that of Viterbi training. Therefore, LVQ2 training is also more suitable for a limited speech database.
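The abstract does not specify how LVQ2 is coupled to the HMM parameters, so the following is only a minimal sketch of a generic LVQ2-style update, not the authors' training procedure. It assumes that each codebook vector carries a phoneme label, that distances are Euclidean, and that the symmetric window test of the LVQ2.1 variant is used. The function name `lvq2_step` and the parameters `lr` and `window` are hypothetical names chosen for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def lvq2_step(codebook, labels, x, target, lr=0.1, window=0.3):
    """Apply one LVQ2.1-style update for a training vector x with label target.

    codebook: list of vectors (lists of floats), modified in place.
    labels:   phoneme label of each codebook vector (assumed one label each).
    Returns True if the codebook was updated, False otherwise.
    """
    dists = [euclidean(m, x) for m in codebook]
    order = sorted(range(len(codebook)), key=dists.__getitem__)
    i, j = order[0], order[1]  # two nearest codebook vectors

    # Update only when exactly one of the two nearest vectors has the
    # correct label, i.e. x lies near a boundary between confusable classes.
    if (labels[i] == target) == (labels[j] == target):
        return False

    di, dj = dists[i], dists[j]
    if di == 0.0:
        return False  # x coincides with a codebook vector; not near the midplane

    # Window test: x must fall close to the midplane between the two vectors.
    s = (1.0 - window) / (1.0 + window)
    if min(di / dj, dj / di) <= s:
        return False

    right, wrong = (i, j) if labels[i] == target else (j, i)
    # Pull the correctly labeled vector toward x, push the other one away.
    codebook[right] = [m + lr * (xk - m) for m, xk in zip(codebook[right], x)]
    codebook[wrong] = [m - lr * (xk - m) for m, xk in zip(codebook[wrong], x)]
    return True


# Toy usage: two one-vector "phoneme" classes in a 2-D feature space.
demo_codebook = [[0.0, 0.0], [1.0, 0.0]]
demo_labels = ["aa", "iy"]
updated = lvq2_step(demo_codebook, demo_labels, [0.45, 0.0], "iy")
print(updated, demo_codebook)
```

Each step touches at most two codebook vectors, and only when the training vector falls near a decision boundary between a correct and an incorrect class. In this generic form, that is how the update concentrates on confusable phoneme pairs.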