Lawrence R. Rabiner
Jay G. Wilpon
AT&T Bell Labs., Rm. 2D-538, 600 Mountain Ave., Murray Hill, NJ 07974
Connected digit recognition has received considerable attention over the past several years because of its importance in providing speech recognition services (e.g., catalog ordering, credit card entry, and all-digit dialing of telephone numbers). Although a number of systems have been described that provide very high string accuracy on a standard database of connected digits (i.e., the TI database), most of these systems require a great deal of computation to achieve that performance. More recently, there has been renewed interest in connected digit recognition systems based on discrete density models using vector quantization (VQ) codebooks (e.g., the work of Normandin and colleagues at CRIM in Montreal), where the computation is significantly lower than that required for continuous density models and the robustness to variations in talkers, background, microphones, etc., has the potential to be high. In this study, the effects of multiple codebooks, multiple models, and codebook weighting on the performance of a standard hidden Markov model recognizer are examined using the TI connected digits database. It is shown that a single codebook can provide high string recognition accuracy, and that multiple codebooks provide accuracy comparable to that of continuous density models.
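
As an illustration of the multiple-codebook idea, the following minimal sketch shows one common way a discrete density HMM state could score a frame: each feature stream is quantized against its own VQ codebook, and the per-codebook log-probabilities are combined with codebook weights. The abstract does not specify the combination rule, the feature streams, or the codebook sizes, so the weighted log-sum, the function names (`quantize`, `state_log_likelihood`), and all numerical values here are assumptions chosen for illustration only, not the authors' implementation.

```python
import math


def quantize(vector, codebook):
    """Return the index of the codeword nearest to `vector` (squared Euclidean distance)."""
    best_index, best_dist = 0, float("inf")
    for index, codeword in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vector, codeword))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def state_log_likelihood(streams, codebooks, state_probs, weights):
    """Weighted sum of per-codebook discrete log-probabilities for one HMM state.

    streams     : one feature vector per codebook for the current frame
    codebooks   : one list of codewords per stream
    state_probs : for each codebook, this state's probability of each VQ symbol
    weights     : one exponent (codebook weight) per stream; assumed combination rule
    """
    total = 0.0
    for vector, codebook, probs, weight in zip(streams, codebooks, state_probs, weights):
        symbol = quantize(vector, codebook)
        total += weight * math.log(probs[symbol])
    return total


# Toy, hypothetical setup: two streams with tiny codebooks.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],   # stream 0: two 2-D codewords
    [[-1.0], [0.0], [1.0]],     # stream 1: three 1-D codewords
]
state_probs = [
    [0.8, 0.2],                 # P(symbol | state) for stream 0
    [0.1, 0.6, 0.3],            # P(symbol | state) for stream 1
]
weights = [1.0, 0.5]            # hypothetical codebook weights

frame = [[0.9, 1.2], [0.1]]     # one feature vector per stream
score = state_log_likelihood(frame, codebooks, state_probs, weights)
print(score)
```

In this form the per-frame cost is a nearest-codeword search plus a table lookup per codebook, which is the source of the computational saving over evaluating continuous mixture densities. Setting all weights to 1.0 reduces the score to the plain product of the per-codebook probabilities.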