Michael D. Mattei
Bellarmine College, Newburg Rd., Louisville, KY 40205
James H. Graham
Ahmed H. Desoky
Univ. of Louisville
Real-time lexical access for continuous speech recognition with very large dictionaries (over 5000 words) remains a difficult problem for speech researchers. One search technique that seems to overcome many of the limitations of hidden Markov models and beam search is based on inverted file database management. This inverted file search technique can operate in real time on a microcomputer with less than 8 MIPS of processing speed using dictionaries of well over 5000 words. This paper presents an overview of the technique along with specific performance statistics. The results were obtained using a 33 000-word dictionary on a 386 microcomputer. The inputs to the process are 100-word informal speech passages of continuous phonemes with no syllable or word boundary information. Although a simple search termination heuristic is employed, reasonably accurate word identification results are obtained with no grammatical postprocessing.
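The abstract does not describe the index layout or the termination heuristic, so the following is only an illustrative sketch of the general idea of inverted file lookup over an unsegmented phoneme stream, not the authors' implementation. It assumes a toy lexicon with ARPAbet-like symbols, an index keyed on each word's first phoneme, and a greedy longest-match rule standing in for the unspecified search termination heuristic; the names `LEXICON`, `build_inverted_file`, `lookup`, and `decode` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy lexicon: word -> phoneme sequence (ARPAbet-like symbols).
LEXICON = {
    "the": ("DH", "AH"),
    "cat": ("K", "AE", "T"),
    "sat": ("S", "AE", "T"),
    "a": ("AH",),
}


def build_inverted_file(lexicon):
    """Map each phoneme to the (word, phonemes) entries that begin with it."""
    index = defaultdict(list)
    for word, phones in lexicon.items():
        index[phones[0]].append((word, phones))
    return index


def lookup(index, stream, pos):
    """Return the longest entry whose phonemes match the stream at pos, or None."""
    candidates = index.get(stream[pos], [])
    matches = [
        (word, phones)
        for word, phones in candidates
        if tuple(stream[pos:pos + len(phones)]) == phones
    ]
    # Assumed heuristic: prefer the longest matching word.
    return max(matches, key=lambda m: len(m[1]), default=None)


def decode(index, stream):
    """Segment a boundary-free phoneme stream into words, left to right."""
    words, pos = [], 0
    while pos < len(stream):
        best = lookup(index, stream, pos)
        if best is None:
            words.append(None)  # no dictionary entry starts here; skip one phoneme
            pos += 1
        else:
            words.append(best[0])
            pos += len(best[1])
    return words


if __name__ == "__main__":
    index = build_inverted_file(LEXICON)
    print(decode(index, "DH AH K AE T S AE T".split()))
```

Because each lookup touches only the entries filed under one phoneme, the cost per position depends on the size of that bucket rather than on the size of the whole dictionary, which is the property that makes an inverted file attractive for large lexicons. A practical system would also need to tolerate phoneme recognition errors, which this sketch ignores.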