ASA 127th Meeting M.I.T. 1994 June 6-10

2pSP15. Spoken language identification with phonological and lexical models.

Shubha Kadambe
James L. Hieronymus

Rm. #2D-444, AT&T Bell Labs., 600 Mountain Ave., Murray Hill, NJ 07974

A language identification (LID) system that uses phonemotactic models in addition to phoneme models to identify languages is described. The proposed LID system is trained and tested on the OGI multilanguage telephone speech database. Continuous-density, second-order, ergodic, variable-duration hidden Markov phonemic models are trained for each language using a high-accuracy phoneme recognition system developed at Bell Laboratories. The phonemotactic models for each language are trained using text corpora of about ten million words and grapheme-to-phoneme converters. The language $L_i$ of an incoming speech signal $x$ is hypothesized as the one that produces the highest likelihood $f(x \mid \lambda_i)\, f(\lambda_i \mid L_i)$ over all the phonemic models $\lambda_i$ of a given set, subject to the phonemotactic constraint. Initially, this LID system was trained and evaluated for English/Spanish language identification, and identification was 83% correct (79% on English and 88% on Spanish). Results for four languages will be presented. The discriminative power of this LID system can be improved by mapping the phoneme lattice onto a syllable or word sequence using a lexical analyzer and a trigram syllable or word language model. The language identification results with and without the lexical analyzer will be presented.
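The decision rule in the abstract, choosing the language whose combined acoustic and phonemotactic likelihood is highest, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the abstract does not state the form of the phonemotactic model, so an add-one smoothed phoneme bigram stands in for it; the function names `train_bigram` and `identify_language` are hypothetical; and the phoneme strings and acoustic log-likelihoods are invented toy values, not output of the Bell Laboratories recognizer. Scores are combined in the log domain, so the product of the two likelihoods becomes a sum.

```python
import math
from collections import Counter


def train_bigram(sequences, alphabet):
    """Return a scorer giving the add-one smoothed bigram log-probability
    of a phoneme sequence (an assumed stand-in for a phonemotactic model)."""
    pair_counts = Counter()
    context_counts = Counter()
    for seq in sequences:
        padded = ["<s>"] + list(seq)  # "<s>" marks the start of a sequence
        for prev, cur in zip(padded, padded[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    vocab_size = len(alphabet)

    def log_prob(seq):
        padded = ["<s>"] + list(seq)
        total = 0.0
        for prev, cur in zip(padded, padded[1:]):
            numerator = pair_counts[(prev, cur)] + 1
            denominator = context_counts[prev] + vocab_size
            total += math.log(numerator / denominator)
        return total

    return log_prob


def identify_language(hypotheses, phonotactic_models):
    """Pick the language maximizing acoustic + phonemotactic log-likelihood.

    hypotheses maps language -> (acoustic_log_likelihood, phoneme_sequence),
    i.e. what that language's phoneme recognizer produced for the utterance.
    """
    scores = {}
    for lang, (acoustic_ll, phones) in hypotheses.items():
        scores[lang] = acoustic_ll + phonotactic_models[lang](phones)
    best = max(scores, key=scores.get)
    return best, scores


# Toy data (invented): phoneme strings as a grapheme-to-phoneme converter
# might emit them from text in each language.
alphabet = ["a", "k", "o", "s", "t"]
models = {
    "english": train_bigram([["k", "a", "t"], ["s", "t", "o", "k"]], alphabet),
    "spanish": train_bigram([["k", "a", "s", "a"], ["t", "o", "k", "a"]], alphabet),
}

# Each language's recognizer decodes the same utterance with its own models.
hypotheses = {
    "english": (-42.0, ["k", "a", "t"]),
    "spanish": (-40.5, ["k", "a", "s", "a"]),
}
best, scores = identify_language(hypotheses, models)
print(best, scores)
```

In a real system the acoustic term would come from HMM decoding and the phonemotactic term could use longer contexts; the sketch only shows how the two terms combine into a single per-language score.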