1aSC21. Automatic generation of word models using piecewise linear segment lattices.

Session: Monday Morning, December 2


Author: Hiroaki Kojima
Location: Electrotechnical Lab., 1-1-4 Umizono, Tsukuba, Ibaraki, 305 Japan
Author: Kazuyo Tanaka
Location: Electrotechnical Lab., 1-1-4 Umizono, Tsukuba, Ibaraki, 305 Japan


A framework for ``phonological concept formation'' has been proposed with the aim of generating robust speech recognition models [Kojima et al., Proc. ICSLP 92, Vol. 1, pp. 269--272 (1992)]. Within this framework, a ``piecewise linear segment lattice'' model is proposed. A word is represented as a lattice of segments, each of which is described by the regression coefficients of the feature vectors within the segment. Compared with typical stochastic models such as hidden Markov models (HMMs), the model has three advantages: (1) it needs fewer samples to learn; (2) it represents objects with arbitrary precision; and (3) its structure can be changed dynamically with less computation. The generation algorithm is outlined as follows: (1) each sample is divided into segments using dynamic programming (DP), with the number of segments decided by an MDL-like (minimum description length) criterion; (2) the segment sequences of samples of the same word are matched by DP; (3) the division is modified according to the matching scores; and (4) similar (i.e., near) subsequences are picked out and gathered into phonelike clusters. Speaker-independent isolated word recognition is carried out using the proposed models generated under several conditions. The results show that the recognition rate is improved by forming phonelike clusters.
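
Step (1) of the outline can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a one-dimensional feature trajectory instead of feature vectors, a straight-line least-squares fit per segment, and a BIC-style penalty standing in for the unspecified MDL-like criterion. The names `line_fit_error`, `segment_trajectory`, `max_segments`, `min_len`, and `params_per_segment` are hypothetical and chosen only for this example.

```python
import math


def line_fit_error(y, i, j):
    """Residual sum of squares of a least-squares line fitted to y[i:j]."""
    n = j - i
    mean_x = (n - 1) / 2.0
    mean_y = sum(y[i:j]) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y[i + x] - mean_y) for x in range(n))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    return sum((y[i + x] - (intercept + slope * x)) ** 2 for x in range(n))


def segment_trajectory(y, max_segments, min_len=2, params_per_segment=2):
    """Split y into linear segments; return the boundary indices.

    DP finds the minimum-error split for each segment count k, and an
    assumed MDL-like score (fit term plus a model-size penalty) picks k.
    """
    n = len(y)
    inf = float("inf")

    # err[i][j]: cost of modelling y[i:j] as one linear segment.
    err = [[inf] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(i + min_len, n + 1):
            err[i][j] = line_fit_error(y, i, j)

    # best[k][j]: minimum total error when y[:j] is split into k segments.
    best = [[inf] * (n + 1) for _ in range(max_segments + 1)]
    back = [[-1] * (n + 1) for _ in range(max_segments + 1)]
    best[0][0] = 0.0
    for k in range(1, max_segments + 1):
        for j in range(1, n + 1):
            for i in range(j):
                cost = best[k - 1][i] + err[i][j]
                if cost < best[k][j]:
                    best[k][j] = cost
                    back[k][j] = i

    # Assumed MDL-like criterion: data cost plus parameter cost.
    best_k, best_score = None, inf
    for k in range(1, max_segments + 1):
        rss = best[k][n]
        if rss == inf:
            continue
        score = 0.5 * n * math.log(rss / n + 1e-6)
        score += 0.5 * params_per_segment * k * math.log(n)
        if score < best_score:
            best_k, best_score = k, score
    if best_k is None:
        raise ValueError("sequence too short for the given min_len")

    # Trace the chosen split back from the end of the sequence.
    bounds, j = [n], n
    for k in range(best_k, 0, -1):
        j = back[k][j]
        bounds.append(j)
    return bounds[::-1]


# Two linear pieces with a jump between index 5 and index 6.
trajectory = [0, 1, 2, 3, 4, 5, 9, 8, 7, 6, 5, 4]
print(segment_trajectory(trajectory, max_segments=4))  # prints [0, 6, 12]
```

In this sketch the penalty grows with the number of segments, so extra segments are accepted only when they reduce the fitting error enough to pay for their regression coefficients. Steps (2) to (4) of the outline, which match, refine, and cluster the segment sequences, are not covered here.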

ASA 132nd meeting - Hawaii, December 1996