Elmer L. Hixson
Dept. of Elec. Eng., Univ. of Texas at Austin, Austin, TX 78712
Because speech data are inherently redundant, the design of a redundancy-reducing speech preprocessor is very important; such a preprocessor can also greatly reduce the computational load on the later stages of speech processing. A laboratory-oriented method of speech data acquisition, called near-field spectral wave-number estimation, is implemented. The method uses multiple microphones, and its goal is to incorporate air flow velocity into the speech feature vector. This extra feature is used in addition to the short-time cepstrum of the sound data to form the final speech vectors. The speech vectors are then quantized into a predetermined number of categories using a self-organizing neural network. These quantized, extended vectors are then used to model higher-level speech constructs such as phonemes and words. The preprocessing scheme reduced the computational complexity considerably at the expense of a slight reduction in recognition accuracy.
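The feature construction and quantization steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (real_cepstrum, extended_vector, train_som, quantize), the Hamming window, the number of cepstral coefficients, and the choice of a one-dimensional Kohonen map as the self-organizing network are all assumptions made for illustration. The air flow velocity is taken as an already estimated scalar per frame; the wave-number estimation from multiple microphones is not modeled here.

```python
import cmath
import math
import random


def real_cepstrum(frame, n_coeffs):
    """Short-time real cepstrum: inverse DFT of the log magnitude spectrum."""
    n = len(frame)
    # Hamming window (an assumed choice; the abstract does not name one).
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    # Direct O(n^2) DFT keeps the sketch dependency-free.
    spectrum = [sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)) for k in range(n)]
    # Small offset avoids log(0) for empty spectral bins.
    log_mag = [math.log(abs(s) + 1e-12) for s in spectrum]
    # log_mag is real and even, so the inverse DFT reduces to a cosine sum.
    return [sum(log_mag[k] * math.cos(2 * math.pi * k * q / n)
                for k in range(n)) / n for q in range(n_coeffs)]


def extended_vector(frame, flow_velocity, n_coeffs=8):
    """Append the (assumed, precomputed) air flow velocity to the cepstrum."""
    return real_cepstrum(frame, n_coeffs) + [flow_velocity]


def quantize(vector, units):
    """Return the index of the nearest map unit (squared Euclidean distance)."""
    return min(range(len(units)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(vector, units[j])))


def train_som(vectors, n_units, epochs=20, lr0=0.5, seed=0):
    """Train a 1-D Kohonen self-organizing map with n_units categories."""
    rng = random.Random(seed)
    # Initialize units from randomly chosen training vectors.
    units = [list(v) for v in rng.sample(vectors, n_units)]
    radius0 = max(n_units / 2.0, 1.0)
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                      # decaying learning rate
        radius = max(radius0 * (1.0 - frac), 0.5)    # shrinking neighborhood
        for v in rng.sample(vectors, len(vectors)):  # shuffled pass
            winner = quantize(v, units)
            for j, u in enumerate(units):
                # Gaussian neighborhood around the winning unit.
                h = math.exp(-((j - winner) ** 2) / (2 * radius ** 2))
                for d in range(len(u)):
                    u[d] += lr * h * (v[d] - u[d])
    return units
```

In such a scheme, each frame would be replaced by the integer returned by quantize, so that later phoneme or word models operate on a small discrete alphabet rather than on real-valued vectors, which is where the reduction in computation would come from.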