Dept. of Linguist., Ohio State Univ., 222 Oxley Hall, 1712 Neil Ave., Columbus, OH 43210-1292
Individual differences in the speech acoustic waveform create complications for theories of human speech perception and auditory word recognition, as well as for automatic computer speech recognition systems. Theories of vowel perception, often taking the form of scatter-reduction techniques more or less related to possible auditory or cognitive mechanisms, have been proposed to deal with individual differences. Word recognition theorists are also beginning to grapple with the problem of talker variability and implicit memory for talker-specific acoustic patterns. Talker variability is likewise one of the central problems in automatic computer speech recognition, where the dichotomy between speaker-dependent and speaker-independent systems has recently been augmented by hybrid speaker-adaptive systems. It has become apparent that all three of these areas of research are converging on the same conclusion: that speaker variability is best handled by rich, speaker-specific representations. In speech perception, this viewpoint has been called ``indirect'' speaker normalization. In the theory of word recognition, exemplar-based models of memory have been proposed to account for talker variability, and in automatic speech recognition a similar strategy is evident in multi-model systems and data-augmentation approaches to speaker adaptation.
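As an illustration of the scatter-reduction family of vowel normalization techniques mentioned above, the sketch below implements Lobanov-style z-score normalization, one classic approach: each talker's formant frequencies are standardized within that talker, collapsing talker-specific differences in vocal-tract scale so that vowel categories from different talkers overlap more closely. The talker names and formant values here are hypothetical, chosen only to demonstrate the effect; this is a minimal sketch, not the specific procedure discussed in the abstract.

```python
from statistics import mean, stdev

def lobanov_normalize(formants_by_talker):
    """Map {talker: [(F1, F2), ...]} (Hz) to within-talker z-scores.

    Standardizing each formant dimension per talker removes talker-specific
    location and scale differences, the "scatter" that normalization
    theories aim to reduce.
    """
    normalized = {}
    for talker, tokens in formants_by_talker.items():
        f1s = [f1 for f1, _ in tokens]
        f2s = [f2 for _, f2 in tokens]
        m1, s1 = mean(f1s), stdev(f1s)
        m2, s2 = mean(f2s), stdev(f2s)
        normalized[talker] = [((f1 - m1) / s1, (f2 - m2) / s2)
                              for f1, f2 in tokens]
    return normalized

# Hypothetical data: talker_b's formants are a uniform rescaling of
# talker_a's (e.g., a shorter vocal tract), so their raw vowel spaces
# do not overlap even though the vowel categories are "the same."
data = {
    "talker_a": [(300, 2300), (700, 1200), (500, 1700)],
    "talker_b": [(390, 2760), (910, 1440), (650, 2040)],
}
norm = lobanov_normalize(data)
# After normalization the two talkers' vowel tokens coincide exactly,
# since z-scoring is invariant to linear rescaling.
```

Note that this is an example of the "direct" scatter-reduction strategy; the exemplar-based alternative the abstract describes would instead retain the raw talker-specific tokens in memory rather than transforming them away.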