James S. Magnuson
Howard C. Nusbaum
Dept. of Psychol., Univ. of Chicago, 5848 S. University Ave., Chicago, IL 60637
Speech recognition performance is generally worse for utterances produced by a mix of several talkers than for utterances produced by a single talker. This performance impairment can be attributed to those aspects of talker normalization used to determine the vocal characteristics of the talker each time the talker changes. The present study investigated the size and nature of talker differences that may affect normalization. Spoken words were generated by a text-to-speech system for matched pairs of synthetically defined talkers. All but two of these pairs differed only in average fundamental frequency. Of the two remaining pairs, one differed in perceived gender but had the same average pitch, and the other differed in both gender and pitch. Response times in a speeded word recognition task were compared for blocks of stimuli produced by a single talker and blocks of stimuli produced by a mix of the two talkers in one pair. The results are important for understanding how listeners use pitch differences between talkers during talker normalization.