ITDs, grouping and auditory attention (Chris Darwin)

Subject: ITDs, grouping and auditory attention
From:    Chris Darwin  <cjd(at)BIOLS.SUSX.AC.UK>
Date:    Fri, 16 Oct 1998 11:44:06 +0100

On Fri, 16 Oct 1998 11:05:01 +0200 tp(at)DAT.DTU.DK (Torben Poulsen) wrote:

>Bob Bolia wrote...
>..... Has anyone ever done any cocktail party experiments with talkers
>separated by as little as 1 degree?

There are some recent experiments from Summerfield and Culling at IHR, and from our group at Sussex, that provide evidence on the use of ITDs in grouping and on the use of spatial location, cued by ITD alone, in auditory attention.

The bottom line is that although ITD cannot be used by itself to selectively group simultaneous sounds, the location (specified by ITD alone) of sounds that can be grouped by other cues allows good selective attention. For pitch-processed natural sentences, selective attention performance to synchronised words on the same Fo embedded in a sentence is around 85% at ±45 µs and has asymptoted at ±90 µs.

There is also independent evidence from Nick Hill (using the Jeffress / Trahiotis & Stern effect) that perceptual grouping precedes the localisation of complex sounds.

Details follow for aficionados. Apologies to non-aficionados.

ABSENCE OF GROUPING OF SIMULTANEOUS SOUNDS BY ITD ALONE

The seminal paper is:

Culling, J. F. & Summerfield, Q. (1995). Perceptual separation of concurrent speech sounds: absence of across-frequency grouping by common interaural delay. Journal of the Acoustical Society of America 98, 785-797.

They showed that if listeners hear four formant-like noise bands that can make different pairs of vowels depending on how they are combined, the vowels that listeners hear are not influenced by which pairs of noise bands have the same ITD. This is true for ITDs of around ±600 µs.

A similar conclusion, confirming that ITD alone cannot produce simultaneous grouping, also comes from:

Hukin, R. W. & Darwin, C. J. (1995). Effects of contralateral presentation and of interaural time differences in segregating a harmonic from a vowel.
Journal of the Acoustical Society of America 98, 1380-1387.

The story becomes a bit more complicated when ITD is combined with other cues. The general conclusion is that ITD can augment segregation that has been produced by other cues, but is very ineffective for simultaneous sounds by itself.

Darwin, C. J. & Hukin, R. W. (1998). Perceptual segregation of a harmonic from a vowel by inter-aural time difference in conjunction with mistuning and onset-asynchrony. Journal of the Acoustical Society of America 103, 1080-1084.

GROUPING PRECEDES LOCALISATION OF COMPLEX SOUNDS

The main evidence here is that the across-frequency comparison of ITDs demonstrated by Jeffress and by Trahiotis and Stern is disrupted by grouping cues such as inharmonicity and onset time.

Hill, N. I. & Darwin, C. J. (1993). Effects of onset asynchrony and of mistuning on the lateralization of a pure tone embedded in a harmonic complex. Journal of the Acoustical Society of America 93, 2307-2308.

The effect of onset asynchrony and mistuning on the binaural processing of multi-tone stimuli was investigated using a paradigm derived from that of Trahiotis and Stern [C. Trahiotis and R. M. Stern, J. Acoust. Soc. Am. 86, 1285-1293 (1989)]. A tonal complex comprising harmonics 2 to 8 of 100 Hz and presented with an ITD of 1.5 ms gave rise to a single image lateralized towards the ear receiving the leading signal. However, when the central 500-Hz component was delayed by 40 ms relative to the flanking tones, it was heard out as a separate tone shifted towards the opposite side of the head. Similar effects were observed when the 500-Hz component was mistuned from the flanking complex, with shifts of ±3% being sufficient for the mistuned component to be lateralized in the vicinity of the mid-line.
The results demonstrate that both onset asynchrony and mistuning influence the set of frequency components across which ITD information is integrated.

ATTENDING BY ITD TO SPEECH

Rob Hukin and I have a paper showing that listeners can selectively attend to a sentence specified by ITD (Darwin, C. J. & Hukin, R. W. (in press). Auditory objects of attention: the role of interaural time-differences. Journal of Experimental Psychology: Human Perception and Performance). I would be happy to send the paper to anyone requesting it - JEP publication lag is very long! Edited abstract follows.

It shows that listeners can use small differences in ITD between two sentences to track a particular sentence over time, in the absence of talker or fundamental frequency differences between the sentences. They are able to say which of two short, constant target words is part of the attended sentence at a level well above chance when one sentence is presented with an ITD of +45 µs and the other with an ITD of -45 µs. At ±91 µs, performance has almost asymptoted at over 90% correct. By contrast, when a difference in fundamental frequency (Fo) of 1, 2 or 4 semitones is the only cue that defines which is the correct target word, performance is only just above chance. When both cues are present, ITD very substantially dominates over Fo.

The other two experiments provide evidence that listeners in the first experiment were not explicitly tracking components that shared a common ITD. Their inability to segregate a harmonic from a target vowel by ITD was not substantially changed by the vowel being placed in a sentence context, where the sentence shared the same ITD as the rest of the vowel. By contrast, segregation of the harmonic from the vowel by an interaural level difference (ILD) was larger than segregation by ITD for the isolated vowel, and was very substantially increased by a sentence context that had the same ILD as the rest of the vowel.
The results of these experiments support the contention that, in following a particular auditory sound source over time, listeners attend to perceived auditory objects at particular azimuthal positions rather than attending explicitly to those frequency components that share a common ITD.

Some of the arguments on these issues are summarised in Darwin, C. J. (1997). Auditory grouping. Trends in Cognitive Sciences 1, 327-333.

End of self publicity.

=============================================
Chris Darwin, Experimental Psychology,
University of Sussex, Brighton BN1 9QG, UK
Phone: +44-1273-678409; FAX: +44-1273-678611
Email: c.j.darwin(at)
Home Page:
=============================================
