
ITDs, grouping and auditory attention

To: AUDITORY@xxxxxxxxxxxxxxx
Subject: ITDs, grouping and auditory attention
From: Chris Darwin <cjd@xxxxxxxxxxxxxxxx>
Date: Fri, 16 Oct 1998 11:44:06 +0100
Comments: cc: Quentin Summerfield <aqs@ihr.mrc.ac.uk>, John Culling <CullingJ@cardiff.ac.uk>, Nick Hill <nih1@pump2.york.ac.uk>, Rob Hukin <robh@biols.susx.ac.uk>
In-reply-to: <003701bdf8e4$0f91fea0$884ee182@tp.dat.dtu.dk>
Reply-to: cjd@xxxxxxxxxxxxxxxx
Sender: AUDITORY Research in Auditory Perception <AUDITORY@xxxxxxxxxxxxxxx>

On Fri, 16 Oct 1998 11:05:01 +0200 tp@DAT.DTU.DK (Torben Poulsen)
wrote:

>Bob Bolia wrote...
>..... Has anyone ever done any cocktail party experiments with talkers
>separated by as little as 1 degree?

There are some recent experiments from Summerfield and Culling at IHR,
and by our group at Sussex, that provide some experimental evidence on
the use of ITDs in grouping, and on the use of spatial location
(specified by ITD alone) in auditory attention.

The bottom line is that although ITD cannot be used by itself to
selectively group simultaneous sounds, the location (specified by ITD
alone) of sounds that can be grouped by other cues allows good selective
attention.  For pitch-processed natural sentences, selective attention
performance for synchronised words on the same Fo embedded in a sentence
is around 85% at ±45 µs and has asymptoted by ±90 µs.  There is also
independent evidence from Nick Hill (using the Jeffress / Trahiotis &
Stern effect) that perceptual grouping precedes the localisation of
complex sounds.
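
For readers who want to relate the ITDs quoted here to the angular
separation in the original question, the following is a rough sketch
using the Woodworth rigid-sphere approximation, ITD = (a/c)(theta +
sin theta).  The head radius (8.75 cm), the speed of sound (343 m/s)
and the function name are assumptions for illustration only; none of
them come from the experiments described in this message.

```python
import math

def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Approximate ITD in seconds for a distant source at the given azimuth.

    Woodworth rigid-sphere model, intended for azimuths of 0 to 90 degrees.
    The default head radius and speed of sound are assumed values.
    """
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))

if __name__ == "__main__":
    for az in (1, 5, 10):
        print(f"{az:>2} deg -> {woodworth_itd(az) * 1e6:.1f} us")
```

Under these assumed values the model gives roughly 9 µs for a source
1 degree off the midline.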

Details follow for aficionados.

Apologies to non-aficionados.

ABSENCE OF GROUPING OF SIMULTANEOUS SOUNDS BY ITD ALONE.

The seminal paper is:
Culling, J. F. & Summerfield, Q. (1995). Perceptual separation of
concurrent speech sounds: absence of across-frequency grouping by common
interaural delay. Journal of the Acoustical Society of America 98,
785-797. 

They showed that if listeners hear four formant-like noise bands that
can make different pairs of vowels depending on how they are combined,
the vowels that listeners hear are not influenced by which pairs of
noise bands have the same ITD.  This is true for ITDs of around ±600 µs.

A similar conclusion - confirming that ITD alone cannot produce
simultaneous grouping - also comes from:

Hukin, R. W. & Darwin, C. J. (1995). Effects of contralateral
presentation and of interaural time differences in segregating a
harmonic from a vowel. Journal of the Acoustical Society of America 98,
1380-1387. 

The story becomes a bit more complicated when ITD is combined with other
cues.  The general conclusion is that ITD can augment segregation that
has been produced by other cues, but is very ineffective by itself for
simultaneous sounds.

Darwin, C. J. & Hukin, R. W. (1998). Perceptual segregation of a
harmonic from a vowel by inter-aural time difference in conjunction with
mistuning and onset-asynchrony. Journal of the Acoustical Society of
America 103,  1080-1084. 

GROUPING PRECEDES LOCALISATION OF COMPLEX SOUNDS

The main evidence here is that the across-frequency comparison of ITDs
that was demonstrated by Jeffress and by Trahiotis and Stern is
disrupted by grouping cues such as inharmonicity and onset time.

Hill, N. I. & Darwin, C. J. (1993). Effects of onset asynchrony and of
mistuning on the lateralization of a pure tone embedded in a harmonic
complex. Journal of the Acoustical Society of America 93, 2307-2308.

The effect of onset asynchrony and mistuning on the binaural processing
of multi-tone stimuli was investigated using a paradigm derived from
that of Trahiotis and Stern [C. Trahiotis and R. M. Stern, J. Acoust.
Soc. Am. 86, 1285-1293 (1989)]. A tonal complex comprising harmonics 2
to 8 of 100 Hz and presented with an ITD of 1.5 ms gave rise to a single
image lateralized towards the ear receiving the leading signal. However,
when the central 500-Hz component was delayed by 40 ms relative to the
flanking tones, it was heard out as a separate tone shifted towards the
opposite side of the head. Similar effects were observed when the 500-Hz
component was mistuned from the flanking complex, with shifts of ±3%
being sufficient for the mistuned component to be lateralized in the
vicinity of the mid-line. The results demonstrate that both onset
asynchrony and mistuning influence which frequency components ITD
information is integrated across.
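
For anyone wanting to listen to something along these lines, here is a
minimal sketch of how a stimulus of this general kind might be
synthesised.  Only the harmonic numbers, the 100-Hz fundamental, the
1.5-ms ITD, the 40-ms delay and the 3% mistuning are taken from the
abstract above.  The duration, sample rate, equal component amplitudes,
absence of onset ramps, the choice of the left ear as leading, and all
function and parameter names are assumptions, so this is not the
procedure used in the experiment.

```python
import math

def harmonic_complex(f0=100.0, harmonics=range(2, 9), itd_s=1.5e-3,
                     target_harmonic=5, target_delay_s=0.0, mistune_pct=0.0,
                     dur_s=0.4, fs=44100):
    """Return (left, right) lists of samples; the left ear leads by itd_s.

    The target harmonic (500 Hz by default) can be started later than the
    flanking tones (target_delay_s) and/or mistuned (mistune_pct).
    """
    n = int(dur_s * fs)
    left = [0.0] * n
    right = [0.0] * n
    for h in harmonics:
        freq = h * f0
        onset = 0.0
        if h == target_harmonic:
            freq *= 1.0 + mistune_pct / 100.0
            onset = target_delay_s
        for i in range(n):
            t_left = i / fs - onset      # time since this component began
            t_right = t_left - itd_s     # right ear gets a delayed copy
            if t_left >= 0.0:
                left[i] += math.sin(2.0 * math.pi * freq * t_left)
            if t_right >= 0.0:
                right[i] += math.sin(2.0 * math.pi * freq * t_right)
    return left, right

if __name__ == "__main__":
    # Baseline complex, then the two manipulations named in the abstract.
    baseline = harmonic_complex()
    delayed = harmonic_complex(target_delay_s=0.040)
    mistuned = harmonic_complex(mistune_pct=3.0)
    print(len(baseline[0]), len(delayed[0]), len(mistuned[0]))
```

The delay is applied to the whole waveform of each component (onset and
fine structure together), which is one simple way of imposing an ITD;
other choices are possible.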

ATTENDING BY ITD TO SPEECH

Rob Hukin and I have a paper showing that listeners can selectively
attend to a sentence specified by ITD (Darwin, C. J. & Hukin,
R. W. (in press). Auditory objects of attention: the role of interaural
time-differences. Journal of Experimental Psychology: Human Perception
and Performance).  I would be happy to send the paper to anyone
requesting it - JEP publication lag is very long!  Edited abstract
follows.

It shows that listeners can use small differences in ITD between two
sentences to track a particular sentence over time, in the absence of
talker or fundamental frequency differences between the sentences.  They
are able to say which of two, short, constant, target words is part of
the attended sentence at a level well above chance when one sentence is
presented with an ITD of +45 µs and the other with an ITD of -45 µs.  At
±91 µs, performance has almost asymptoted at over 90% correct.  By
contrast, when a difference in fundamental frequency (Fo) of 1, 2 or 4
semitones is the only cue that defines which is the correct target word,
performance is only just above chance.  When both cues are present, ITD
very substantially dominates over Fo.  The other two experiments provide
evidence that listeners in the first experiment were not explicitly
tracking components that shared a common ITD.  Their inability to
segregate a harmonic from a target vowel by ITD was not substantially
changed by the vowel being placed in a sentence context, where the
sentence shared the same ITD as the rest of the vowel.  By contrast,
segregation of the harmonic from the vowel by an interaural level
difference (ILD) was larger than segregation by ITD for the isolated
vowel, and was very substantially increased by a sentence context that
had the same ILD as the rest of the vowel.  The results of these
experiments support the contention that, in following a particular
auditory sound source over time, listeners attend to perceived auditory
objects at particular azimuthal positions rather than attending
explicitly to those frequency components that share a common ITD.
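
To give a feel for the size of the cues in this abstract, the sketch
below converts an ITD in microseconds into a delay in samples and a
semitone difference into a frequency ratio.  The 44.1-kHz sample rate
and the function names are assumptions for illustration; the abstract
does not say how the stimuli were generated.

```python
def itd_in_samples(itd_us, fs=44100):
    """Delay in (possibly fractional) samples for an ITD in microseconds.

    The default sample rate is an assumed value.
    """
    return itd_us * 1e-6 * fs

def semitone_ratio(semitones):
    """Frequency ratio for an interval in equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)

if __name__ == "__main__":
    for itd_us in (45, 91):
        print(f"{itd_us} us -> {itd_in_samples(itd_us):.2f} samples")
    for st in (1, 2, 4):
        print(f"{st} semitone(s) -> Fo ratio {semitone_ratio(st):.4f}")
```

At the assumed sample rate, 45 µs and 91 µs come out close to 2 and 4
samples respectively, and Fo differences of 1, 2 and 4 semitones
correspond to ratios of about 1.06, 1.12 and 1.26.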

Some of the arguments on these issues are summarised in Darwin, C. J.
(1997). Auditory grouping. Trends in Cognitive Sciences 1, 327-333.

End of self-publicity.

=============================================
Chris Darwin, Experimental Psychology, 
University of Sussex, Brighton BN1 9QG, UK
Phone: +44-1273-678409;  FAX: +44-1273-678611
Email: c.j.darwin@biols.susx.ac.uk
Home Page: http://www.biols.susx.ac.uk/Home/Chris_Darwin/
=============================================
