
How many sources can humans perceive? / Number of concurrent streams

Dear Al and others,

Thank you for clarifying the question by making an important distinction
between reporting the number of sources or individual streams based on one
time (on-line) listening vs. repeated listening not constrained by time.
In my preliminary highly informal experimenting with concurrent streams, I
combined up to 6 tone and noise pulse trains each having different rates
and pulse widths.  Several people who I asked to tell me the number of
independent streams were quite accurate when not constrained by time (as
Pierre Divenyi warned me they would be).  I did not control for level
effects specifically, but it did seem pretty clear that level would have an
effect on the ease of "hearing out" a particular stream.  On the other
hand, when my stimuli were bouncing ping-pong balls, it was much harder to
judge the number of bouncing balls beyond 2 or 3, and repeated listening
did not seem very helpful (although this was done very informally).  I have
not tried letting people listen to the bouncing patterns of individual
balls, and do not know whether that would improve their accuracy in estimating the
number.  Similarly, it is possible that having other bouncing objects (with
different resonant characteristics) could change this result.
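
For anyone who wants to try something similar, here is a minimal sketch of
how a mixture of tone and noise pulse trains with different rates and pulse
widths could be generated.  It is not the procedure actually used for the
informal listening described above: the sample rate, the rates, widths and
tone frequencies in STREAMS, and all function names are assumptions made up
for illustration.

```python
import math
import random
import struct
import wave

SAMPLE_RATE = 8000  # Hz; assumed value, kept low so the sketch runs quickly

# (kind, pulses per second, pulse width in s, tone frequency in Hz)
# All values are hypothetical; the message does not give the real parameters.
# The frequency entry is ignored for noise streams.
STREAMS = [
    ("tone", 2.0, 0.100, 300.0),
    ("tone", 3.0, 0.060, 700.0),
    ("tone", 5.0, 0.040, 1500.0),
    ("noise", 1.5, 0.150, 0.0),
    ("noise", 4.0, 0.050, 0.0),
    ("noise", 7.0, 0.020, 0.0),
]


def pulse_train(kind, rate_hz, width_s, dur_s, freq_hz=440.0, seed=0):
    """Return one pulse train as a list of samples in [-1, 1]."""
    rng = random.Random(seed)                # seeded so noise is repeatable
    n = int(dur_s * SAMPLE_RATE)
    period = round(SAMPLE_RATE / rate_hz)    # samples between pulse onsets
    width = round(width_s * SAMPLE_RATE)     # samples per pulse
    out = []
    for i in range(n):
        if i % period >= width:              # gap between pulses: silence
            out.append(0.0)
        elif kind == "tone":                 # sine carrier inside the pulse
            out.append(math.sin(2.0 * math.pi * freq_hz * i / SAMPLE_RATE))
        else:                                # uniform white noise in the pulse
            out.append(rng.uniform(-1.0, 1.0))
    return out


def build_mixture(n_streams, dur_s=2.0):
    """Sum the first n_streams pulse trains, scaled to stay within [-1, 1]."""
    trains = [
        pulse_train(kind, rate, width, dur_s, freq_hz=freq, seed=k)
        for k, (kind, rate, width, freq) in enumerate(STREAMS[:n_streams])
    ]
    scale = 1.0 / len(trains)
    return [scale * sum(samples) for samples in zip(*trains)]


def write_wav(path, samples):
    """Write mono 16-bit PCM so the mixture can be auditioned."""
    frames = struct.pack("<%dh" % len(samples),
                         *(int(32767 * s) for s in samples))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(SAMPLE_RATE)
        f.writeframes(frames)
```

Calling write_wav("mixture.wav", build_mixture(6)) would produce a two-second
file with all six trains; build_mixture(3) gives a sparser mixture for
comparison.  All streams are mixed at equal level here, so level effects
would have to be added by weighting the individual trains.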

In most everyday listening, we still typically do not have the option of
rewinding the auditory scene and listening to it again, other than doing
this in our memory.  Despite the proliferation of monotonously repeating
machinery and electro-acoustically reproduced sound, there are still a lot
of dynamic changes in our acoustic environments.  In order to function
well, from basic biological to social levels, we need to "perceive" sound
as it happens.  Thus, while the "matching a standard" method
can reveal what we are capable of when hearing out individual streams from
a mixture, it may not be an ideal way to learn about our everyday listening
abilities, unaided by technology.  One can, of course, argue that what we
are doing when listening to a mixture of sound sources is, in fact, some
kind of a matching procedure with "a standard" somehow stored in our
memory.  But I think even if that is the case, "the standard" is
represented at a different level of abstraction than in Al's "matching a
standard" example.  Also, this seems a very different take on the question
of how many streams/sound sources one can process concurrently, since only
one stream is actually being processed.  I realize that defining "process"
is a very thorny and probably decisive issue, just like the definition of
"perceive".  I have not yet offered even an operational definition or an
experimental task, but I believe working on this would be pretty useful for
a variety of practical, if not theoretical, ends.  Although there may not
be an absolute concurrent stream processing limit, I am interested in
knowing how many concurrent but separate streams/sources people can handle
in real time and what the depth of processing constraints may be.
Hopefully, there is a method that can help to find a way out of this
terminological circularity.

Best regards,
Valeriy Shafiro
Communication Disorders and Sciences
Rush University Medical Center
Chicago, IL

office (312) 942 - 3298
lab    (312) 942 - 3316
email: valeriy_shafiro@rush.edu

-----"Al Bregman" <bregman@hebb.psych.mcgill.ca> wrote: -----

To: <Valeriy_Shafiro@rsh.net>, "AUDITORY" <auditory@lists.mcgill.ca>
From: "Al Bregman" <bregman@hebb.psych.mcgill.ca>
Date: 05/02/2004 06:08PM
Subject: Number of concurrent streams.

Dear Valeriy and List,

(Sorry if you get this twice.  Apparently the AUDITORY listserv
didn't like my new email address and may or may not have sent
this message to the list.)

I don't think that the question about how many sound sources
people can perceive in a mixture has a simple answer.

First of all, it depends on what you mean by "perceive".  If you
mean how many sound sources they can report, then it depends on
how much time they are given.  The longer they listen, the more
they can report.  This could mean that they can pay close
attention to only one sound at a time, but can shift their
attention from one source to another.  If you mean, "How many
could they report if given a sample of each one in turn and asked
whether they could detect it in the mixture, given all the time
they needed?", the question becomes one about blending and
segregation at a very basic level. (Let's call this last method,
"matching a standard").  Often I haven't been able to hear a
sound in a mixture until I knew what it was.

Assuming that one is using the "matching a standard" method,
then one's success will depend on what the sounds are and how
intense they are.  Obviously a weak sound may be hard to
detect -- it may be masked (psychoacoustically or
informationally) by the others.  Even if this is not true, the
similarity of the component sound sources plays an important
role.  Those researchers who have claimed that only a small
number of sources (3 or 4) can be detected are all referring to
sets of sounds that resemble one another, such as multiple
talkers or singers, or rhythmically playing instruments.

Consider the following set of sounds:
- a person talking
- randomly spaced hits on a bass drum (greatly attenuated)
- an ambulance siren
- jangling of a set of house keys
- a pure tone playing Morse code
- a person typing on an electric typewriter
- the sound of a motorcycle whizzing by (greatly attenuated)

As long as the intensities were balanced appropriately, I think
you would eventually detect all of them by the method of matching
a standard.

As many of the list members know, I believe that there is a
perceptual stage that assigns links among the parts of the
incoming mixture prior to further processing by the mechanisms
that we call attention.
If what we are asking about is whether this pre-attentive process
has some limit concerning the number of discrete subsets
(potential streams) it can form, we would have to observe its
operation without any contribution from attention.  I believe
this is impossible using the standard methods of psychoacoustics.
Rather, it has to be addressed using a physiological approach.
Some beginnings toward doing this have been carried out by Elyse
Sussman and by Claude Alain (working independently), and there
may be others who I don't know about.

By the way, I referred to the output of the pre-attentive
mechanism as "potential" streams because there is good reason to
believe that top-down processes play a big role in determining
the actually heard streams.

Sorry the answer couldn't have been simpler.


Albert S. Bregman,
Emeritus Professor
Psychology Dept., McGill University
1205 Docteur Penfield Avenue
Montreal, Quebec
Canada  H3A 1B1

    Voice: +1 (514) 398-6103
    Fax:     +1 (514) 398-4896

----- Original Message -----
From: "Valeriy Shafiro" <Valeriy_Shafiro@RUSH.EDU>
Sent: Friday, April 30, 2004 3:26 PM
Subject: Re: Computational ASA -- how many sources can humans

> I would like to ask a further question: Do we, in fact, know how many
> independent sound sources in a mixture humans can perceive?  Thus far I
> know of only one research report where human listeners were asked to
> identify sound sources in a recorded "real-world" sound mixture (Ellis, D.
> P. (1996). Prediction-driven computational auditory scene analysis).  We
> have been talking about this issue with Brian Gygi, and from the few
> related reports that Brian found, it appears that humans may not be that
> good at simultaneously perceiving independent sound sources.  For example,
> Jennifer Tufts and Tom Frank (J. Acoust. Soc. Am. 101, 3107 (1997)) found
> that the accuracy of judging the number of talkers in a multitalker mixture
> drops considerably when there are more than 3 talkers.  There is also a
> report by David Huron (Music Perception, Vol. 19, No. 1 (2001), pp. 1-64,
> or on-line) that estimating the number of musical lines in
> polyphonic music worsens considerably after 3.  Some anecdotal evidence
> for this limit also comes from movie sound effect designers.  This is a
> citation from Walter Murch, a renowned sound effect artist: "There is a rule
> of thumb I use which is never to give the audience more than two-and-a-half
> things to think about aurally at any one moment. Now, those moments can
> shift very quickly, but if you take a five-second section of sound and feed
> the audience more than two-and-a-half conceptual lines at the same time,
> they can't really separate them out. There's just no way to do it, and
> everything becomes self-canceling." (cited from
> http://www.filmsound.org/murch/waltermurch.htm)
> Any thoughts, comments, and references relevant to this issue
> appreciated.
> -------------------------------------------------------------
> Valeriy Shafiro
> Communication Disorders and Sciences
> Rush University Medical Center
> Chicago, IL
> office (312) 942 - 3298
> lab    (312) 942 - 3316
> email: valeriy_shafiro@rush.edu