Re: CASA problems and solutions

Dear John and List:

I appreciate your creative contribution to CASA and to the discussion on
this list.  Your method seems like an excellent way of achieving your goal,
which you describe in your essay as, "Our design problem thus begins with
finding a way to use interaural time difference as the primary mode for
locating acoustic sources."  If I understand it correctly, your method
provides an accurate way of calculating differences in the point of spatial
origin of the sounds, rejecting reflections, in order to reconstruct them

It is an excellent beginning.  The reason I use the word "beginning" is that
for humans, and presumably other animals, the use of spatial position is
only one of many ways of solving the scene analysis problem in audition.
This becomes clear when you observe that sounds can be segregated even when
they come around corners, or are heard on a monophonic radio or by a
unilaterally deaf person.  I suspect that to replicate the full range of
human auditory scene analysis (ASA), the attempt to solve the problem
computationally (CASA) will have to use the same range of environmental

Apart from spatial origin, the following sorts of information are used by

(A) For integrating components that arrive overlapped in time:

    1.  harmonic relations
    2.  asynchrony of onset and offset
    3.  spectral separation
    4.  independence of amplitude changes in different
         parts of the spectrum

(B) For integrating components over time:

    5.  Spectral separation
    6.  Separation in time (interacts with other factors)
    7.  Differences in spectral shape
    8.  Differences in intensity (a weak effect)
    9.  Abruptness/smoothness of transition from one sound
         to the next

(I have attached a 2-page summary of what is known about ASA in humans.  As
well as mentioning factors 1 to 9, it describes the effects of ASA on the
experience of the listener. I have used it as a handout in talks I have
given. It is in RTF format which should be readable by most versions of

I'm not sure whether your rejection of the Fourier method extends to all
methods of decomposing the input into spectral components.  However, if it
does, we should bear in mind that factors 3, 4, 5, 7, and probably 1,
listed above, are most naturally stated on a frequency x time
representation -- that is, on a spectrogram or something like it.
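
As a rough illustration of what such a frequency x time representation
looks like in practice, here is a minimal sketch in Python using a
short-time Fourier transform.  It is only one possible way to form the
frequency dimension, not a claim about how the auditory system or any
particular CASA system does it.  The function name, the frame length, the
hop size, the Hann window, and the two test tones are all assumptions
chosen for the example.

```python
import numpy as np

def spectrogram(signal, sample_rate, frame_len=256, hop=128):
    """Magnitude short-time Fourier transform.

    Returns (freqs, times, mags), where mags has one row per frequency
    bin and one column per time frame.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # One real FFT per frame; transpose so frequency runs down the rows.
    mags = np.abs(np.fft.rfft(frames, axis=1)).T
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    times = (np.arange(n_frames) * hop + frame_len / 2) / sample_rate
    return freqs, times, mags

# Hypothetical test signal: a 500 Hz tone throughout, plus a 2000 Hz tone
# that starts half a second later.  The two components differ in onset
# time (factor 2) and are well separated in frequency (factor 3).
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate          # one second of samples
low = np.sin(2 * np.pi * 500 * t)
high = np.sin(2 * np.pi * 2000 * t) * (t >= 0.5)
freqs, times, mags = spectrogram(low + high, sample_rate)
```

In the resulting array the 500 Hz component appears as a ridge spanning
every time frame, while the 2000 Hz ridge appears only in the later
frames, so both the spectral separation and the onset asynchrony can be
read directly off the two axes.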

Furthermore, when you look at a spectrographic representation of an auditory
signal, the visual grouping that occurs is often directly analogous to the
auditory organization (provided that the time and frequency axes are
properly scaled).  Why would this be so if some sort of frequency axis were
not central to auditory perception, playing a role analogous to a spatial
dimension in vision?  Perhaps the Fourier transform is not the best
approach to forming this frequency dimension, but something that does a
similar job is required.  Finally, there is overwhelming physiological
evidence that the human nervous system does a frequency analysis of the
sound and retains separate frequency representations all the way to the

I understand that your goal is not necessarily to separate signals the way
people do, but the long-term goal of CASA should be to reproduce the full
range of accomplishments of human ASA.

Perhaps I have missed some of the consequences of your method.  If so, I
would be happy to be corrected.

Best wishes,

Albert S. Bregman, Emeritus Professor
Dept of Psychology, McGill University
1205 Docteur Penfield Avenue
Montreal, QC, Canada H3A 1B1

     Phone:  +1 (514) 398-6103
     Fax: +1 (514) 398-4896
     Phone & Fax: +1 (514) 484-2592

----- Original Message -----
From: John K. Bates <jkbates@COMPUTER.NET>
Sent: 29-Jan-01 1:46 PM
Subject: CASA problems and solutions

> Dear List,
>    You may have noticed that contributions on CASA have dropped to near
> zero after the enthusiasm of the early 1990s. A few people have suggested

>   John Bates
> Time/Space Systems
> Pleasantville, New York

Attachment: Bregman handout.rtf
Description: MS-Word document