Re: Granular synthesis and auditory segmentation

[BTW, for clarity of discussion: the following is about stationary
 sounds only, where onsets, decay etc do not play a role, because
 the discussion of temporal processing in non-stationary sounds is
 worth a separate discussion, and things are already quite tricky
 without involving non-stationary sound. I add this note because
 the original subject of granular synthesis, and certainly my own
 application of that, would normally deal with non-stationary sounds.]

I wrote (i.e., Peter Meijer)

> I'd love to hear about *psychophysical* auditory perception
> experiments that unambiguously demonstrate temporal processing
> in humans in the 3 to 5 kHz range! My expectation is that such
> results have not been found...

Peter Cariani replied

> Few psychophysical experiments unambiguously demonstrate that
> a particular neural mechanism is used, because there are
> many possible neural mechanisms that can carry out the
> same function. What they sometimes do, however, is show
> that sole reliance on a given kind of neural information is not
> sufficient to account for perceptual capability (which rules
> out that coding scheme) or that psychophysical judgements
> covary with the availability of particular kinds of
> neural information (which suggests but does not prove that
> that particular information is used).

Granted! I agree with your refinements. Hence, I would now equally
love to hear about *psychophysical* auditory perception experiments
that (hopefully) unambiguously demonstrate (or at least make highly
plausible) that a "place theory of hearing" is *in*sufficient
(cannot fully account for what happens) in the 3 to 5 kHz range.
This is a much weaker requirement than the one formulated in my
earlier message.

Jont Allen replied to the same section

> I think this question needs some clarification. If you beat two
> tones at 10 kHz, say beat 10 and 10.05 kHz tones, you will hear
> the 50 Hz beat. This is clearly due to "temporal processing"
> above 5 kHz.

I'll try to clarify: (neural) "temporal processing" above 5 kHz
is not needed for your example, because the mechanical filtering
and half-wave rectification of your 10 + 10.05 kHz tone gives a
strong 50 Hz component entering the auditory nerve. This 50 Hz
component can in fact be viewed as a demodulated envelope of the
original 10 + 10.05 kHz tone. The 50 Hz component is well within
regular neural bandwidth (we don't even need the "volley principle"
for that) and will most likely also be seen in the interspike
periodicities inside the auditory nerve, and it is well below the
3 to 5 kHz range I was attempting to formulate my hypothesis for.

In other words, I think the only "temporal processing" here is at
the 50 Hz beat rate, which place theory plus half-wave rectification
already accounts for: there is no need for (neural) "temporal
processing" above 5 kHz here.
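
The demodulation argument above can be checked numerically. What
follows is a minimal sketch, not a cochlear model: it assumes an
ideal half-wave rectifier (max(x, 0)) with no mechanical filtering,
and an arbitrary 48 kHz sample rate and 0.2 s duration chosen so
that every tone completes a whole number of cycles. The helper name
amplitude_at is hypothetical. It measures the 50 Hz component of the
10 + 10.05 kHz tone before and after rectification.

```python
import math

FS = 48_000                    # sample rate in Hz (assumed)
DURATION = 0.2                 # seconds; whole number of cycles for all tones
F1, F2 = 10_000.0, 10_050.0    # the two beating tones from the example


def amplitude_at(signal, freq, fs=FS):
    """Amplitude of the sinusoidal component at `freq` (single-bin DFT)."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq * i / fs) for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * i / fs) for i, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n


n_samples = round(FS * DURATION)
two_tone = [
    math.sin(2 * math.pi * F1 * i / FS) + math.sin(2 * math.pi * F2 * i / FS)
    for i in range(n_samples)
]

# Idealized half-wave rectification (an assumption, standing in for
# the hair-cell nonlinearity).
rectified = [max(x, 0.0) for x in two_tone]

beat = F2 - F1  # 50 Hz
linear_50 = amplitude_at(two_tone, beat)
rectified_50 = amplitude_at(rectified, beat)

print(f"50 Hz amplitude, plain sum:      {linear_50:.6f}")
print(f"50 Hz amplitude, half-rectified: {rectified_50:.6f}")
```

Under these assumptions the plain sum has essentially no energy at
50 Hz, while the rectified signal has a clear 50 Hz component (about
0.27 for unit-amplitude tones in the ideal continuous-time case). The
beat rate only becomes a real low-frequency component after the
nonlinearity.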

What I basically want to know is what the "volley principle"
nerve frequencies in the 3-5 kHz range bring us functionally,
if that helps to clarify what I am after.

Peter Cariani added

>  -- one can still make good octave judgments if the upper tone
> is at 3 kHz, but this becomes guesswork by the time one gets up
> to 5 kHz.

OK, I like this one. That *could* be a good argument to make
temporal processing up to 3 kHz plausible for explaining this
psychophysically observable effect, because one most likely
needs temporal periodicity information to obtain an absolute
reference for making an octave detectable as something "special".
(Within a filter bank an octave would not appear as special.)

I checked Brian Moore's Psychology of Hearing again on this,
and even found (p. 209, 4th edition) a remark that ``octave
matches largely disappear above 5 kHz, the frequency at which
neural synchrony no longer appears to operate.''

I did a few informal listening experiments on this myself, and
found that I was a lot less accurate in finding the octave
from 1500 to 3000 Hz than in finding the octave from 1000
to 2000 Hz, so I tend to think that octave matching largely starts
to break down somewhere between 2 kHz and 3 kHz. This is not a
scientific result, of course, just my informal subjective
impression. In other words, the question here becomes whether
octave matching breaks down in, say, the 2-3 kHz range or in the
3-5 kHz range (as Brian Moore seems to suggest). If it is in the
3-5 kHz range, that would indeed (probably) falsify my hypothesis,
but...

Another important question that would need an answer before I
grant that my layman hypothesis has been falsified:

   Was this octave matching up to 3-5 kHz done with (very) low
   intensity tones such that nonlinearities can be neglected?

   If not, then nonlinear effects generate a 2.5 kHz combination
   tone from a { 2500, 5000 Hz } pair, and temporal processing
   up to "only" 2.5 kHz may then account for everything! In that
   case I would tend to maintain my hypothesis that consequences
   of temporal processing are not psychophysically observable in
   the 3-5 kHz range. Even if one tone from the pair was presented
   after the other, one has to be careful that in the above example
   the 5000 Hz harmonic of the 2500 Hz tone is not matched against
   the "pure" 5000 Hz tone in a way that regular "place theory"
   could easily account for.
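
The two nonlinear effects mentioned above can be illustrated with a
toy model. This is only a sketch under loud assumptions: the ear's
nonlinearity is replaced by a memoryless quadratic term,
y = x + EPS * x**2, with an arbitrary EPS = 0.1, and the function
names are hypothetical. In the first case the upper tone is mistuned
to 5010 Hz (my choice, for illustration only) so that the difference
tone lands at 2510 Hz and can be told apart from the 2500 Hz primary.

```python
import math

FS = 48_000      # sample rate in Hz (assumed)
DURATION = 0.1   # seconds; whole number of cycles for every tone below
EPS = 0.1        # strength of the quadratic term (hypothetical)


def tone(freq, n, fs=FS):
    """Unit-amplitude sine tone of n samples."""
    return [math.sin(2 * math.pi * freq * i / fs) for i in range(n)]


def distort(signal, eps=EPS):
    """Toy memoryless nonlinearity: y = x + eps * x**2."""
    return [x + eps * x * x for x in signal]


def amplitude_at(signal, freq, fs=FS):
    """Amplitude of the sinusoidal component at `freq` (single-bin DFT)."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq * i / fs) for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * i / fs) for i, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n


n = round(FS * DURATION)

# Case 1: simultaneous pair. The quadratic term creates a difference
# tone at f2 - f1 = 2510 Hz that is absent from the linear sum.
pair = [a + b for a, b in zip(tone(2500.0, n), tone(5010.0, n))]
diff_linear = amplitude_at(pair, 2510.0)
diff_distorted = amplitude_at(distort(pair), 2510.0)

# Case 2: the 2500 Hz tone alone. The quadratic term adds a 5000 Hz
# harmonic that a "pure" 5000 Hz tone could be matched against.
low = tone(2500.0, n)
harm_linear = amplitude_at(low, 5000.0)
harm_distorted = amplitude_at(distort(low), 5000.0)

print(f"2510 Hz difference tone: linear {diff_linear:.4f}, distorted {diff_distorted:.4f}")
print(f"5000 Hz harmonic:        linear {harm_linear:.4f}, distorted {harm_distorted:.4f}")
```

In this toy model the difference tone has amplitude EPS and the
second harmonic has amplitude EPS / 2, while both are absent without
the quadratic term. How strong such components are in a real ear at
the intensities used in octave-matching experiments is exactly the
open question.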

> From 3kHz to 5 kHz the quality of timing information as
> well as tonality and frequency discrimination decline
> precipitously. At 3kHz there is considerable phase-locking;
> at 5 kHz it is much much weaker.

Is tonality above 3 kHz a quality that really requires an accuracy
that cannot be obtained/explained from place theory, possibly via
lower frequency combination tones?

> But why should the burden of proof be placed on just one
> putative coding scheme? What in your opinion is the
> unambiguous evidence in favor of some other (name your
> favorite) coding scheme in the 3-5 kHz range?

With the reformulated request I tend to maintain that the
place theory of hearing is sufficient to account for what
is observed psychophysically in the 3-5 kHz range, rather
than say that it is the only possible account. As a matter
of fact, temporal processing could in principle account for
everything, since it can encompass any type of filterbank.
However, if it really were that powerful, we would have no
need for a cochlea at all, and evolution would have had
little incentive to give us a cochlea. Moreover, the apparent
absence (?) of perceptual effects that should be quite easy to
detect via temporal processing makes the absence of significant
(neural) temporal processing in the 3-5 kHz range rather plausible
to me.

> 2. Phase locking and frequency discriminations covary.
> ...
> whereas rate place information shows the opposite trend,
> getting relatively better as frequency increases.

I don't understand this. One could have it both ways, depending
on how the cochlear filterbank is actually constructed. If it
were constructed to mimic Fourier analysis, it would even become
frequency independent. Do you imply that cochlear mechanical
filtering actually gives higher (relative?) accuracy at higher
frequencies? What do you mean by "relatively" in "relatively
better"? Also, covariation is a risky argument. In Holland
there is a clear covariation between stork population density
and local human family size, but most of us no longer conclude
that storks bring babies. Maybe I'm just missing your point...
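
To make the "both ways" remark about filterbanks concrete, here is a
small sketch comparing two idealized designs. The numbers (a fixed
100 Hz bandwidth, a quality factor of 10) and the function names are
hypothetical and not taken from any cochlear measurement. A
Fourier-like bank has the same absolute bandwidth at every center
frequency, so its relative resolution improves with frequency; a
constant-Q bank has the same relative resolution everywhere, so its
absolute resolution degrades with frequency.

```python
def constant_bandwidth_resolution(center_hz, bandwidth_hz=100.0):
    """Fourier-like bank: returns (absolute, relative) bandwidth."""
    return bandwidth_hz, bandwidth_hz / center_hz


def constant_q_resolution(center_hz, q=10.0):
    """Constant-Q bank: returns (absolute, relative) bandwidth."""
    bandwidth_hz = center_hz / q
    return bandwidth_hz, bandwidth_hz / center_hz


for f in (500.0, 1000.0, 2000.0, 4000.0):
    fb_abs, fb_rel = constant_bandwidth_resolution(f)
    cq_abs, cq_rel = constant_q_resolution(f)
    print(f"{f:6.0f} Hz  Fourier-like: {fb_abs:6.1f} Hz ({fb_rel:.3f})"
          f"   constant-Q: {cq_abs:6.1f} Hz ({cq_rel:.3f})")
```

Which of the two a rate-place code resembles depends on how cochlear
filter bandwidth actually scales with frequency, which is the point
of the question above.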

Let me emphasize that I greatly appreciated the comments given
by Peter Cariani and Jont Allen, as these certainly help me
deepen my understanding of the topic and its many pitfalls.
If it really turns out that neural temporal processing has significant
psychophysically observable effects up to 5 kHz, that would
be quite fascinating. Things like combination tones and harmonics
can rather easily fool us, though, since with nonlinear effects
we only need half the neural processing frequency (2.5 kHz) to
account for many psychophysically observable effects that would
at first sight *seem* to imply (neural) "temporal processing" up
to, say, 5 kHz.

Best wishes,

Peter Meijer

Soundscapes from The vOICe - Seeing with your Ears!
