
Re: Questioning the volley principle

Simple, rate-place representations of frequency don't work very well in
The image of auditory nerve fibers as narrowly tuned elements (high Q
only holds for low sound pressure levels and high-CF fibers (> 1 kHz).

At moderate to high sound pressure levels (> 60 dB SPL),
firing rates of auditory nerve fibers saturate -- even those of
low spontaneous rate and high threshold -- such that the
frequency response areas are very broad, typically an
octave or more.
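
To make the saturation argument concrete, here is a toy sketch (not a
fitted model of any real fiber). It assumes a symmetric V-shaped tuning
curve and a rate-level function that grows linearly above threshold and
then clips. Every number in it (CF_HZ, MAX_RATE, the 40 dB tip, the
60 dB/octave slopes, the 30 dB dynamic range) is a made-up illustrative
value. It only shows the qualitative point: once rates saturate, the
band of frequencies that drives the fiber strongly keeps widening with
level.

```python
import math

CF_HZ = 1000.0     # characteristic frequency (illustrative assumption)
MAX_RATE = 200.0   # saturated driven rate, spikes/s (illustrative assumption)


def threshold_db(freq_hz, tip_db=40.0, slope_db_per_oct=60.0):
    """Hypothetical V-shaped tuning curve: threshold (dB SPL) rises
    linearly with distance from CF measured in octaves."""
    return tip_db + slope_db_per_oct * abs(math.log2(freq_hz / CF_HZ))


def firing_rate(level_db, freq_hz, dyn_range_db=30.0):
    """Toy rate-level function: zero below threshold, linear growth over
    dyn_range_db above threshold, then clipped at MAX_RATE."""
    above = level_db - threshold_db(freq_hz)
    return MAX_RATE * min(1.0, max(0.0, above / dyn_range_db))


def response_width_octaves(level_db, criterion=0.5, steps_per_oct=200):
    """Width, in octaves, of the band where the rate reaches at least
    criterion * MAX_RATE, scanned on a log-frequency grid around CF."""
    hits = 0
    for i in range(-4 * steps_per_oct, 4 * steps_per_oct + 1):
        freq_hz = CF_HZ * 2.0 ** (i / steps_per_oct)
        if firing_rate(level_db, freq_hz) >= criterion * MAX_RATE:
            hits += 1
    return hits / steps_per_oct


if __name__ == "__main__":
    for level in (50, 60, 70, 80, 90):
        width = response_width_octaves(level)
        print(level, "dB SPL ->", round(width, 2), "octaves")
```

With these assumed parameters the half-maximum response area is empty
at low levels and grows steadily with SPL, so a rate profile that is
sharp near threshold carries little place information at higher levels.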

For an idea of what individual ANF rate-frequency curves look
like at these levels, see Jerzy Rose's 1971 figure (1.15), as
reprinted in Moore's An Introduction to the Psychology
of Hearing. This is a low-spontaneous-rate, high-threshold fiber,
but by 65 dB SPL its response area is an octave wide.

Similarly, population-wide rate profiles are very broad
at these levels for low and moderate frequencies
(< 2 kHz) -- see Kim & Molnar, J. Neurophysiology,
42(1): 1979.

Pure tone frequency discrimination is best around 1 kHz
(Weber fractions near 0.1%) and declines dramatically
as one goes higher in frequency. However, the sharpness of
neural tuning increases as one goes from 1 to 10 kHz.
All of the very highly tuned units that I have ever seen in the
physiological literature had CFs and BFs > 3 kHz.

Thus if one believed that the stimulus power spectrum is
represented in simple rate-place profiles in auditory maps,
one should expect frequency discrimination (and all sorts
of other auditory perceptual functions) to be best at
low SPLs (say 30-40 dB SPL) and for high frequencies
(> 5 kHz). One would also expect that perception of spectral
shape (vowel recognition) would be degraded at moderate
to high levels (60-100 dB SPL). Needless to say, these
general properties of rate-place representations don't
remotely agree, even qualitatively, with auditory perception.

In the best case scenario, in Siebert's pioneering decision-theoretic
analysis, where rate information is used in an optimal manner
by the auditory system, there is barely enough information to
account for human frequency discrimination. However, this model
also predicts that frequency discrimination (Weber fractions dF/F)
should improve with increasing frequency (> 2 kHz), whereas
just the opposite is the case. This and other models are discussed
in Bertrand Delgutte's review in the book Auditory Computation
(Springer, 1996). We should also note that Siebert's analysis was for
low SPLs in the linear, non-saturated range of ANF rate-level
functions -- it is unlikely, in my opinion, that there would be enough
information at moderate-to-high SPLs for a simple rate-place analysis.

Because of these disconnects between ANF sharpness of tuning and
frequency discrimination, both as a function of level and of frequency,
the theory of rate-place representations historically took two tacks to
deal with the dynamic range problem and the "hyperacuity" problem.

The first was to postulate that the auditory system uses different,
successively smaller subpopulations of ANFs as sound pressure levels
increase. The second was to postulate that the auditory system analyzes
the edges of rate-place profiles (and also that central lateral
inhibitory mechanisms could sharpen up these representations). Each of
these explanations has basic difficulties -- I think they may work for
isolated 2AFC comparisons, but it is hard for me to see how a 1 kHz tone
at 60 dB SPL can be accurately matched to another one at 80 dB SPL
using different subpopulations of neural elements or using edge
patterns. They might explain how discriminations could be made, but
they don't explain how (perceptual equivalence, matching) arises.

Representations based on phase-locking and interspike intervals are
needed in order to explain the precision and stability of auditory
percepts over very wide dynamic ranges. As Goldstein & Srulovicz (1979)
showed, use of interval information accounts for the decline of
frequency discrimination as frequencies increase from 1 to 10 kHz.
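
As a rough illustration of what an interval-based readout means (this
is my own toy sketch, not the Goldstein & Srulovicz model), the code
below generates spikes that are phase-locked to a tone, with random
cycle skipping and a fixed timing jitter, and then recovers the tone
frequency from the all-order interspike interval histogram. The firing
probability per cycle, the 30 microsecond jitter, the 50 microsecond
bins and the function names are all assumptions chosen for
illustration. Note that nothing in the estimate depends on firing rate
or on which fiber is active.

```python
import random


def phase_locked_spikes(freq_hz, duration_s=2.0, p_fire=0.5,
                        jitter_s=30e-6, seed=0):
    """Toy phase-locked train: on each stimulus cycle the fiber fires
    with probability p_fire, at the cycle onset plus Gaussian jitter."""
    rng = random.Random(seed)
    period = 1.0 / freq_hz
    spikes = []
    for k in range(int(duration_s * freq_hz)):
        if rng.random() < p_fire:
            spikes.append(k * period + rng.gauss(0.0, jitter_s))
    spikes.sort()
    return spikes


def all_order_intervals(spikes, max_interval_s=0.01):
    """Intervals between every pair of spikes (not just neighbours)
    that are no longer than max_interval_s."""
    intervals = []
    for i in range(len(spikes)):
        j = i + 1
        while j < len(spikes) and spikes[j] - spikes[i] <= max_interval_s:
            intervals.append(spikes[j] - spikes[i])
            j += 1
    return intervals


def estimate_frequency(spikes, bin_s=50e-6, max_interval_s=0.01):
    """Estimate the stimulus frequency from the interval histogram:
    locate the shortest-interval major peak, then average the
    intervals that fall within 25% of that coarse period."""
    intervals = all_order_intervals(spikes, max_interval_s)
    if not intervals:
        raise ValueError("need at least two spikes within max_interval_s")
    n_bins = int(round(max_interval_s / bin_s)) + 1
    hist = [0] * n_bins
    for d in intervals:
        hist[min(int(d / bin_s), n_bins - 1)] += 1
    peak = max(hist)
    # First bin reaching half the tallest peak marks the fundamental
    # period rather than one of its multiples (2T, 3T, ...).
    first = next(i for i, c in enumerate(hist) if c >= 0.5 * peak)
    coarse = (first + 0.5) * bin_s
    near = [d for d in intervals if abs(d - coarse) <= 0.25 * coarse]
    return len(near) / sum(near)   # 1 / mean interval


if __name__ == "__main__":
    spikes = phase_locked_spikes(1000.0)
    print("estimated frequency (Hz):", estimate_frequency(spikes))
```

In this sketch the timing jitter is fixed in absolute terms, so it
becomes a larger fraction of the period as the tone frequency rises.
That is one simple intuition, under these assumptions, for why an
interval code would lose precision toward higher frequencies.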

I think one way or another, we are stuck with an interval-based
representation of periodicities up to ~ 4-5 kHz and a much coarser
rate-based code for all
that dominates where the temporal code is weak or nonexistent (> 4-5
kHz).

--Peter Cariani

On Tuesday, November 12, 2002, at 10:43  AM, Matt Flax wrote:


I am not sure I see the need for the volley principle.

If the neurally transduced basilar/organ of Corti signals are
frequency specific, then surely a neural impulse cognitively implies a
frequency?
It is also known that neural firing rate (in afferents) is largely
determined by SPL or intensity. So a combination of frequency
selectivity (neuron selection) and firing rate determines sound
frequency and intensity.

Does anyone agree?

