Re: Questioning the volley principle (Peter Cariani)

Subject: Re: Questioning the volley principle
From:    Peter Cariani  <peter(at)EPL.MEEI.HARVARD.EDU>
Date:    Tue, 12 Nov 2002 13:29:53 -0500

Simple rate-place representations of frequency don't work very well in general. The image of auditory nerve fibers as narrowly tuned elements (high Q values) only holds for low sound pressure levels and high-CF fibers (> 1 kHz). At moderate to high sound pressure levels (> 60 dB SPL) firing rates of auditory nerve fibers saturate -- even those of low spontaneous rate and high threshold -- such that the frequency response areas are very broad, typically an octave or more. For an idea of what individual ANF rate-frequency curves look like at these levels, look at Jerzy Rose's 1971 figure (1.15), as reprinted in Moore's An Introduction to the Psychology of Hearing. This is a low-spont, high-threshold fiber, but by 65 dB SPL its response area is an octave wide. Similarly, population-wide rate profiles are very broad at these levels for low and moderate frequencies (< 2 kHz) -- see Kim & Molnar, J. Neurophysiology, 42(1): 1979.

Pure tone frequency discrimination is best around 1 kHz (Weber fractions near 0.1%, i.e. a just-noticeable difference of about 1 Hz) and declines dramatically as one goes higher in frequency. However, the sharpness of neural tuning increases as one goes from 1 to 10 kHz. All of the very highly tuned units that I have ever seen in the physiological literature had CFs and BFs > 3 kHz.

Thus if one believed that the stimulus power spectrum is represented in simple rate-place profiles in auditory maps, one should expect frequency discrimination (and all sorts of other auditory perceptual functions) to be best at low SPLs (say 30-40 dB SPL) and for high frequencies (> 5 kHz). One would also expect that perception of spectral shape (vowel recognition) would be degraded at moderate to high levels (60-100 dB SPL). Needless to say, these general properties of rate-place representations don't remotely agree, even qualitatively, with auditory perception.
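To make the saturation argument concrete, here is a minimal toy sketch of a single fiber with a V-shaped tuning curve and a saturating rate-level function. Every number in it (threshold at CF, tuning slope, spontaneous and maximum rates, sigmoid shape) is an illustrative assumption, not a value taken from Rose or from Kim & Molnar. It only shows the qualitative effect: once the rate saturates, the band of frequencies that drives the fiber to at least half its maximum driven rate widens quickly with level.

```python
import math

# All parameters below are hypothetical, chosen only for illustration.
CF_HZ = 1000.0                        # characteristic frequency of the toy fiber
THRESHOLD_AT_CF_DB = 20.0             # threshold at CF, in dB SPL
TUNING_SLOPE_DB_PER_OCT = 60.0        # symmetric V-shaped tuning curve
SPONT_RATE = 5.0                      # spontaneous rate, spikes/s
MAX_RATE = 200.0                      # saturated rate, spikes/s
HALF_POINT_ABOVE_THRESHOLD_DB = 15.0  # level above threshold giving half the driven rate
SIGMOID_SCALE_DB = 4.0                # steepness of the rate-level sigmoid


def firing_rate(level_db, freq_hz):
    """Saturating (sigmoidal) rate-level function applied to a V-shaped tuning curve."""
    octaves_from_cf = abs(math.log2(freq_hz / CF_HZ))
    threshold_db = THRESHOLD_AT_CF_DB + TUNING_SLOPE_DB_PER_OCT * octaves_from_cf
    drive_db = level_db - threshold_db - HALF_POINT_ABOVE_THRESHOLD_DB
    return SPONT_RATE + (MAX_RATE - SPONT_RATE) / (1.0 + math.exp(-drive_db / SIGMOID_SCALE_DB))


def response_area_octaves(level_db, steps_per_octave=200, span_octaves=3.0):
    """Approximate width (in octaves) of the band driving the fiber to >= half its driven rate."""
    half_rate = SPONT_RATE + 0.5 * (MAX_RATE - SPONT_RATE)
    n = int(span_octaves * steps_per_octave)
    count = 0
    for i in range(-n, n + 1):
        freq_hz = CF_HZ * 2.0 ** (i / steps_per_octave)
        if firing_rate(level_db, freq_hz) >= half_rate:
            count += 1
    return count / steps_per_octave


if __name__ == "__main__":
    for level_db in (30, 40, 50, 65, 80):
        print(f"{level_db} dB SPL: response area ~ {response_area_octaves(level_db):.2f} octaves")
```

With these assumed parameters the response area is a small fraction of an octave near threshold and about an octave wide by 65 dB SPL, which is the qualitative pattern described above. A real fiber's tuning curve is asymmetric and has a low-frequency tail, so this sketch, if anything, understates the broadening.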
In the best case scenario, in Siebert's pioneering decision-theoretic analysis, where rate information is used in an optimal manner by the auditory system, there is barely enough information to account for human frequency discrimination. However, this model also predicts that frequency discrimination (Weber fractions dF/F) should improve with increasing frequency (> 2 kHz), whereas just the opposite is the case. This and other models are discussed in Bertrand Delgutte's review in the Auditory Computation (1996) book, Springer. We should also note that Siebert's analysis was for low SPLs in the linear, non-saturated range of ANF rate-level functions -- it is unlikely in my opinion that there would be enough information at moderate-to-high SPLs for a simple rate-place analysis.

Because of these disconnects between ANF sharpness of tuning and frequency discrimination, both as a function of level and of frequency, the theory of rate-place representations historically took two tacks to deal with the dynamic range problem and the "hyperacuity" problem. The first was to postulate that the auditory system uses different, successively smaller subpopulations of ANFs as sound pressure levels increase. The second was to postulate that the auditory system analyzes the edges of rate-place activation profiles (and also that central lateral inhibitory mechanisms could sharpen up these representations).

Each of these explanations has basic difficulties -- I think they may work for isolated 2AFC comparisons, but it is hard for me to see how a 1 kHz tone at 60 dB SPL can be accurately matched to another one at 80 dB SPL using different subpopulations of neural elements or using edge patterns. They might explain how discriminations could be made, but they don't explain how similarity (perceptual equivalence, matching) arises.
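The matching problem can be illustrated with a toy simulation, offered as a sketch under stated assumptions rather than a physiological model. It generates phase-locked spike trains for a 1 kHz tone at two hypothetical firing probabilities per cycle (standing in for a softer and a louder tone), then compares a rate statistic with an interval statistic. The firing probabilities, the jitter, the bin width, and the function names are all invented for illustration, and refractoriness is ignored. The long simulated train stands in for pooling intervals over many fibers.

```python
import random
from collections import Counter


def phase_locked_spikes(freq_hz, p_fire_per_cycle, n_cycles, jitter_ms, rng):
    """Spike times (ms): at most one spike per stimulus cycle, at a fixed phase plus Gaussian jitter."""
    period_ms = 1000.0 / freq_hz
    times = []
    for cycle in range(n_cycles):
        if rng.random() < p_fire_per_cycle:
            times.append(cycle * period_ms + rng.gauss(0.0, jitter_ms))
    times.sort()
    return times


def most_common_interval_ms(spike_times_ms, bin_ms=0.1):
    """Centre of the most populated bin in the first-order interspike-interval histogram."""
    intervals = (b - a for a, b in zip(spike_times_ms, spike_times_ms[1:]))
    counts = Counter(round(iv / bin_ms) for iv in intervals)
    best_bin, _ = counts.most_common(1)[0]
    return best_bin * bin_ms


def mean_rate_per_s(spike_times_ms, n_cycles, freq_hz):
    """Average discharge rate over the whole simulated stimulus."""
    return len(spike_times_ms) / (n_cycles / freq_hz)


# Hypothetical settings: same 1 kHz tone, two different firing probabilities.
rng = random.Random(0)
N_CYCLES = 200_000
TONE_HZ = 1000.0
soft = phase_locked_spikes(TONE_HZ, 0.20, N_CYCLES, 0.05, rng)
loud = phase_locked_spikes(TONE_HZ, 0.30, N_CYCLES, 0.05, rng)

if __name__ == "__main__":
    for name, train in (("soft", soft), ("loud", loud)):
        print(name,
              "rate (spikes/s):", round(mean_rate_per_s(train, N_CYCLES, TONE_HZ), 1),
              "most common interval (ms):", round(most_common_interval_ms(train), 2))
```

In this toy, the rate changes with the assumed firing probability while the most common interval stays at the stimulus period, so the interval statistic gives the same answer for both tones. That is the kind of level-invariant quantity a matching judgment needs, and which a rate profile drawn from different subpopulations does not obviously supply.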
Representations based on phase-locking and interspike intervals are necessary in order to explain the precision and stability of auditory percepts over very wide dynamic ranges. As Goldstein & Srulovicz (1977) showed, use of interval information accounts for the decline of frequency discrimination as frequencies increase from 1-10 kHz. I think one way or another, we are stuck with an interval-based representation of periodicities up to ~ 4-5 kHz and a much coarser rate-based code for all frequencies that dominates where the temporal code is weak or nonexistent (> 4-5 kHz).

--Peter Cariani

On Tuesday, November 12, 2002, at 10:43 AM, Matt Flax wrote:

> Hello,
>
> I am not sure I see the need for the volley principal.
>
> If the neurally transduced basilar/organ of the Corti signals are
> frequency specific, then surely a neural impulse cognitively implies a
> frequency ?
> It is also known that neural firing rate (in afferents) is largely
> determined by SPL or intensity. So a combination of frequency
> selectivity (neuron selection) and firing rate determine sound frequency
> and intensity.
>
> Does anyone agree ?
>
> Matt
> --
>
>
> WSOLA TimeScale Audio Mod :
> FFTw C++ :
> Vector Bass :
> Multimedia Time Code :
>

This message came from the mail archive
maintained by:
DAn Ellis <>
Electrical Engineering Dept., Columbia University