Re: CBW, phase deafness, etc.

Dear Eckard,

I had imagined you, like Martin Luther, nailing your auditory theses on
the auditorium door, but little did I anticipate your moral wrath! (I'd
be interested in seeing the full set of theses). I'm not sure I
understand your moral/existential question or your parable. Perhaps it
would be more direct to simply state your beliefs and values and then
give whatever objections you have to particular research being done. I
am fully prepared to justify my work and the work of many others both
on scientific and moral grounds, if need be. The situation is far from
your allegory of an evil physician (or Nazi doctors for that matter)
sacrificing children, and is more commensurate with the moral questions
involved with eating meat (am I to assume that you are a vegetarian?).

But perhaps one should not ask a question unless one really wants an
answer. I'm not clear on your "late understanding" of the (nature of?)
the data -- were you unaware that these pictures of the neural activity
in the auditory nerve were compiled from experimental data? I am also
unclear as to the nature of your question concerning variation of the
phase of the fundamental -- in our experiments stimuli were numerically
synthesized and delivered via a D/A converter that also sent a sync
pulse to an event timer, so the stimuli are always the same in every
(measurable) respect from presentation to presentation. The data are
based on many repetitions, but the discharges of the fibers are
describable via the interaction of stimulus-driven deterministic
processes and stochastic ones, such that the PST histogram is a fair
representation of the probability of a fiber's firing at a given time
relative to the onset of the stimulus. As far as we know, the individual
discharges of different auditory nerve fibers innervating different
hair cells (e.g. with different CFs) occur independently of one
another, once one factors out the common, stimulus-driven
process. (As far as I know no one has seen intrinsic inter-fiber spike
correlations, e.g. in which, in the absence of a stimulus, a
"spontaneous" discharge in fiber A increases the probability of a
discharge in fiber B at some later time.) In each fiber, there are
short-range correlations that are due to recovery from the last action
potential, and some very weak, long range correlations that are seen in
pulse-count distributions (>>100 ms).
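
To make the PST histogram concrete, here is a minimal Python sketch (not
the analysis code used for the data discussed here). The function name,
the 1 ms bin width, and the spike times are all hypothetical. When the
bins are narrow enough that a fiber fires at most once per bin per
presentation, the per-bin average over presentations approximates the
probability of firing at that time relative to stimulus onset.

```python
def pst_histogram(trials, bin_width_ms, duration_ms):
    """Average spike count per bin across repeated presentations.

    trials: list of spike-time lists, in ms relative to stimulus onset
            (one list per presentation of the identical stimulus).
    """
    n_bins = int(round(duration_ms / bin_width_ms))
    counts = [0] * n_bins
    for spikes in trials:
        for t in spikes:
            b = int(t // bin_width_ms)
            if 0 <= b < n_bins:
                counts[b] += 1
    n_trials = len(trials)
    return [c / n_trials for c in counts]


# Hypothetical spike times (ms) from three presentations of one stimulus.
trials = [
    [1.2, 11.1, 21.3],
    [1.4, 21.0],
    [11.3, 21.2, 29.5],
]
pst = pst_histogram(trials, bin_width_ms=1.0, duration_ms=30.0)
```

With these made-up data the fiber fires in the 21-22 ms bin on every
presentation, so pst[21] is 1.0, while pst[1] and pst[11] are each 2/3.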

If this independence assumption is largely correct, as it appears to be,
then the ensemble of PSTs is also a fair picture of what is occurring
in the auditory nerve as a function of time. It is by far the most
detailed and precise account we have of the information that the brain
receives when an acoustic stimulus is presented to the ear. Having such
a picture of what is going on in the auditory nerve is indispensable if
we are to understand how we hear, what aspects of neural activity
translate into auditory percepts, and how to design prosthetic devices
that improve and restore hearing function. More of these neurograms need
to be constructed, displayed, and pondered.

Where do you come up with this assumption of phase-drift, if I
understand you correctly? The fibers faithfully register the phase
structure of the stimulus as it is presented to them after cochlear
filtering. The fine temporal structure of the stimulus is impressed on
the fine structure of the neural discharges, taking into account
cochlear filtering and the limits of phase-locking. The phase
information is there, and there are some situations where phase
transients can alter which patterns fuse together.

Regarding autocorrelation, which is not an "easy" or intuitive tool for
many (would that it were so! It would be so much easier, believe me, to
go along with the spectrographic, frequency-domain perspective, and you
have pointed out some of the deficiencies in that worldview), there are
a number of points that need to be kept in mind.

As far as objections to population-interval representations go, the
conclusions of Kaernbach and Demany should not be taken at face value,
and certainly not in their entirety. It's an interesting and worthwhile
paper, but one should read it critically.

1) The "autocorrelation" model that they knocked down was not a neural
model; it was a simple autocorrelation of the stimulus. Their model did
not compute a summary autocorrelation over all CFs, and it did not take
into account cochlear tuning, the broad, asymmetric nature of the tails
of tuning curves, the decline of phase locking with frequency, or
spontaneous activity. The population-interval representations that we
estimated from neural data and Meddis and Hewitt estimated from their
computer simulations take into account all of these factors, some of
which can play a significant role when particular stimuli are
considered. The model that they knocked down is not one that
anybody literally holds. It is true that the population-interval
distributions resemble stimulus autocorrelation functions in many
respects, especially for stimuli with components below 2 kHz, so I call
them "autocorrelation-like" representations, but there are still
differences between the autocorrelation function and the interval-based
representation (see below).
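
For readers unfamiliar with the term, a population-interval
distribution can be sketched in a few lines of Python. This is only an
illustration of the bookkeeping: it pools all-order interspike intervals
(between every pair of spikes in a train, not just successive ones)
across fibers. The function names and the spike times are hypothetical,
and none of the cochlear or neural factors listed above are modeled
here; they would enter through the spike trains themselves.

```python
def all_order_intervals(spikes, max_interval_ms):
    """Intervals between every pair of spikes in one sorted spike train."""
    intervals = []
    for i, t0 in enumerate(spikes):
        for t1 in spikes[i + 1:]:
            d = t1 - t0
            if d > max_interval_ms:
                break  # train is sorted, so later spikes are even farther
            intervals.append(d)
    return intervals


def population_interval_histogram(fibers, bin_width_ms, max_interval_ms):
    """Pool the all-order intervals of many fibers into one histogram."""
    n_bins = int(round(max_interval_ms / bin_width_ms))
    hist = [0] * n_bins
    for spikes in fibers:
        for d in all_order_intervals(sorted(spikes), max_interval_ms):
            b = int(d / bin_width_ms)
            if b < n_bins:
                hist[b] += 1
    return hist


# Two hypothetical fibers responding to a 100 Hz (10 ms period) stimulus.
fibers = [
    [0.0, 10.0, 20.0],
    [5.0, 10.0, 15.0, 25.0],
]
hist = population_interval_histogram(fibers, bin_width_ms=1.0,
                                     max_interval_ms=25.0)
peak_interval_ms = hist.index(max(hist))
```

For these made-up trains the largest peak falls at 10 ms, the stimulus
period, even though the second fiber also fires at intermediate times.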

2) The stimuli were harmonic complexes whose harmonics (F0=100 Hz) were
all above 5 kHz and were mixed with low-pass noise. The pitches produced
were weak without the intervening clicks. I made some high-pass click
trains with intervening clicks but without the LP noise, and the
intervening clicks do effectively mask the 100 Hz pitch of the HP click
train. Without the noise, the difference is like night and day. This is
a very valuable perceptual observation that K & D have made, that
intervening clicks mask out the pitch of the isochronous train if these
are high pass clicks. I also made click trains with harmonics of less
than 2 kHz, and in this case the intervening clicks do NOT effectively
mask the 100 Hz train.
A simple interpretation is that for the high-frequency harmonics, there
is a representation of the waveform envelope (based on first-order
intervals and modulation analysis), while for low frequencies, the
representation looks more like an autocorrelation (intervening clicks
don't disrupt the periodic pattern). K & D's demonstrations and
conclusions, right or wrong, apply to these pitches produced by high
harmonics, not to low ones (which yield the strongest pitches and by far
are most important for understanding speech and music).

3) K & D assumed that each of their clicks would give rise to a spike in
an auditory nerve fiber. It turns out that this may be an incorrect
assumption. I observed the responses of a few high CF auditory nerve
fibers to such stimuli. The fibers show plenty of 10 msec all-order
intervals when there are no intervening clicks, but do not show
prominent interval peaks at 10 msec when there are intervening clicks. I
believe that this is because when the intervening click comes just
before one in the isochronous pattern, large numbers of high CF fibers
reliably fire and are in refraction for the subsequent click. As a
result, the all-order interval distribution that is produced by such
stimuli is not what K & D supposed, and in this case, the all-order
interval distribution seems to follow the psychophysics.
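
The refractoriness argument can be illustrated with a deliberately
crude Python toy. It is not a model of the recorded fibers: the fiber
here fires deterministically on every click unless it is still
refractory, the 2 ms refractory time is an assumed value, and the
intervening click times are hand-picked to fall shortly before each
isochronous click (the case described above). All names are
hypothetical. Times are in integer microseconds to avoid rounding
issues.

```python
def respond(click_times_us, refractory_us):
    """Toy fiber: fires on each click unless refractory from its last spike."""
    spikes = []
    for t in sorted(click_times_us):
        if not spikes or t - spikes[-1] >= refractory_us:
            spikes.append(t)
    return spikes


def count_intervals(spikes, target_us, tol_us):
    """Count all-order intervals within tol_us of target_us."""
    return sum(
        1
        for i, a in enumerate(spikes)
        for b in spikes[i + 1:]
        if abs((b - a) - target_us) <= tol_us
    )


# Isochronous clicks every 10 ms (100 Hz), from 0 to 50 ms.
isochronous = [k * 10_000 for k in range(6)]
# Hand-picked intervening clicks, each shortly before an isochronous click.
intervening = [9_000, 18_500, 29_200, 38_300, 49_400]
refractory_us = 2_000  # assumed value, for illustration only

plain = respond(isochronous, refractory_us)
masked = respond(isochronous + intervening, refractory_us)

n_plain = count_intervals(plain, target_us=10_000, tol_us=250)
n_masked = count_intervals(masked, target_us=10_000, tol_us=250)
```

In this toy, n_plain is 5 and n_masked is 0: the fiber locks onto the
intervening clicks, misses the isochronous ones that follow, and the
10 ms all-order intervals disappear. Real fibers are stochastic and the
real stimuli were not hand-picked, so this only shows that the proposed
mechanism is capable of removing the 10 ms peak.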

4) So, what do we have here? The population-interval models assume that
some kind of analysis is performed on the population-interval
distribution that is the product of many prior processes (e.g. cochlear
filtering, transduction, synaptic transmission, spike initiation, and
possibly even the effects of efferents). For high-frequency harmonics,
all-order interval distributions reflect the shapes of envelopes rather
than the fine structure of the stimulus (which is what they reflect for
low-frequency harmonics).
This representation takes into account these differences, and thus
provides a "unified" explanation for pitches produced (or masked) by
both low and high harmonics. K & D presented an interesting,
provocative, and useful demonstration, but (I think) their
interpretation had faults. It's been somewhat of a surprise to me how
fast and easily people have taken their (in my opinion, overdrawn)
conclusions at face value.

5) In short, lower-frequency hearing has more autocorrelation-like
qualities (intervening clicks don't mask much; fine structure, not
envelope, matters; phase is largely irrelevant for pitch and timbre),
while high frequency hearing has more modulation-like qualities
(intervening clicks mask, envelope matters, phase can change envelope
shape and modify pitch).
High-frequency hearing looks a great deal like what we visualize the
situation in the electrically stimulated nerve to be: many fibers firing
at initial wavefronts and being together in refraction for subsequent
ones. The autocorrelation-like character of low-frequency hearing calls
periodicity representations based on modulation tuning into question:
intervening clicks, phase manipulations, and inharmonic tunings would
all be expected to disrupt representations based on first-order
intervals or modulation-tuned units. (I'd appreciate counterarguments
here; perhaps I am mistaken.)

6) There are interesting questions that concern how pitches created by
psychophysically resolved and unresolved harmonics relate to the
different means of generating interspike intervals (by means of
envelopes produced by interacting harmonics, by means of phase-locking
to the individual harmonics themselves). I have the impression that many
psychophysicists tacitly associate resolved harmonics with spectral
pattern mechanisms and unresolved harmonics with temporal ones
(following Schouten, perhaps, but not Licklider). There is a natural way
of making this distinction in interval-based theories: between intervals
that are produced by individual harmonics and those that are produced by
interacting harmonics. As one increases in absolute frequency above 2
kHz, phase-locking declines and intervals associated with envelopes
dominate. Likewise, as harmonic numbers increase, harmonic spacings
become smaller relative to tunings, their interactions prevail and
envelopes dominate. So, there can be a linkage between
psychophysically resolved/unresolved harmonics and different modes of
generating all-order interspike intervals. Whether one wants to call
these "two mechanisms", or rather the consequence of a "unified
representation" depends on one's perspective.

I do believe that the auditory system has a unified, general purpose,
phylogenetically-primitive means of representing sounds, be it some kind
of central spectrum or central autocorrelation or central periodicity
maps. Just as there is an anatomical bauplan, there may be a
neurocomputational bauplan -- basic strategies for representing and
processing information. It is easy to give up on looking for underlying
order, harder to actually find it. And it is always tempting to
proliferate special-purpose mechanisms for this or that little function,
and to pass the integration and coordination buck upwards to omniscient
central processors somewhere in the cortex.

-- Peter Cariani

Autocorrelation and population-interval distributions: similarities and differences

First, the population-interval representations which I am discussing are
"autocorrelation-like" in many respects, but they are not identical in
all respects to the autocorrelation function, being the product of
cochlear and neural processes. The ways in which they resemble stimulus
autocorrelations lie in the positions of major and minor interval peaks.
The ways in which they differ have to do with neural absolute and
relative refractory periods (no intervals less than 700 usec), cochlear
filtering and nonlinearities, and half-wave rectification (no negative
amplitudes). Many of the nonlinearities in cochlear and neural processes
manifest themselves in altering the relative sizes of interval peaks
(but not their positions), and in some cases the introduction of
additional (small) interval peaks associated with cochlear distortion
products (at 1/(2f1 - f2)). A representation system like this is very
well suited for estimation of frequency/periodicity over a wide dynamic
range -- the nonlinearities do not affect the positions of the interval
peaks on which those estimates are based. Thus whether and how cochlear
nonlinearities matter for some perceptual function depends crucially on
the nature of the neural representation involved in that function.
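
The point about peak positions can be checked numerically at the level
of the stimulus. The Python sketch below is an illustration only, not a
neural model: it takes a 200 Hz tone sampled at 10 kHz, applies
half-wave rectification followed by a square-root compression (an
arbitrary stand-in for a compressive nonlinearity), and compares where
the circular autocorrelation peaks before and after. The function names
and all parameter values are assumptions made for the example.

```python
import math


def circular_autocorrelation(x, lag):
    """Autocorrelation at one lag, treating x as one cycle of a periodic signal."""
    n = len(x)
    return sum(x[i] * x[(i + lag) % n] for i in range(n))


def peak_lag(x, min_lag, max_lag):
    """Lag (in samples) with the largest autocorrelation in [min_lag, max_lag]."""
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: circular_autocorrelation(x, lag))


fs_hz = 10_000                      # assumed sample rate
f_hz = 200                          # assumed tone frequency
period_samples = fs_hz // f_hz      # 50 samples = 5 ms
n = 4 * period_samples              # whole number of periods

tone = [math.sin(2 * math.pi * f_hz * i / fs_hz) for i in range(n)]
# Half-wave rectify (no negative amplitudes), then compress.
rectified = [math.sqrt(max(v, 0.0)) for v in tone]

lag_tone = peak_lag(tone, 10, 75)
lag_rectified = peak_lag(rectified, 10, 75)
```

Both searches return a lag of 50 samples (5 ms, i.e. 1/200 Hz). The
nonlinearity changes the shape and height of the autocorrelation, but
not the position of the peak at the period of the tone.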

Eckard Blumschein wrote:
> Dear Peter Cariani and List,
> Do we live up to our responsibility? I remember of the reason why a
> physician committed an incredible crime. He performed deadly experiments
> with children just because he intended to become a professor. Even more
> tragically, the girls and boys were sacrificed for nothing. The
> unscrupulous experiments were based on wrong assumptions. Nonetheless, the
> doctor falsified his identity and managed getting recognized for a while.
> Cats are quite different from humans. However, I am not sure whether or not
> I myself might sometimes be to blame for carelessness that could cost
> further lives of animals.
> In particular, I asked for more data concerning "block-voting".
> Fortunately, Peter Cariani outed himself. I have to apologize not just for
> not mentioning him but also for late understanding the data by Miller and
> Sachs, and, of course, the similar ones by Delgutte et al., too. Maybe, I
> am just not aware of awareness of others concerning some consequences of
> two peculiarities. My first suspicion has proven correct. The figures by
> Secker-Walker and Searle or by Shamma are somewhat misleading since they
> are based on many repetitions but possibly suggest a snapshot. My second
> suspicion is, phase of fundamental might have varied each time. In
> principle, it would be possible to check this by means of a synchronized
> stimulus. Referring to my initial remark, I would not consider this
> necessary.
> I see a lot of consequences. CBW or more naturally speaking the width of
> neural tuning curves depends on variance of phase and approximately amounts
> half a period (i.e. 1/2CF) for periods below refractory time, even if
> frequency resolution at inner hair cells (notice, I am avoiding the term
> basilar membrane) might be much higher. Deafness against phase also becomes
> understandable, etc.
> Finally for this time, I would like to briefly take issue against
> application of autocorrelation function. I know, this easy tool was favored
> not just by Peter Cariani. Possibly we both can agree. I realized him
> writing autocorrelation-like representations and operations. What about me,
> I go along with Kaernbach/Demany who provided psychoacoustical evidence
> against autocorrelation theories in JASA (1999), more strictly speaking
> against perception of all-order inter click interval (ICI). Please forgive
> me my heretical mistrust in general suitability of any available
> mathematical tool in case of hearing. Since Müller (1838), I see all
> efforts doomed to failure, so far. Instead, I imagine the neurons to
> preferably detect coincidence of lowest order. For instance, a first order
> ICI dominates over any second or higher order ICI. A key to many keys might
> hopefully be my suggestion that tonal perception is based on zero order
> ICIs. I uttered this idea for the first time this year in Oldenburg after I
> got aware that atonal perception across all CFs starts to become gradually
> amenable as soon as period exceeds refractory time. In that case, the
> normally dominating zero order tonotopic intervals are presumably getting
> increasingly corrupted.
> Sincerely,
> Eckard Blumschein