
Re: Melodic consonance

Eckard Blumschein (>) wrote:
Alexandra Hettergot (>>) wrote:

> >I don't believe the CBW concept is essentially wrong (though I'd think other aspects (traditionally) apparent in music, e.g. timbral difference (i.e. due to both spectral and temporal structure), spatial distance, multichannel treatment, etc., are worth considering in that respect, too).

> The latter is exactly what I would like to suggest. There are people like
> me who consider virtual pitch a plausible result of neural principles
> rather than a "gestalt" phenomenon. The same reasoning provides a
> functional understanding of the neural basis for sensations like
> consonance/dissonance on the most basic level, and for more emotional
> judgements like pleasantness at a higher level.

I think perhaps the conception of "gestalt" phenomena has become
corrupted over the years to mean its opposite. Kohler and others thought
in terms of psychoneural isomorphisms, between analog, iconic,
relational patterns of excitation and the percepts. Their perceptual
gestalts were holistic patterns that were thought to be reflections and
simple transformations of global patterns of neural activity. Kohler and
others thought in terms of field-like, spatiotemporal ionic current
patterns in tissues, but these mechanisms were falsified by Lashley and
Sperry, who implanted pieces of metal in the cortex. The implants should
have disrupted these current patterns and hence distorted their
perceptual correlates, but they did not. In the last 40 years, an
associationistic connectionist model has dominated thinking, and the
"perceptual gestalt" has come to mean a global pattern that is assembled
or inferred from pieces (i.e. the sensory inputs are broken down into
perceptual atoms, and the atoms are then reassembled into unified objects
and wholes). Feature detectors and feature-bindings are bound up with these
assumptions. The Gestaltists, however, rejected this notion that the
system first breaks everything down into perceptual atoms and then
reconstructs wholes, and argued instead for relational representations
based on patterns (relational primitives). In this respect they have
commonalities with the Gibsonians in looking for iconic, analog patterns
and the means of extracting invariances from them. For the Gestaltists,
perceptual Gestalts were thought to be the experiential consequences of
the global features of these neural activity patterns.

In the context of summary autocorrelation and population-based
interspike interval-based models for pitch (e.g. Meddis & Hewitt,
Cariani & Delgutte), the representational primitives are interspike
intervals -- relations between events -- that form an iconic, analog,
representation that resembles the autocorrelation function of the
stimulus (taking into account cochlear filtering and the decline of
phase-locking with frequency and some temporal precedence effects in
high CF regions). The most common intervals in population-interval
distributions correspond to the pitches that are heard (in the few
exceptions, the pitch is an octave higher or lower). The pattern of
interval peaks in the distribution is a global feature of the activity
over the whole neural population, so in this case we have a perceptual
Gestalt that is based on spike-mediated relations, intervals, rather
than spatially-distributed currents (spike information is conveyed
axonally, so insertion of metal into tissues would likely not affect
such a system's functioning very much -- it would be possible to have
mechanisms based on spike correlations that had many of the same
properties as Kohler's fields).
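
As a rough illustration of the pooling described above, here is a minimal
Python sketch. It is a toy under stated assumptions, not the procedure used
in the cited studies: the three "fibers" fire once per cycle of the 3rd, 4th
and 5th harmonics of a 200 Hz fundamental, with no jitter and no missed
cycles, and the function names, the 50-microsecond bin width and the 15 ms
interval limit are all hypothetical choices made for the example.

```python
import numpy as np

def toy_phase_locked_trains(f0=200.0, harmonics=(3, 4, 5), dur=0.1):
    """Hypothetical fibers: one spike per cycle of each harmonic, no jitter."""
    trains = []
    for h in harmonics:
        f = f0 * h
        n_spikes = int(round(dur * f))
        trains.append(np.arange(n_spikes) / f)  # spike times in seconds
    return trains

def population_interval_histogram(spike_trains, bin_width=50e-6,
                                  max_interval=0.015):
    """Pool all-order interspike intervals over fibers.

    The fiber (CF) each interval came from is deliberately discarded.
    """
    n_bins = int(round(max_interval / bin_width)) + 1
    hist = np.zeros(n_bins, dtype=np.int64)
    for t in spike_trains:
        t = np.asarray(t, dtype=float)
        d = t[None, :] - t[:, None]               # every pairwise difference
        d = d[d > 0]                              # keep each interval once
        idx = np.rint(d / bin_width).astype(int)  # nearest-bin index
        idx = idx[idx < n_bins]                   # drop intervals past limit
        hist += np.bincount(idx, minlength=n_bins)
    return hist

if __name__ == "__main__":
    bin_width = 50e-6
    hist = population_interval_histogram(toy_phase_locked_trains(), bin_width)
    peak_ms = np.argmax(hist) * bin_width * 1e3
    print(f"most common pooled interval: {peak_ms:.2f} ms")
```

With these toy inputs the most common pooled interval is 5 ms, the period
of the 200 Hz fundamental, even though no individual fiber fires at 200 Hz.
Real auditory nerve data would add jitter, missed cycles and cochlear
filtering, none of which is modeled here.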

Rather than being a "pattern-completion" process, in which a "virtual"
pitch is inserted for the "missing fundamental," as it appears if we
look in the frequency domain, the process resembles the perception of
any other periodicity when seen in terms of autocorrelation-like
representations and operations.
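
To make the point concrete, here is a small Python sketch of the stimulus
side only. It assumes a plain (biased) autocorrelation of the waveform with
no cochlear filtering or neural stage, and the function names, sampling rate
and lag search range are hypothetical choices for the example. The complex
contains only harmonics 3, 4 and 5 of 200 Hz, so the spectrum has no
component at the fundamental.

```python
import numpy as np

def harmonic_complex(f0, harmonics, fs, dur):
    """Sum of equal-amplitude cosine harmonics of f0 (fundamental optional)."""
    t = np.arange(int(round(fs * dur))) / fs
    return sum(np.cos(2 * np.pi * f0 * h * t) for h in harmonics)

def autocorr_pitch(x, fs, min_lag_s=0.001, max_lag_s=0.02):
    """Return (pitch_hz, lag_samples) from the largest autocorrelation peak.

    Only lags between min_lag_s and max_lag_s are searched, which excludes
    the trivial maximum at zero lag.
    """
    n = len(x)
    r = np.correlate(x, x, mode="full")[n - 1:]  # lags 0 .. n-1
    lo = int(round(min_lag_s * fs))
    hi = int(round(max_lag_s * fs))
    lag = lo + int(np.argmax(r[lo:hi + 1]))
    return fs / lag, lag

if __name__ == "__main__":
    fs = 20000
    x = harmonic_complex(200.0, [3, 4, 5], fs, dur=0.1)
    pitch, lag = autocorr_pitch(x, fs)
    print(f"largest autocorrelation peak at lag {lag} samples -> {pitch:.1f} Hz")
```

The largest peak falls at a lag of 5 ms (200 Hz), the same place it would
fall if the fundamental were physically present, so in this representation
nothing needs to be "filled in".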

> >btw, what will "local resonances" cause if not spectral pitches (due to spectral peaks) ?
> Well, local resonance actually performs a coarse spectral analysis.
> However, look at the neural pattern drawn by Secker-Walker and Searle
> (1990). Temporal period seems to be much more precise than what Vercoe
> named spectral "blockvoting". Furthermore, spectral resolution within
> cochlea cannot account for the astonishing frequency resolution of hearing.
> This also indicates that frequency discrimination by ear is based on
> perception of period rather than frequency.
> BTW, I have to correct myself. The paper by Shamma is based on the same
> original data by Miller and Sachs. So it would be highly desirable to have
> at least a second set of such data.

There was a series of papers by Delgutte in JASA in the mid-80's based
on a different (cat) dataset that showed similar results. We also looked
at population-coding of vowels, and what one sees
is very much like the Miller & Sachs data -- there are broad swaths of
the auditory nerve array that are driven by the intense harmonics in a
formant region, so there may be 4-6 different time patterns in different
CF regions of the auditory nerve for a typical vowel. (I believe that
this is why only a few frequency channels are needed for speech
reception in quiet, e.g. Shannon's demonstration, and why implants work
as well as they do -- the channels themselves are not encoding
frequency, it is the temporal structure of the signals that they carry
that does this. If one believes this, then it becomes imperative to
raise the carrier frequencies of implant devices so that finer, higher
frequency temporal information can be conveyed by the system).

If you throw away all of the CF information, as Palmer did and we did,
to construct a population-interval distribution, then one has a very
nice, robust, precise representation of the formant structure in the
patterns of short intervals that are present. Timbre can also be encoded
purely temporally.

The rate-place patterns on the other hand are coarse, only allowing for
(poor) discriminations of individual components when they are separated
by large fractions of an octave (say 1/3 - 1/2 octave). These profiles
only worsen at higher levels. This looks bad for spectral pattern
mechanisms for pitch that are based on rate-place profiles. The picture
for purely spectral peaks and spectral pitch more or less resembles that
for pure tone discrimination above 5 kHz -- coarse, musically atonal,
and vulnerable to changes in level. So there is a purely place-based
pitch percept, but may be much weaker and more ill-defined than what is
commonly imagined. Strong, precise and level-invariant pitch percepts
that have musical tonality (chroma, can support a melody) are always, to
my knowledge, associated with an abundance of phase-locked spike information.

-- Peter Cariani

Peter Cariani, Ph.D.
Eaton Peabody Laboratory of Auditory Physiology
Massachusetts Eye & Ear Infirmary
243 Charles St., Boston, MA 02114 USA

tel@EPL (617) 573-4243
tel@MGH (617) 726-5419
FAX (617) 720-4408

Email peter@epl.meei.harvard.edu
Web: www.cariani.com