
Re: Robust method of fundamental frequency estimation.

Arturo Camacho <acamacho@xxxxxxxxxxxx> wrote:

> > autocorrelation-based pitch models that can NOT be expressed in terms of the
> > spectrum.
> > For example, the Meddis & Hewitt or Meddis & O'Mard models, or
> > Slaney & Lyon models,
> > derived from Licklider's duplex theory, which do the ACF after what the
> > cochlea model does, which is a separation into filter channels and a
> If I am
> not wrong, what Slaney & Lyon's model does is to apply a summary
> autocorrelation to the output of a gammatone filterbank (it does some
> extra steps, but the main idea is that one). Since this can be shown to be
> equivalent to applying autocorrelation to the original signal (use
> Wiener-Khinchin theorem and linearity property of Fourier Transform),


You are wrong in your guess that applying a summary autocorrelation to the
output of a filterbank is equivalent to applying autocorrelation to the original
signal.
According to the theorem you mentioned but perhaps not understood,
autocorrelation corresponds to performing a cosine transform twice, i.e. back
and forth. A first cosine transform of a signal f_0(t) in the time domain yields
F_0(omega) in the frequency domain.
A second cosine transform, applied to the power spectrum |F_0(omega)|^2, yields
f_1(tau) back in the time domain.
These two steps together correspond to the autocorrelation function (ACF)
of the  o r i g i n a l  signal: f_0-->f_1(tau). Remember: the ACF corresponds
to two cosine transforms, a first one and an inverting second one.
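
The back-and-forth relation above can be checked numerically. What follows is
only a minimal sketch, not anybody's auditory model: it assumes a circular
(periodic) autocorrelation, uses a naive discrete Fourier transform in place of
the cosine transform, and takes the squared magnitude of the spectrum between
the two transforms, as in the usual statement of the Wiener-Khinchin theorem.
The function names and the two-harmonic test signal are made up for
illustration.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform; the inverse is scaled by 1/N."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def circular_acf_direct(x):
    """Circular autocorrelation computed straight from its definition."""
    n = len(x)
    return [sum(x[t] * x[(t + lag) % n] for t in range(n)) for lag in range(n)]


def circular_acf_via_spectrum(x):
    """Same quantity via the spectrum: transform, |.|^2, inverse transform."""
    power = [abs(c) ** 2 for c in dft(x)]
    return [c.real for c in dft(power, inverse=True)]


# Hypothetical test signal: a fundamental with a period of 8 samples
# plus a weaker second harmonic, over exactly 4 fundamental periods.
n = 32
signal = [
    math.sin(2 * math.pi * 4 * t / n) + 0.5 * math.sin(2 * math.pi * 8 * t / n)
    for t in range(n)
]

acf_direct = circular_acf_direct(signal)
acf_spectral = circular_acf_via_spectrum(signal)

# Largest disagreement between the two routes (rounding error only).
max_diff = max(abs(a - b) for a, b in zip(acf_direct, acf_spectral))

# Lag of the highest ACF value, excluding lag 0 and searching below N/2.
best_lag = max(range(1, n // 2), key=lambda lag: acf_direct[lag])
```

Here max_diff stays at rounding-error level, and best_lag lands on the period
of the fundamental in samples, which is the sense in which lag stands in for
frequency.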

Bogert and Tukey called that inverted spec_trum a ceps_trum, inverting the order
of letters in the syllable spec into ceps.

This f_1(tau) is perhaps what comes close to a major part of auditory function,
even if it is hard to abandon what we learned, namely that we hear frequencies,
and to admit that autocorrelation lag is largely equivalent to frequency.

The ACF of the spectrum F_0(omega) would correspond not to just two but to three
cosine transforms in series and eventually result in a function F_1 of omega.

The brain cannot directly process functions of omega. In the cat, there are
about 33,000 T-multipolar chopper neurons in the ventral cochlear nucleus (VCN).
T means they project directly to the inferior colliculus (IC) via the trapezoid
body (TB). They might translate place code into downsampled frequencies while
preserving tonotopy at the same time. At least they show very regular responses,
with a highly reproducible pattern of spike trains in which the interspike
intervals are all about the same length.
The firing frequencies of chopper neurons are on average about three times lower
than the average firing frequencies within single auditory nerve fibers, which
in turn tend to be considerably lower than the corresponding characteristic
frequency (CF) for CFs in excess of 500 Hz.

Eckard Blumschein