
Re: The natural spectrogram, Re: Gaussian vs uniform noise audibility

At 10:05 AM 1/27/2004, Eckard Blumschein wrote:
There are many variants of designing the windows and also many designs of
wavelets but there is only one physiological function of the inner ear and
only one corresponding natural spectrogram.
Yes, but I believe it is possible to configure and suitably process a
short-time Fourier transform (STFT) to approach this ideal.  What's wrong
with "one corresponding natural STFT"?   Something along these lines is
done in the best model of time-varying loudness perception I am aware of:

        AUTHOR = "Brian R. Glasberg and Brian C. J. Moore",
        TITLE = "A Model of Loudness Applicable to Time-Varying Sounds",
        JOURNAL = "Journal of the Audio Engineering Society",
        VOLUME = 50,
        NUMBER = 5,
        MONTH = "May",
        PAGES = "331--342",
        YEAR = 2002
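
As a rough illustration of what "configuring" an STFT could mean here, the
following is a minimal sketch, not the method of the paper cited above. It
assumes NumPy, and the band edges and window lengths are hypothetical values
picked only for the example: a long window is used for the low band (fine
frequency resolution) and a short window for the high band (fine time
resolution), each hopping by half its window length, with phase discarded.

```python
import numpy as np

def stft_magnitude(x, win_len, hop):
    """Magnitude of a Hann-windowed hopping STFT (phase is discarded)."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack([x[i * hop : i * hop + win_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))  # (n_frames, win_len // 2 + 1)

def multires_spectrum(x, fs, bands):
    """For each (f_lo, f_hi, win_len), keep the STFT bins inside [f_lo, f_hi)."""
    out = []
    for f_lo, f_hi, win_len in bands:
        mag = stft_magnitude(x, win_len, win_len // 2)  # hop = half window
        freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)
        keep = (freqs >= f_lo) & (freqs < f_hi)
        out.append((freqs[keep], mag[:, keep]))
    return out

fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 100 * t) + np.sin(2 * np.pi * 3000 * t)

# Hypothetical band layout: (lower edge in Hz, upper edge in Hz, window length)
bands = [(0, 500, 1024), (500, 8000, 128)]
result = multires_spectrum(x, fs, bands)

for freqs, mag in result:
    peak = freqs[np.argmax(mag.mean(axis=0))]
    print(f"{len(freqs)} bins, {mag.shape[0]} frames, peak near {peak:.1f} Hz")
```

The low band resolves the 100 Hz component to within one 15.625 Hz bin, while
the high band tracks the 3000 Hz component with frames only 8 ms long.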

At 10:05 AM 1/27/2004, Eckard Blumschein wrote:
At 09:13 27.01.2004 -0800, you wrote:
>Yes, a "sliding cosine transform" can be used in place of the usual
>"hopping short-time Fourier transform", and in that case, phase information
>is contained in the time variation of the sliding transform
>coefficients.  I didn't realize you were doing something like that,

I claim, you are doing the same, at least twice unconsciously in your inner
ears. I would however argue that neither magnitude-phase representation nor
time-frequency representation omits information, while the usual spectrogram
is a faulty design that strips off phase. In other words, phase information
is merely a fictitious component that belongs to an inappropriate model of
the inner ear. I do not see any justification for attributing it to the
actual real-valued analysis.

so my
>argument was based on different assumptions.  Even the short-time Fourier
>transform hopping by half its window length each frame can be stripped of
>all phase information and still be used as the basis of a convincing sound
>synthesis, at least for smoothly changing sounds.

Yes, this is what the usual spectrogram does. Short-time means acceptable
with respect to temporal resolution but too short to resolve low
frequencies. Do you not believe that the natural spectrogram overcomes such
discrepancy, too? It is distinguished by: "no arbitrary window and no

There are many variants of designing the windows and also many designs of
wavelets but there is only one physiological function of the inner ear and
only one corresponding natural spectrogram.


>At 03:08 AM 1/26/2004, Eckard Blumschein wrote:
>>At 12:06 23.01.2004 -0800, Julius Smith wrote:
>> >At 11:16 AM 1/23/2004, Eckard Blumschein wrote:
>> >>First of all, forget the wrong idea that the cochlea performs a complex
>> >>Fourier transform.
>> >
>> >This implies phase is discarded.
>>No! Do not consider me a moron. You and largely the rest of the world grew
>>up with the erroneous belief that there is no equivalent alternative to
>>complex spectral analysis. Complex calculus is indeed tremendously useful.
>>No matter whether one prefers magnitude and phase or real and imaginary
>>part, one always has to consider both constituents except for the case one
>>of them equals zero. Given, a function of time like 2A cos(omega t) does
>>not have any imaginary part at all. Entrance into the complex plane is paid
>>for by the mandatory, arbitrary omission of A exp(- i omega t) or A exp(i omega t).
>>Neither the magnitude A nor the phase omega t can be discarded.
>>At that point, you will object: Aren't anti-symmetrical functions, i.e.
>>functions of time with odd symmetry like sine, also needed in frequency
>>No again, on condition, causality has been taken into account. In brief:
>>Future signals cannot be analyzed yet. Even sin(omega t) can be continued
>>as its mirror into fictive future time like an even function. Of course,
>>this wouldn't hold for its derivative or antiderivative. However, our topic
>>is just frequency analysis within cochlea.
>> >However, phase information does exist as
>> >the phase of the basilar membrane vibration,...
>>I don't take amiss this fallacy. It has to do with the missing natural
>>justification for fixing any reference point on the time scale. Our ears
>>are not synchronized with anything. When Descartes introduced Cartesian
>>coordinates, he imagined a spatially infinite world. Time is
>>correspondingly believed to also extend from minus infinity to plus
>>infinity. However, elapsed time definitely ends at the 'NOW' being the only
>>clever choice for a natural time scale. Take subsequent snapshots of a
>>sinusoid at NOW each. Try the same with any cochlear pattern. By chance,
>>you might observe sin or cos. In other words, so called linear phase is
>>arbitrary as is time. I don't deny that delay or according phase difference
>>is reasonable with respect to a second signal or a different reference.
>>Without such reference, a sinusoidal function cannot be identified as
>>sin, cos or something complex in between, and the reference is lacking in
>>nature. The only natural reference is the NOW, which is steadily on the
>>move. This causes the trouble of permanently lagging window position in
>>case of arbitrarily centered complex Fourier transform.
>> >Since basilar membrane filtering is generally
>> >modeled as linear, any corresponding short-time-Fourier-transform would
>> >have to be complex to model basilar membrane filtering. Subsequent
>> >half-wave rectification does not eliminate all phase information,
>>An old specialist of power electronics like me cannot retrace how you
>>imagine rectification of a complex-valued function of time.
>>My wife is a teacher for adults. Perhaps she would more heedfully
>>anticipate what you and many others are feeling rather than thinking. I
>>will try and elucidate how engineers handle a similar case: Consider an
>>ideal sinusoidal voltage as a real input into a circuit that may also
>>contain a first (small) resistor and a reactance in series. Parallel to the
>>first resistor there are a diode and a much larger second impedance in
>>series. The voltage across the first resistor is a complex quantity with
>>respect to the source but pretty independent of the diode. However,
>>piecewise linear calculation requires treating the current through the
>>diode as a real one. In case of hearing, phase of the stimulus does not
>>matter since it anyway relates to an arbitrary reference.
>>As a rule, recognized experts like you tend to be cautious against
>>radically uncommon views. Therefore I would like to ask you: Look at
>>pattern of BM motion (e.g. T. Ren's) or of firing in the auditory nerve.
>>They do not resemble magnitude, to say nothing of phase. As far as I can
>>judge, they resemble the pattern of the natural (real-valued) spectrogram.
>>More in detail: Magnitude cannot account for the different patterns with
>>rarefaction vs. condensation clicks while positive and negative amplitudes
>>of the natural spectrogram clearly differ from each other.
>>In all, I didn't find any tenable argument in favor of complex cochlear
>>function. On the other hand, Fourier cosine transform, the natural
>>spectrogram and joint autocorrelation already resolved a lot of so far
>>poorly understood questions.
>>Incidentally, I recall a textbook denying any difference between time
>>domain and frequency domain. I do not fully share this opinion. In
>>particular, I consider it necessary to clearly distinguish between real
>>world and fictitious complex domain.
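
The purely mathematical point in the quoted argument, that a signal mirrored
about NOW like an even function has a real-valued (cosine-only) spectrum, can
be checked numerically. The following is a minimal sketch assuming NumPy; the
helper name even_extend, the sample rate, and the convention that index 0 is
NOW are assumptions made for the example. It says nothing either way about
how the cochlea works.

```python
import numpy as np

def even_extend(x):
    """Mirror x about its first sample so that y[k] == y[-k] (circularly)."""
    return np.concatenate([x, x[:0:-1]])

fs = 8000                    # hypothetical sample rate in Hz
n = np.arange(256)           # index 0 is taken to be NOW (an assumption)
x = np.sin(2 * np.pi * 440 * n / fs)

spec_plain = np.fft.fft(x)               # complex in general
spec_even = np.fft.fft(even_extend(x))   # real up to rounding error

max_imag_plain = np.max(np.abs(spec_plain.imag))
max_imag_even = np.max(np.abs(spec_even.imag))
print(max_imag_plain, max_imag_even)
```

The imaginary part of the mirrored signal's DFT vanishes to within floating
point error, so the DFT of the even extension reduces to a cosine transform,
whereas the DFT of the unmirrored sine segment has a large imaginary part.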
Julius O. Smith III <jos@ccrma.stanford.edu>
Assoc. Prof. of Music and (by courtesy) Electrical Engineering
CCRMA, Stanford University