
- *To*: AUDITORY@xxxxxxxxxxxxxxx
- *Subject*: Re: The natural spectrogram, Re: Gaussian vs uniform noise audibility
- *From*: Julius Smith <jos@xxxxxxxxxxxxxxxxxx>
- *Date*: Thu, 29 Jan 2004 11:09:58 -0800
- *Comments*: To: Eckard Blumschein <Eckard.Blumschein@E-Technik.Uni-Magdeburg.DE>
- *Delivery-date*: Thu Jan 29 14:22:52 2004
- *In-reply-to*: <3.0.5.32.20040128104807.00ba47d8@dfnserv1.urz.uni-magdeburg.de>
- *References*: <3.0.5.32.20040127190530.00bad660@dfnserv1.urz.uni-magdeburg.de> <3.0.5.32.20040128104807.00ba47d8@dfnserv1.urz.uni-magdeburg.de>
- *Reply-to*: Julius Smith <jos@xxxxxxxxxxxxxxxxxx>
- *Sender*: AUDITORY Research in Auditory Perception <AUDITORY@xxxxxxxxxxxxxxx>

At 01:48 AM 1/28/2004, Eckard Blumschein wrote:

...So far I can neither imagine the STFT itself to be natural nor a spectrogram based on it. Wouldn't this require to naturally choose size of the window?

Yes -- and as a function of frequency. We normally call it a "multiresolution" STFT.
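As a rough illustration of a window size chosen as a function of frequency, the sketch below gives every analysis frequency a Hann window holding the same number of cycles. That constant-cycles rule, the helper names `window_length` and `analyze_bin`, and the parameter values are assumptions made for the example; a perceptually motivated rule would pick the lengths differently.

```python
import cmath
import math

def window_length(freq_hz, sample_rate_hz, cycles_per_window=8.0, max_len=4096):
    """Window length in samples that holds a fixed number of cycles of freq_hz."""
    return min(max_len, max(1, round(cycles_per_window * sample_rate_hz / freq_hz)))

def analyze_bin(x, center, freq_hz, sample_rate_hz):
    """Hann-windowed single-frequency DFT of x around sample `center`.

    The window length depends on freq_hz, so low frequencies get long
    windows and high frequencies get short ones.
    """
    n_win = window_length(freq_hz, sample_rate_hz)
    start = center - n_win // 2
    acc = 0j
    for n in range(n_win):
        i = start + n
        if 0 <= i < len(x):
            w = 0.5 - 0.5 * math.cos(2 * math.pi * n / n_win)
            acc += x[i] * w * cmath.exp(-2j * math.pi * freq_hz * i / sample_rate_hz)
    return acc / (n_win / 2)  # divide by the sum of the Hann window

fs = 8000.0
x = [math.sin(2 * math.pi * 1000.0 * n / fs) for n in range(2048)]
on_target = abs(analyze_bin(x, 1024, 1000.0, fs))   # analysis at the tone frequency
off_target = abs(analyze_bin(x, 1024, 2000.0, fs))  # analysis one octave above
print(window_length(100.0, fs), window_length(1000.0, fs), on_target, off_target)
```

With these settings a 100 Hz channel uses a window ten times longer than a 1000 Hz channel, which is the frequency-dependent time-frequency trade-off in miniature.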

Wouldn't one have to decide further arbitrary parameters like the degree of overlap?

This is just a sampling-rate issue. If computational cost is no object, one can simply choose maximum overlap (i.e., a "sliding FFT" instead of a "hopping FFT"). On the other hand, FFT filter banks can usually be downsampled quite a lot and still give equivalent end results. In this context, your window is your anti-aliasing filter for downsampling. Reference: Jont B. Allen, "Short Term Spectral Analysis, Synthesis, and Modification by Discrete Fourier Transform", IEEE ASSP-25(3).
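To make the sliding versus hopping distinction concrete, here is a minimal sketch in plain Python. The naive `dft` is used only to avoid dependencies, and the window length, hop sizes, and test signal are arbitrary assumptions. The constant overlap-add check at the end is one common way to confirm that a given hop is fine enough for a given window; it is not taken from the message itself.

```python
import cmath
import math

def hann(n_win):
    """Periodic Hann window of length n_win."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_win) for n in range(n_win)]

def dft(frame):
    """Naive O(N^2) DFT, used only to keep the sketch dependency-free."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def stft(x, window, hop):
    """Windowed DFT frames taken every `hop` samples (hop=1 is the sliding FFT)."""
    n_win = len(window)
    return [dft([x[start + n] * window[n] for n in range(n_win)])
            for start in range(0, len(x) - n_win + 1, hop)]

def overlap_add_of_window(window, hop, n_frames):
    """Sum of window copies shifted by `hop`; constant in the interior if COLA holds."""
    n_win = len(window)
    total = [0.0] * (hop * (n_frames - 1) + n_win)
    for m in range(n_frames):
        for n in range(n_win):
            total[m * hop + n] += window[n]
    return total

w = hann(8)
x = [math.sin(2 * math.pi * 0.1 * n) for n in range(64)]
sliding = stft(x, w, hop=1)  # maximum overlap: one frame per input sample
hopping = stft(x, w, hop=4)  # 50% overlap: each channel downsampled by 4
interior = overlap_add_of_window(w, 4, 10)[8:-8]
print(len(sliding), len(hopping), min(interior), max(interior))
```

The hopping version keeps about a quarter of the frames, yet the shifted Hann windows still sum to a constant, which is consistent with the claim that heavy downsampling can leave the end result unchanged.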

Doesn't any usual spectrogram incompletely represent the information?

STFTs are normally invertible, in my experience, even in the presence of aliasing due to downsampling (it gets canceled in the reconstruction). The classic spectrogram discards phase, so it is not exactly invertible. Of course, it is well known that phase can be reconstructed from STFT magnitude to a large extent for typical signals and analysis conditions.
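The sketch below illustrates both halves of this answer under simple assumptions: a Hann window of length 8, a hop of 4, a naive DFT, and weighted overlap-add resynthesis (all names are hypothetical helpers). It resynthesizes once from the complex STFT and once from magnitudes alone with zero phase. It does not attempt iterative phase estimation from magnitude.

```python
import cmath
import math

def hann(n_win):
    """Periodic Hann window of length n_win."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_win) for n in range(n_win)]

def dft(frame):
    """Naive O(N^2) forward DFT."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft_real(spec):
    """Naive inverse DFT, keeping only the real part."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def stft(x, window, hop):
    """Windowed DFT frames taken every `hop` samples."""
    n_win = len(window)
    return [dft([x[start + n] * window[n] for n in range(n_win)])
            for start in range(0, len(x) - n_win + 1, hop)]

def istft(frames, window, hop):
    """Overlap-add the inverse DFT of each frame, then divide by the summed window."""
    n_win = len(window)
    length = hop * (len(frames) - 1) + n_win
    out = [0.0] * length
    norm = [0.0] * length
    for m, spec in enumerate(frames):
        frame = idft_real(spec)
        for n in range(n_win):
            out[m * hop + n] += frame[n]
            norm[m * hop + n] += window[n]
    return [o / w if w > 1e-12 else 0.0 for o, w in zip(out, norm)]

window = hann(8)
hop = 4
x = [math.sin(2 * math.pi * 0.1 * n) for n in range(64)]
frames = stft(x, window, hop)

# Complex STFT: resynthesis matches the input (sample 0 is skipped because
# the periodic Hann window is exactly zero there).
y = istft(frames, window, hop)
roundtrip_error = max(abs(a - b) for a, b in zip(x[1:], y[1:]))

# Magnitude only (spectrogram-like): phase replaced by zero before resynthesis.
magnitude_only = [[abs(c) for c in spec] for spec in frames]
z = istft(magnitude_only, window, hop)
magnitude_error = max(abs(a - b) for a, b in zip(x[1:], z[1:]))
print(roundtrip_error, magnitude_error)
```

The first error sits at the level of floating-point rounding, while the second is of the order of the signal amplitude, which is why recovering a signal from a spectrogram requires a separate phase-reconstruction step.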

Isn't the usual spectrogram subject to the notorious trade-off between spectral and temporal resolution?

Well sure, but we can let the human ear tell us where to be on that trade-off.

Was there any physiological justification for STFT which could include the rectification? Is there close similarity to measurement of BM motion and neural pattern?

I don't understand the first question. My understanding of rectification is that it reflects how the hair cells respond to basilar membrane vibration: firing increases when the membrane moves one way, but not the other. The STFT implements a filter bank, and the output of that filter bank can be rectified accordingly (applied to real time-domain signals at the STFT filter-bank output, of course).
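A minimal sketch of that last point, assuming the usual filter-bank reading of the sliding STFT: the real time-domain output of channel k is the input convolved with the window modulated by a cosine at the channel's center frequency, and half-wave rectification keeps only its positive excursions. The names `channel_output` and `half_wave_rectify`, the window length, and the test tone are hypothetical choices for illustration, not a model of hair-cell transduction.

```python
import math

def hann(n_win):
    """Periodic Hann window of length n_win."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_win) for n in range(n_win)]

def channel_output(x, window, bin_index):
    """Real band-pass signal of one STFT channel.

    Equivalent to filtering x with the impulse response
    window[n] * cos(2*pi*bin_index*n / len(window)).
    """
    n_win = len(window)
    omega = 2 * math.pi * bin_index / n_win
    h = [window[n] * math.cos(omega * n) for n in range(n_win)]
    return [sum(h[j] * x[n - j] for j in range(n_win) if 0 <= n - j < len(x))
            for n in range(len(x))]

def half_wave_rectify(y):
    """Pass positive values unchanged and clamp negative values to zero."""
    return [max(v, 0.0) for v in y]

window = hann(16)
# Test tone placed at the center frequency of channel 2 (2/16 cycles per sample).
x = [math.sin(2 * math.pi * 2 * n / 16) for n in range(128)]
y = channel_output(x, window, bin_index=2)
r = half_wave_rectify(y)
print(min(y), max(y), min(r), max(r))
```

The rectified channel is nonzero only during one polarity of the band-pass output, mirroring the one-directional firing described above.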

I am sceptical in all of these and further details.

I suppose you're posting to the right list!

Julius

