Re: MFCC method

Actually, I do not see much logic in taking the Fourier transform
(FT) of a log-amplitude spectrum that has been warped to a (quasi-)
logarithmic frequency scale, as is done in MFCC. It is reasonable to
take the FT of a log-amplitude spectrum on a linear frequency scale
(standard cepstrum analysis) because that spectrum is often almost
periodic (at least for most naturally occurring periodic signals, whose
harmonics are evenly spaced in Hz). However, after a (quasi-)logarithmic
warping of the frequency axis, I would rarely expect the spectrum to be
periodic (the spacing between its peaks changes as the frequency
increases), and therefore I do not see the logic in trying to represent
it as a linear combination of sinusoids, as is done implicitly when
taking an FT.
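
To make the point concrete, here is a minimal sketch of what happens to
the harmonics of a periodic signal under such a warping. The 200 Hz
fundamental is a hypothetical value, and the mel formula used,
2595 * log10(1 + f/700), is only one common approximation that I am
assuming for illustration; other variants exist.

```python
import numpy as np

def hz_to_mel(f_hz):
    """One common mel approximation (an assumption; other variants exist)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

f0 = 200.0                            # hypothetical fundamental, in Hz
harmonics_hz = f0 * np.arange(1, 11)  # first ten harmonics
harmonics_mel = hz_to_mel(harmonics_hz)

# Distance between neighbouring harmonics on each axis.
spacing_hz = np.diff(harmonics_hz)
spacing_mel = np.diff(harmonics_mel)

print("spacing in Hz: ", spacing_hz)
print("spacing in mel:", np.round(spacing_mel, 1))
```

On the linear axis the spacing is constant (equal to f0), so the peaks
form a periodic pattern. On the warped axis the spacing keeps changing
from one harmonic to the next, so no single sinusoid along that axis
lines up with all of the peaks.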


On Fri, Jan 9, 2009 at 8:36 AM, Richard F. Lyon <DickLyon@xxxxxxx> wrote:
> I agree that the Greenwood function would be more logical, but as Lazlo Toth
> points out, there's not really enough difference to matter in cases that
> have been looked at.
> Don Greenwood gave me a collection of his papers, which I'm supposed to put
> on-line some place and summarize better in wikipedia, too.  Thanks for
> pointing out that article; I might link some paper copies there, and refer
> to it from the mel article.
> http://en.wikipedia.org/wiki/Greenwood_Function
> And I want to thank Malcolm again since I might not have made it clear that
> the reason the info appears in wikipedia is because he sent it here, and
> sent me copies of the papers.  He had the sense to convert his hearing
> library to digital form some years ago, so he can find things that I never
> can.
> Dick
> At 11:28 PM -0800 1/8/09, Arturo Camacho wrote:
>> Dear Dick,
>> The Wikipedia page that you mention says that the Mel scale
>> "approximates the human auditory system's response more closely than
>> the linearly-spaced frequency bands used in the normal cepstrum." If
>> that means that the Mel scale approximates better the tonotopic
>> response of the cochlea than the linear scale, I wonder if it would
>> not be an even better idea to use the Greenwood function (see entry in
>> Wikipedia), which was explicitly created with that purpose. (Recall
>> that the Mel scale was designed to represent equidistant steps in
>> pitch, but that does not necessarily correspond to equidistant
>> tonotopic steps.)
>> Regards,
>> Arturo
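
For anyone who wants to look at the mel/Greenwood difference discussed
above, here is a rough sketch that maps the same frequencies onto both
scales and normalises each to the range 0..1. The Greenwood constants
(A = 165.4, a = 2.1, k = 0.88, with x the proportional distance from the
apex) are the human values as commonly quoted, the mel formula is one
common approximation, and the 100 to 8000 Hz range is an arbitrary
choice of mine; all three are assumptions, not something taken from the
messages above.

```python
import numpy as np

# Greenwood function, human constants as commonly quoted (assumed here):
#   f = A * (10**(a * x) - k),  x = proportional distance from the apex (0..1)
A, a, k = 165.4, 2.1, 0.88

def hz_to_greenwood_position(f_hz):
    """Invert the Greenwood function: frequency in Hz -> cochlear position x."""
    return np.log10(np.asarray(f_hz, dtype=float) / A + k) / a

def hz_to_mel(f_hz):
    """One common mel approximation (an assumption; other variants exist)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def normalise(v):
    """Rescale so the first value maps to 0 and the last to 1."""
    return (v - v[0]) / (v[-1] - v[0])

freqs_hz = np.geomspace(100.0, 8000.0, 8)  # hypothetical analysis range
place = normalise(hz_to_greenwood_position(freqs_hz))
mel = normalise(hz_to_mel(freqs_hz))

for f, p, m in zip(freqs_hz, place, mel):
    print(f"{f:8.1f} Hz   greenwood={p:.3f}   mel={m:.3f}")
```

How far apart the two columns end up depends on the frequency range and
on the constants chosen, so this is only a starting point for a
comparison, not a conclusion.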


