Re: MFCC method (Arturo Camacho)


Subject: Re: MFCC method
From:    Arturo Camacho  <acamacho@xxxxxxxx>
Date:    Fri, 9 Jan 2009 20:18:44 -0800
List-Archive:<http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

Actually, I do not find much logic behind taking the Fourier transform
(FT) of a log-amplitude spectrum transformed to a (quasi-) logarithmic
scale, as done in MFCC. It is reasonable to take the FT of a
log-amplitude spectrum in the linear frequency scale (standard cepstrum
analysis) because this spectrum is often almost periodic (at least for
most naturally occurring periodic signals). However, after a (quasi-)
logarithmic frequency scale transformation, I would rarely expect the
spectrum to be periodic (the spacing between its peaks will change as
the frequency increases), and therefore I do not find the logic behind
trying to represent it as a linear combination of sinusoids, as done
implicitly when taking an FT.

Arturo

On Fri, Jan 9, 2009 at 8:36 AM, Richard F. Lyon <DickLyon@xxxxxxxx> wrote:
> I agree that the Greenwood function would be more logical, but as Lazlo Toth
> points out, there's not really enough difference to matter in cases that
> have been looked at.
>
> Don Greenwood gave me a collection of his papers, which I'm supposed to put
> on-line some place and summarize better in wikipedia, too. Thanks for
> pointing out that article; I might link some paper copies there, and refer
> to it from the mel article.
> http://en.wikipedia.org/wiki/Greenwood_Function
>
> And I want to thank Malcolm again since I might not have made it clear that
> the reason the info appears in wikipedia is because he sent it here, and
> sent me copies of the papers. He had the sense to convert his hearing
> library to digital form some years ago, so he can find things that I never
> can.
>
> Dick
>
>
> At 11:28 PM -0800 1/8/09, Arturo Camacho wrote:
>>
>> Dear Dick,
>>
>> The Wikipedia page that you mention says that the Mel scale
>> "approximates the human auditory system's response more closely than
>> the linearly-spaced frequency bands used in the normal cepstrum." If
>> that means that the Mel scale approximates better the tonotopic
>> response of the cochlea than the linear scale, I wonder if it would
>> not be an even better idea to use the Greenwood function (see entry in
>> Wikipedia), which was explicitly created with that purpose. (Recall
>> that the Mel scale was designed to represent equidistant steps in
>> pitch, but that does not necessarily correspond with equidistant
>> tonotopic steps.)
>>
>> Regards,
>>
>> Arturo
>>
>>

--
__________________________________________________
Arturo Camacho, PhD
Alumni
Computer and Information Science and Engineering
University of Florida

E-mail: acamacho@xxxxxxxx
Web page: www.cise.ufl.edu/~acamacho
__________________________________________________
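
The periodicity argument in the message above can be checked numerically. The following is only a minimal sketch, not code from the thread: it assumes the commonly quoted mel formula 2595*log10(1 + f/700) and the human Greenwood parameters (A = 165.4, a = 2.1, k = 0.88, with position expressed as a fraction of basilar membrane length from the apex). The function names and the 200 Hz fundamental are hypothetical choices made for illustration. It compares the spacing of the harmonics of a periodic signal on the linear, mel, and Greenwood scales.

```python
import numpy as np

def hz_to_mel(f_hz):
    # Assumed mel formula (one common variant); not taken from the thread.
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def hz_to_greenwood_position(f_hz):
    # Inverse of the human Greenwood map f = 165.4 * (10**(2.1 * x) - 0.88),
    # where x is the fractional distance from the apex (0 to 1).
    return np.log10(np.asarray(f_hz, dtype=float) / 165.4 + 0.88) / 2.1

# Harmonics of a hypothetical 200 Hz periodic signal: 200, 400, ..., 4000 Hz.
f0 = 200.0
harmonics = f0 * np.arange(1, 21)

# Distance between neighbouring spectral peaks on each frequency axis.
spacing_hz = np.diff(harmonics)
spacing_mel = np.diff(hz_to_mel(harmonics))
spacing_pos = np.diff(hz_to_greenwood_position(harmonics))

print("linear spacing (Hz):     ", spacing_hz[:3], "...", spacing_hz[-3:])
print("mel spacing:             ", spacing_mel[:3].round(1), "...", spacing_mel[-3:].round(1))
print("Greenwood spacing (frac):", spacing_pos[:3].round(4), "...", spacing_pos[-3:].round(4))
```

On the linear axis the peaks are exactly f0 apart, which is the regularity the standard cepstrum exploits. On the mel and Greenwood axes the spacing shrinks steadily with frequency, so the warped spectrum is not periodic in the warped variable.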


This message came from the mail archive
http://www.auditory.org/postings/2009/
maintained by:
Dan Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University