Re: MFCC method (Laszlo Toth)


Subject: Re: MFCC method
From:    Laszlo Toth  <tothl@xxxxxxxx>
Date:    Sat, 10 Jan 2009 15:50:59 +0100
List-Archive:<http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

On Fri, 9 Jan 2009, Richard F. Lyon wrote:
> and the observations of others that the principal components
> of a collection of vowel spectra on a warped frequency scale aren't
> so far from the cosine basis functions.

The warping of the frequency axis indeed invalidates the original motivation of cepstrum calculation: the deconvolution of pitch and the spectral envelope. Unfortunately, this is usually not emphasized in textbooks. Furthermore, the conventional MFCC computation algorithm contains a (weighted) summation of spectral bands, which pretty much does the smoothing as well.

So I think that what makes the cosine transform (or FFT) step practically useful is that it approximates a principal component analysis (as Dick Lyon said) -- and that it decorrelates the features. This is important because the MFCC features are in most cases modelled by Gaussians with diagonal covariance matrices.

Laszlo Toth
Hungarian Academy of Sciences
Research Group on Artificial Intelligence
e-mail: tothl@xxxxxxxx
http://www.inf.u-szeged.hu/~tothl
"Failure only begins when you stop trying"
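
The decorrelation point in the message above can be checked numerically on a toy model. The following is only a sketch under stated assumptions, not the MFCC algorithm itself: it assumes that the log filter-bank energies have a first-order Markov covariance (neighbouring bands correlated with coefficient rho), and the values of n_bands and rho, as well as the helper names dct2_matrix and offdiag_energy_fraction, are made up for illustration. It compares the covariance before and after an orthonormal DCT-II, and compares the leading principal components with the cosine basis functions.

```python
import numpy as np


def dct2_matrix(n):
    """Orthonormal DCT-II matrix; row k is the k-th cosine basis function."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (m + 0.5) * k / n)
    C[0, :] /= np.sqrt(2.0)  # rescale the constant row so that C @ C.T = I
    return C


def offdiag_energy_fraction(cov):
    """Fraction of the squared covariance entries that lie off the diagonal."""
    total = np.sum(cov ** 2)
    diag = np.sum(np.diag(cov) ** 2)
    return (total - diag) / total


n_bands = 24  # assumed number of mel filter-bank channels (illustrative)
rho = 0.9     # assumed correlation between neighbouring channels (illustrative)

# Toy covariance of the log band energies: first-order Markov model (assumption).
idx = np.arange(n_bands)
cov_bands = rho ** np.abs(idx[:, None] - idx[None, :])

# Covariance of the same features after the cosine transform.
C = dct2_matrix(n_bands)
cov_cepstra = C @ cov_bands @ C.T

# Principal components of the band covariance, largest variance first.
eigvals, eigvecs = np.linalg.eigh(cov_bands)
order = np.argsort(eigvals)[::-1]
pcs = eigvecs[:, order].T

# |cosine similarity| between the first few PCs and the first few DCT rows
# (both sets of vectors have unit norm, so the dot product is enough).
similarity = np.abs(np.sum(pcs[:4] * C[:4], axis=1))

before = offdiag_energy_fraction(cov_bands)
after = offdiag_energy_fraction(cov_cepstra)
print("off-diagonal energy fraction before DCT:", before)
print("off-diagonal energy fraction after DCT: ", after)
print("similarity of leading PCs to DCT rows:  ", similarity)
```

If the off-diagonal fraction drops sharply after the transform, a Gaussian with a diagonal covariance matrix loses little by ignoring the remaining correlations, which is the practical benefit described in the message. Real filter-bank covariances are of course not exactly first-order Markov, so the numbers here only illustrate the tendency.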


This message came from the mail archive
http://www.auditory.org/postings/2009/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University