
Re: MFCC method



On Fri, 9 Jan 2009, Richard F. Lyon wrote:

> and the observations of others that the principal components
> of a collection of vowel spectra on a warped frequency scale aren't
> so far from the cosine basis functions.

The warping of the frequency axis indeed invalidates the original
motivation of cepstrum calculation: the deconvolution of pitch and the
spectral envelope. Unfortunately, this is usually not emphasized in
textbooks. Furthermore, the conventional MFCC computation algorithm
contains a (weighted) summation of spectral bands, which pretty much does
the smoothing as well. So I think that what makes the cosine transform (or
FFT) step practically useful is that it approximates a principal component
analysis (as Dick Lyon said) -- and that it decorrelates the features.
This is important because the MFCC features are in most cases modelled by
Gaussians with diagonal covariance matrices.
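
To illustrate the decorrelation point, here is a rough sketch in Python
(NumPy only). It is not real speech data: the "log band energies" are
synthetic, drawn from a Gaussian whose covariance between bands i and j is
rho^|i-j|, which is merely an assumed stand-in for the strong correlation
between neighbouring mel bands. The band count, frame count, rho and the
helper names (dct_matrix, offdiag_fraction) are all illustrative choices.
The sketch compares how much of the covariance energy lies off the diagonal
before and after an orthonormal DCT-II.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (rows are basis functions)."""
    k = np.arange(n)[:, None]          # basis (cepstral) index
    m = np.arange(n)[None, :]          # band index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (m + 0.5) * k / n)
    c[0, :] = np.sqrt(1.0 / n)         # rescale the constant row
    return c

def offdiag_fraction(cov):
    """Fraction of the squared covariance entries that lie off the diagonal."""
    total = np.sum(cov ** 2)
    diag = np.sum(np.diag(cov) ** 2)
    return (total - diag) / total

# Assumed synthetic setup: neighbouring bands are strongly correlated.
rng = np.random.default_rng(0)
n_bands, n_frames, rho = 20, 5000, 0.9
idx = np.arange(n_bands)
true_cov = rho ** np.abs(idx[:, None] - idx[None, :])
log_energies = rng.multivariate_normal(np.zeros(n_bands), true_cov,
                                       size=n_frames)

# "Cepstral" features: DCT along the band axis of each frame.
C = dct_matrix(n_bands)
cepstra = log_energies @ C.T

before = offdiag_fraction(np.cov(log_energies, rowvar=False))
after = offdiag_fraction(np.cov(cepstra, rowvar=False))
print("off-diagonal fraction before DCT:", before)
print("off-diagonal fraction after DCT: ", after)
```

Under these assumptions the off-diagonal fraction should drop markedly
after the transform, which is the property that makes a diagonal-covariance
Gaussian a less crude model for the transformed features than for the raw
band energies.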

               Laszlo Toth
        Hungarian Academy of Sciences         *
  Research Group on Artificial Intelligence   *   "Failure only begins
     e-mail: tothl@xxxxxxxxxxxxxxx            *    when you stop trying"
     http://www.inf.u-szeged.hu/~tothl        *