
- *To*: AUDITORY@xxxxxxxxxxxxxxx
- *Subject*: Re: mfcc filters gain
- *From*: Toth Laszlo <tothl@xxxxxxxxxxxxxxx>
- *Date*: Wed, 3 Nov 2004 19:32:29 +0100
- *Comments*: To: Guillaume Lemaitre <lemaitre@IRCAM.FR>
- *Delivery-date*: Wed Nov 3 13:57:39 2004
- *In-reply-to*: <4189082B.2030800@ircam.fr>
- *References*: <4189082B.2030800@ircam.fr>
- *Reply-to*: Toth Laszlo <tothl@xxxxxxxxxxxxxxx>
- *Sender*: AUDITORY Research in Auditory Perception <AUDITORY@xxxxxxxxxxxxxxx>

On Wed, 3 Nov 2004, Guillaume Lemaitre wrote:

> In Malcolm Slaney's Matlab implementation of mel frequency cepstral
> coefficients, triangular filters are normalized "so that each filter has
> unit weight". I am wondering what this normalization corresponds to.

A very good question (this means that I was also wondering about it and could not find the answer...).

If you normalize, then the resulting values remain comparable across filters. If you don't, then the wider filters return larger values on average, simply because they sum over more spectral bins. So from a practical point of view, normalization might turn out to be useful in certain cases. Probably it is good for the subsequent cosine transform in the mfcc computation. But from a theoretical point of view, I think it is hardly explainable (???).

> I am also wondering if some work has already been done to improve
> mfcc-like processing. As is suggested in [1], Moore's ERB scale or the
> Bark scale seems to be more appropriate than the mel scale, and a
> gammatone filterbank should be much more accurate (even if probably more
> computationally expensive) than a triangular filterbank?

You will find quite a few different scales in the literature, and sometimes even several different formulas for the same scale. I have tried a couple of them, and never found a significant difference in the recognition results. In my sceptical opinion, there are much bigger inaccuracies in current speech recognition technology, so these little differences don't really matter.

Anyway, probably the most interesting idea in this field was when several authors tried to directly optimize the filters in order to achieve the best possible recognition. I have seen a couple of papers on this, but unfortunately don't have any references at hand...

Laszlo Toth
Hungarian Academy of Sciences
Research Group on Artificial Intelligence
e-mail: tothl@inf.u-szeged.hu
http://www.inf.u-szeged.hu/~tothl
"Failure only begins when you stop trying"
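The remark about wider filters returning larger values can be checked with a small numerical sketch. This is not Slaney's Matlab code: it is a minimal Python/NumPy illustration that assumes "unit weight" means the filter weights sum to one, and the helper `triangular_filter`, the 10 Hz frequency grid, and the three sets of filter edges are all made up for the example.

```python
import numpy as np

def triangular_filter(freqs, lo, centre, hi):
    """Triangle rising from 0 at `lo` to 1 at `centre`, falling to 0 at `hi`.

    Hypothetical helper for illustration only.
    """
    rising = (freqs - lo) / (centre - lo)
    falling = (hi - freqs) / (hi - centre)
    return np.clip(np.minimum(rising, falling), 0.0, None)

# Assumed frequency grid: 0 to 3990 Hz in 10 Hz steps.
freqs = np.arange(0.0, 4000.0, 10.0)

# Made-up (lo, centre, hi) edges in Hz; each filter is wider than the last,
# as happens on a mel-like scale.
edges = [(100, 200, 300), (1000, 1400, 1800), (2000, 2800, 3600)]

# Filters with a peak height of 1 (no normalization).
peak_one = np.array([triangular_filter(freqs, *e) for e in edges])

# "Unit weight" taken here to mean: each filter's weights sum to 1.
unit_weight = peak_one / peak_one.sum(axis=1, keepdims=True)

# A perfectly flat spectrum, so any difference comes from the filters alone.
flat_spectrum = np.ones_like(freqs)

raw_out = peak_one @ flat_spectrum      # grows with filter width
norm_out = unit_weight @ flat_spectrum  # the same for every filter

print("unnormalized:", raw_out)
print("unit weight: ", norm_out)
```

On a flat spectrum the unnormalized outputs scale with the filter width, while the normalized outputs are identical, which is the "comparable values" effect described above. Whether that is the right thing to do before the log and cosine transform is a separate question that this sketch does not settle.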

**References**: **mfcc filters gain**, *From:* Guillaume Lemaitre
