Re: Origin of the Mel frequency scale equation?

Gunnar Fant long ago suggested the Technical Mel (TM) scale; it appears in
his book "Speech Sounds and Features" (MIT Press, 1974).

   He gave us TM = [1000/log(2)]*log[1+f/1000],

     Where the log is to the base 10 and f is in Hz.
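
For readers who want to evaluate the formula numerically, here is a minimal Python sketch of the TM equation exactly as written above. The function name fant_technical_mel is made up for illustration and does not come from Fant's book.

```python
import math


def fant_technical_mel(f_hz: float) -> float:
    """Technical Mel as quoted above: TM = [1000/log(2)] * log(1 + f/1000), f in Hz."""
    # Base-10 logs are used to match the text; because the formula is a ratio
    # of two logs, any other base would give the same result.
    return (1000.0 / math.log10(2.0)) * math.log10(1.0 + f_hz / 1000.0)


if __name__ == "__main__":
    # Print TM for a few frequencies in the speech range.
    for f in (0, 250, 500, 1000, 2000, 4000, 8000):
        print(f"{f:5d} Hz -> {fant_technical_mel(f):8.1f} TM")
```

Since log(1 + 1000/1000) = log(2), the formula maps 1000 Hz to 1000 TM by construction.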

This fits the Mel Scale of Stevens and Volkmann (1940) except at very high frequencies. See the figure in Fant's book.

I have developed a quadratic that fits the 1940 scale very accurately, and a modification of the Fant equation that also fits the 1940 scale over its entire range.

I have no idea where the Mel = C*log(1 + f/700) came from in the speech-engineering literature. With the appropriate choice of C, I expect it doesn't differ too much from Fant's TM over the ranges of interest to speech engineers.
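
One way to check that expectation is to tabulate both formulas side by side. The sketch below is only an illustration under a stated assumption: C is chosen so that 1000 Hz maps to 1000 mel, the same anchor point Fant's TM has. That choice gives C of roughly 2595 for base-10 logs, which is the constant commonly seen with this formula. The names fant_tm, mel_700 and C_ANCHORED are made up for this example.

```python
import math


def fant_tm(f_hz: float) -> float:
    """Fant's Technical Mel: [1000/log(2)] * log(1 + f/1000), f in Hz."""
    return (1000.0 / math.log10(2.0)) * math.log10(1.0 + f_hz / 1000.0)


def mel_700(f_hz: float, c: float) -> float:
    """The speech-engineering form: C * log10(1 + f/700), f in Hz."""
    return c * math.log10(1.0 + f_hz / 700.0)


# Assumption (not from the thread): pick C so that 1000 Hz -> 1000 mel.
C_ANCHORED = 1000.0 / math.log10(1.0 + 1000.0 / 700.0)


if __name__ == "__main__":
    print(f"C = {C_ANCHORED:.1f}")
    print("   f (Hz)   Fant TM   C*log(1+f/700)   difference")
    for f in (100, 250, 500, 1000, 2000, 4000, 8000):
        tm = fant_tm(f)
        m = mel_700(f, C_ANCHORED)
        print(f"{f:9d} {tm:9.1f} {m:16.1f} {m - tm:12.1f}")
```

With this normalization the two curves agree exactly at 0 Hz and 1000 Hz; the printed difference column shows how far they drift apart elsewhere.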

    Jim Miller

It might be:

Stevens, S., and J. Volkmann (1940).  The relation of pitch to
frequency:  A revised scale.  Am. J. Psychol. 53:329-353.


Christine Rankovic, PhD

Dear members of the list,

I am looking for the reference for the first use of the equation

m = C log(1+f/700)

known as the mel frequency scale transformation. Wikipedia says that the
scale was originated by Stevens, Volkmann and Newman in 1937 (J. Acoust.
Soc. Am. 8(3) 185--190), but the paper only has tabulated data and no
equation. The paper by S.B. Davis & P. Mermelstein (1980), "Comparison of
parametric representations for monosyllabic word recognition in
continuously spoken sentences", IEEE Trans. on ASSP 28, 357-366 is usually
cited in the speech recognition community as origin of MFCCs, but the
equation is absent there as well.




