Re: Origin of the Mel frequency scale equation?
Gunnar Fant long ago suggested the Technical Mel (TM) scale; it is in
his 1974 book "Speech Sounds and Features" (MIT Press).
He gave us TM = [1000/log(2)]*log[1 + f/1000],
where the log is to the base 10 and f is in Hz.
This fits the Mel scale of Stevens and Volkmann (1940) except at
the very high frequencies. See the figure in Fant's book.
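For anyone who wants to evaluate Fant's formula numerically, here is a minimal Python sketch. The function name `fant_technical_mel` and the sample frequencies are illustrative choices, not anything from Fant's book. Because the formula is a ratio of two logarithms, the result does not depend on the base, so it is equivalent to 1000*log2(1 + f/1000).

```python
import math


def fant_technical_mel(f_hz: float) -> float:
    """Fant's technical mel: TM = [1000/log(2)] * log(1 + f/1000), f in Hz."""
    if f_hz < 0:
        raise ValueError("frequency must be non-negative")
    # Base-10 logs as in the formula above; any base gives the same ratio.
    return 1000.0 / math.log10(2.0) * math.log10(1.0 + f_hz / 1000.0)


if __name__ == "__main__":
    # Sample frequencies chosen only for illustration.
    for f in (0, 250, 500, 1000, 2000, 4000, 8000):
        print(f"{f:5d} Hz -> {fant_technical_mel(f):7.1f} TM")
```

By construction, 1000 Hz maps to 1000 TM, and every doubling of (1 + f/1000) adds another 1000 TM.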
I have developed a quadratic that fits the 1940 scale very
accurately and a modification of the Fant equation that also fits the
1940 scale over its entire range.
I have no idea where the Mel = C*log(1 + f/700) came from in
the speech-engineering literature. With the appropriate choice of
C, I expect it doesn't differ too much from Fant's TM over the ranges
of interest to speech engineers.
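One way to check that expectation is to tabulate both formulas side by side. The sketch below is only an illustration under one assumption: C is chosen so that 1000 Hz maps to 1000 mel, which with base-10 logs gives C of about 2595. Other choices of C are possible and would shift the comparison. The names `log700_mel` and `C_1000` are made up for this example.

```python
import math

# Assumption: pick C so that 1000 Hz -> 1000 mel (base-10 log); about 2595.
C_1000 = 1000.0 / math.log10(1.0 + 1000.0 / 700.0)


def fant_technical_mel(f_hz: float) -> float:
    """Fant's technical mel: TM = [1000/log(2)] * log(1 + f/1000)."""
    return 1000.0 / math.log10(2.0) * math.log10(1.0 + f_hz / 1000.0)


def log700_mel(f_hz: float, c: float = C_1000) -> float:
    """The speech-engineering form: Mel = C * log(1 + f/700)."""
    return c * math.log10(1.0 + f_hz / 700.0)


if __name__ == "__main__":
    print("   f (Hz)  Fant TM  C*log(1+f/700)  difference")
    # Sample frequencies chosen only for illustration.
    for f in (100, 250, 500, 1000, 2000, 4000, 8000):
        tm = fant_technical_mel(f)
        m700 = log700_mel(f)
        print(f"{f:9d} {tm:8.1f} {m700:15.1f} {m700 - tm:11.1f}")
```

With this choice of C the two curves agree exactly at 0 Hz and 1000 Hz; the printed difference column shows how far they drift apart elsewhere.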
Quoting Christine Rankovic <rankovic@xxxxxxxxxxxxxxxx>:
It might be:
Stevens, S., and J. Volkmann (1940). The relation of pitch to
frequency: A revised scale. Am. J. Psychol. 53:329-353.
Christine Rankovic, PhD
----- Original Message ----- From: "Arturo Camacho" <acamacho@xxxxxxxxxxxx>
Sent: Monday, March 10, 2008 1:09 AM
Subject: Origin of the Mel frequency scale equation?
Dear members of the list,
I am looking for the reference of first use of the equation
m = C log(1+f/700)
known as the mel frequency scale transformation. Wikipedia says that the
scale was originated by Stevens, Volkmann and Newman in 1937 (J. Acoust.
Soc. Am 8(3) 185--190), but the paper only has tabulated data and no
equation. The paper by S.B. Davis & P. Mermelstein (1980), "Comparison of
parametric representations for monosyllabic word recognition in
continuously spoken sentences", IEEE Trans. on ASSP 28, 357-366 is usually
cited in the speech recognition community as the origin of MFCCs, but the
equation is absent there as well.
Arturo Camacho, PhD
Computer and Information Science and Engineering
University of Florida
Web page: www.cise.ufl.edu/~acamacho