Re: Gunnar Fant's frequency map
Thank you so much for getting to the bottom of
this with a trip to the National Library.
It's really interesting that the first use of
the formula wasn't associated with the "mel"
scale per se. You wouldn't get that impression
from his 1959 paper (the one reprinted in the
1973 book), where he said:
"This formula, discussed in more detail earlier
(FANT, 1949), is a better mel approximation than
the Koenig scale which is exactly linear below
1,000 c/s and logarithmic above 1,000 c/s. The
significance of the mel scale for incremental
pitch judgments, masking, and intelligibility is
discussed by KOENIG (1949), MUNSON and GARDNER
(1950). Equal increments along the mel scale or
one of its technical approximations above
correspond closely to equal increments of
Did Fant comment on the Koenig scale in his 1949
report? Or did he not know about it yet?
Was the Koenig scale pitched as a "mel" scale?
Does anyone have a copy of the Koenig paper?
W. Koenig (1949). "A new frequency scale for
acoustic measurements". Bell Telephone Laboratory
Record 27: 299-301.
At 10:31 AM +0000 3/18/11, Arne Leijon wrote:
Some time ago Richard Lyon asked about the origin of
Prof. Gunnar Fant's "frequency-to-mel" scale
x(f)= k log( 1+f/f_0 ), with f_0=1000 Hz,
most easily accessible in Ch 3 of Fant (1973), page 48.
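
For readers who want to try the mapping numerically, here is a minimal Python sketch of x(f) = k log(1 + f/f_0) and its inverse. The text above only fixes f_0 = 1000 Hz; the default k = 1000/ln 2 (which makes 1000 Hz map to 1000 units) and the function names are assumptions made for illustration, not taken from Fant's report.

```python
import math

F0_HZ = 1000.0  # f_0 as stated in the text


def fant_scale(f_hz, k=1000.0 / math.log(2.0), f0_hz=F0_HZ):
    """Map frequency in Hz to x(f) = k * log(1 + f/f_0).

    The default k is an assumed normalisation so that x(f_0) = 1000.
    """
    return k * math.log1p(f_hz / f0_hz)


def fant_scale_inverse(x, k=1000.0 / math.log(2.0), f0_hz=F0_HZ):
    """Inverse mapping: f = f_0 * (exp(x/k) - 1)."""
    return f0_hz * math.expm1(x / k)


if __name__ == "__main__":
    for f in (250.0, 500.0, 1000.0, 2000.0, 4000.0):
        print(f"{f:7.0f} Hz -> x = {fant_scale(f):8.1f}")
```

With any k the scale is close to linear well below f_0 and close to logarithmic well above it; k only sets the units.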
Yesterday, I spent a couple of hours studying the original
lab report of Fant (1949) where he introduced and motivated this scale.
In this report, G Fant did not use the term "mel". It is quite clear that
he saw his scale as a cochlear map function, describing
the cochlear characteristic place of acoustic frequency components.
He discussed the scale mainly as a way to display speech power density spectra
in terms of power per length unit along the basilar membrane.
He motivated his choice of x(f) function with a
table of correction values in dB,
that would be needed to transform a power density spectrum measured with
constant bandwidths in Hz, into power per uniform steps along the x-scale.
In this table he presented these correction values as
L(f) = 10 log_10 ( BW(f) / BW( f_0) ), with f_0=1000 Hz,
using auditory bandwidth estimates BW(f) from four sources:
A: Difference Limens for frequency, ref Stevens and Davis (1938).
B: Critical Bandwidths, ref Fletcher (1940)
C: Bandwidths with equal intelligibility contributions, ref Beranek (1947)
D: Bandwidths with equal intelligibility
contributions, ref French & Steinberg (1947)
in comparison with the corresponding values derived from his proposed mapping.
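
As a rough illustration of what "the corresponding values derived from his proposed mapping" could look like: if a uniform step along x is taken to span a bandwidth proportional to df/dx = (f + f_0)/k, then BW(f)/BW(f_0) = (f + f_0)/(2 f_0) and k cancels. The Python sketch below evaluates L(f) under that assumption. This is one reading of the description, not a reproduction of Fant's table, and the function name is hypothetical.

```python
import math


def correction_db(f_hz, f0_hz=1000.0):
    """L(f) = 10 log10(BW(f)/BW(f_0)), assuming BW(f) is proportional to (f + f_0).

    That proportionality follows from df/dx for x(f) = k log(1 + f/f_0);
    it is an assumption about how the mapping-derived values were obtained.
    """
    return 10.0 * math.log10((f_hz + f0_hz) / (2.0 * f0_hz))


if __name__ == "__main__":
    for f in (250, 500, 1000, 2000, 4000):
        print(f"{f:5d} Hz  {correction_db(f):+6.2f} dB")
```

Under this reading the correction is 0 dB at f_0, approaches -3 dB toward low frequencies, and grows by about 3 dB per octave well above f_0.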
Fant (1949) interpreted the sources A-D just as different methods to estimate
the same thing, namely, the cochlear map. He did
not discuss any use of his proposed scale x(f) as
a "Numerical Scale of Pitch" (in mels), as
suggested by Stevens & Davis (1938) in the pages
Fant did not refer to.
Fant also noted as an advantage of the scale,
that the vowel formant bandwidths are roughly equal
if measured in units of the proposed scale.
Fant, G. (1973). Acoustic description and
classification of phonetic units, chapter 3,
pages 32-83. MIT Press, Cambridge, MA.
Fant, C. G. M. (1949). Analys av de svenska
konsonantljuden. Technical Report protokoll H/P
1064, LM Ericsson.
citing among others:
Stevens, S. and Davis, H. (1938). Hearing. New
York, pp. 94-99 and pp. 127-130.
Fletcher, H. (1940). Auditory patterns. Rev Mod Phys, 12, pp. 47-65.
Beranek, L. L. (1947). The design of speech
communication systems. Proceedings of the
French, N. and Steinberg, J. (1947). Factors
governing the intelligibility of speech sounds.
J Acoust Soc Amer, 19(1):90-119.