Re: Frequency to Mel Formula
Your comments and questions are well taken. Stevens discarded the
"half-pitch" mel scale of 1937 completely, as I commented. Siegel
(1964 or 65) made the only published effort I know of to replicate
the 1937 experiment of Stevens, as mentioned by Richard Warren and as
described also in Footnote 4 of my 1997 paper, where I present
arguments against half-pitch judgments like those that you and others have raised.
Stevens replaced the 1937 scale with the 1940 mel scale, which was
based on experiments intended to divide frequency intervals into four
(4, not 2) equal perceptual intervals. That scale appears to have been
methodologically biased, as I commented. That scale is also the one
that was approximated by Fant and used thereafter, as Pierre says, by
the "speech science and technology community". I agree with Pierre
that they have been ill-advised to use it, but no doubling or halving
was involved in arriving at the 1940 scale.
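
To make the equisection idea concrete, here is a minimal sketch of what "four equal perceptual intervals" would look like if one assumes a closed-form scale. It is not the procedure Stevens used: his scale was built from listener settings, whereas this runs the other way, from an assumed formula to predicted settings. The formula is the logarithmic form commonly attributed to Fant, which I am assuming here since it is not quoted in this thread; the function names and the example frequencies are hypothetical.

```python
import math

def hz_to_mel(f_hz):
    # Assumed Fant-style approximation: 1000 Hz maps to 1000 mel.
    return 1000.0 * math.log2(1.0 + f_hz / 1000.0)

def mel_to_hz(m):
    # Algebraic inverse of hz_to_mel.
    return 1000.0 * (2.0 ** (m / 1000.0) - 1.0)

def equisection(f_low, f_high, parts=4):
    """Frequencies that split [f_low, f_high] into `parts` equal steps
    on the assumed mel scale (end points included)."""
    m_low, m_high = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (m_high - m_low) / parts
    return [mel_to_hz(m_low + k * step) for k in range(parts + 1)]

# Hypothetical end points, chosen only for illustration.
four_part = equisection(200.0, 6500.0, parts=4)

# Bisection is the two-part case: the "middle" tone between two anchors.
bisect_mid = equisection(40.0, 1000.0, parts=2)[1]

print([round(f, 1) for f in four_part])
print(round(bisect_mid, 1))
```

On any scale of this shape the predicted settings lie well below the arithmetic midpoints in Hz, and the 4-part and 2-part cases are the same computation with a different number of steps.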
In 1940 Stevens, as you have noted, did include a separate experiment
(reported in the same paper as the equisection data) asking subjects
to make what he called half-pitch judgments, but Stevens provided 40
Hz signals to the subjects to provide them with "zero" pitch
approximations. That seems to have converted those subject judgments
into bisection experiments, i.e. 40 Hz on one end, standard tone on
the other, with the subject setting the "middle" tone. In any case, when
Stevens provided the "zero point" you call for, the resulting pseudo
half-pitch data fit in with his 4-part equisection results, but (as he
says) they were not used to make the 1940 scale, which used 4-part
equisection data only.
Using narrow noise-band signals for equisection experiments
sounds like something that could be tried, whether or not the outcome
would be useful.
On 29 Jul, 2009, at 8:15 PM, Richard F. Lyon wrote:
Certainly the circular or helical aspect of pitch is crucial, in
many aspects of pitch perception. But there's also this one-
dimensional scale that can be valid in some contexts. I hadn't said
or known anything about this "half-pitch" concept, which would
certainly bring in the whole octave equivalence complication. But
is that what was used for the mel-scale tests and such? I didn't
think so. Rather, the idea was to subdivide intervals into
perceptually equal intervals ("equisection"). Of course, if the
intervals are like 2 octaves or such, or the subject is musically
savvy, that's going to bias the judgements based on the pitch
circularity. But if the signals are something like narrow noise
bands, maybe it would be possible to do the task while avoiding
those cues of "consonance" and such?
The "half pitch" idea presumes a well-defined, or well-perceived at
least, zero point, as well as a nonlinear mapping to try to get at.
Plus it puts the likely result right where the octave is, at least
for low frequencies. Did anyone actually use that approach?
Richard Warren and Snorre Farner say several studies did so; I'm
surprised; it seems like a bad idea. Wouldn't you almost always get
a result of half pitch equal to half frequency? Is that the
explanation for why the linear-to-log breakpoint ended up so high?
Or did they really do equisection of intervals defined by two
nonzero tone frequencies?
Stevens says they did both, but the curve he plots show only the