Re: Terhardt's theory and the tritone paradox

Bruno Repp writes:

> What is not clear to me, however, is how
> the auditory system could get shaped by exposure to the multitude of
> speech sounds generated by vocal tracts of all sizes. The idea seems
> appealing, but the mechanism is quite opaque.

The spectral dominance region is presumably determined by the spectral
characteristics of environmental sounds -- in particular, of speech sounds. In a
Gibsonian or ecological approach, the whole speech-hearing system is integrated,
tuned to itself and to the natural environment by a combination of (social)
evolution and (social) life experience. As a result, the auditory system "knows"
that certain frequency ranges are more likely to provide useful information
about sound sources in the environment, or about the meaning of speech, than
others.

Mechanism? A neural-net model of spectral dominance would presumably involve
units that are tuned to given spectral frequencies. The more you excite a given
synapse during exposure to speech, the stronger the connection, and the higher
the corresponding spectral frequency weight.
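
To make the idea concrete, here is a minimal sketch of such an exposure rule.
It is only an illustration, not part of Terhardt's model: the band centers, the
toy "speech" spectra, the learning rate, and the function name expose() are all
assumptions made up for this example.

```python
# Toy exposure rule: each unit is tuned to one frequency band, and its weight
# grows in proportion to how strongly that band is excited.
# All names and numbers here are hypothetical.

def expose(weights, spectra, rate=0.01):
    """Return normalized weights after exposure to a list of spectra."""
    updated = list(weights)
    for spectrum in spectra:
        for i, excitation in enumerate(spectrum):
            updated[i] += rate * excitation  # more excitation, stronger link
    total = sum(updated)
    return [w / total for w in updated]      # weights sum to 1

BAND_CENTERS_HZ = [125, 250, 500, 1000, 2000, 4000]  # assumed unit tunings

# Made-up excitation patterns, one value per band, standing in for speech.
toy_speech = [
    [0.1, 0.4, 0.9, 0.8, 0.3, 0.1],
    [0.2, 0.5, 1.0, 0.6, 0.2, 0.0],
    [0.0, 0.3, 0.8, 0.9, 0.4, 0.1],
]

initial = [1.0 / len(BAND_CENTERS_HZ)] * len(BAND_CENTERS_HZ)
learned = expose(initial, toy_speech * 100)
dominant_band_hz = BAND_CENTERS_HZ[learned.index(max(learned))]
```

After exposure, the band that was excited most often carries the largest
weight, which is all that "spectral dominance" means in this toy version.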

> And until this mechanism
> is clarified, the proposed explanation is merely a description pushed
> up one level (i.e., from the actual behavior to the hypothetical spectral
> weighting function).

I think the point here is that the same algorithm, including the spectral
weighting function, can explain a wide range of different pitch effects -- pitch
of pure tones, harmonic complex tones, residue tones, bells, chords, etc. (see
TSS82a for details). It's no mean feat to create a single algorithm that
approximately fits a wide range of data.

>  Richard also neglects to mention that, in my study, I found strong
> effects of spectral envelope for many listeners. This result is indeed
> atypical compared to what Deutsch and Pollack reported, and also with
> respect to Terhardt's model, but it is a finding in need of explanation.
> Apparently, spectral envelope effects do surface under certain conditions,
> and the Terhardt model does not allow for that.

Deutsch used two spectral envelopes in her study that were exact spectral
transpositions of each other, differing by an interval of 6 semitones. If you
plug all 12 OCTs for each of Deutsch's envelopes into TSS82b, the two calculated
pitch distributions differ in average ("centroid") pitch by about 2 semitones.
The result is less than 6, because the spectral dominance region in the model is
fixed. (The center of the calculated pitch distribution is still
"approximately 300 Hz" in both cases.)
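
The reason a fixed dominance region compresses the shift can be sketched with a
toy calculation. This is not TSS82b: it replaces the algorithm with a simple
weighted mean of component pitches, and the bell shapes, their widths, and the
center values below are assumptions chosen only for illustration.

```python
import math

# Pitches are in MIDI-style semitones. All constants are hypothetical.
DOMINANCE_CENTER = 62.0  # roughly 300 Hz
DOMINANCE_WIDTH = 12.0
ENVELOPE_WIDTH = 12.0

def bell(x, center, width):
    """Gaussian-shaped weight over a semitone scale."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)

def oct_centroid(pitch_class, envelope_center):
    """Weighted mean pitch of one tone with octave-spaced components."""
    components = [pitch_class + 12 * k for k in range(11)]
    weights = [
        bell(s, envelope_center, ENVELOPE_WIDTH)       # movable envelope
        * bell(s, DOMINANCE_CENTER, DOMINANCE_WIDTH)   # fixed dominance region
        for s in components
    ]
    return sum(w * s for w, s in zip(weights, components)) / sum(weights)

def mean_centroid(envelope_center):
    """Average centroid over all 12 pitch classes."""
    return sum(oct_centroid(pc, envelope_center) for pc in range(12)) / 12

# Two envelopes that are transpositions of each other by 6 semitones.
shift = mean_centroid(66.0) - mean_centroid(60.0)
```

Because the dominance weighting stays put while the envelope moves, the
centroid moves by only part of the 6-semitone transposition. With the equal
widths assumed here the shift comes out at half the transposition; the exact
fraction depends on the assumed widths and is not meant to reproduce the figure
of about 2 semitones obtained from the model.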

> Mark Pitt (in a personal
> communication related to this exchange) has suggested that the effect may
> be related to the inability of musically untrained listeners to separate
> pitch and timbre--again only in certain situations.

That sounds feasible, but is it possible to make definite predictions on this
basis? I think it depends on one's operational or experimental definition of
pitch and timbre.


On the subject of pitch/timbre confusions, Joyce Tang Boyland writes:

> As a native Chinese speaker, perceptual psychologist, and quasi-linguist,
> I've been trying to suggest for a long time that this perception of
> higher f0 is actually due to a (female) dialect/style whose vowels are
> "timbrally brighter".

I would be very interested to see experimental data (or even just "strong
anecdotal evidence") that would support such an assertion.


Pierre Divenyi writes:

> Do I understand
> it correctly, Richard, that the results are expected to change
> as a function of the frequency region which the subject listens in?

I'm not sure. My first reaction is yes, but not because of spectral dominance.
It is probably most appropriate to regard (or define) the spectral dominance
region to be independent of attention, and assume that attention only affects
the array of pitches and saliences at the final output of TSS82b's model.
Attention can affect either the relative salience of different pitch regions, or
of virtual and spectral pitches (e.g., hearing out harmonics).

> ...why not to pass the whole sequence through some bandpass filter
> and see what happens when we take away from the subject the freedom
> to pick the listening band that supposedly lies closest to his idiosyncratic
> or cultural proclivities?

Such an experiment would extend Deutsch's comparison between two spectral
envelopes. Testing many envelopes would provide extra data by which to check the
shape of the spectral dominance region in the model -- either on average or for
individual listeners -- by comparing experimental results with model output.


Greg Sandell writes (private letter, reproduced by permission!):

> My main difficulty is that your argument seems to imply that
> Shepard OCTs are not octave-ambiguous.

According to my experimental data (Parncutt, 1989, 1993), the number of pitches
you hear in an isolated presentation of an OCT varies considerably depending on
the experimental context (that is, the other sounds in the design). Apart from
that, it seems that most people usually hear either one or two pitches in an
isolated OCT. Of course the octave register of the single pitch is ambiguous.
Terhardt's algorithm estimates the *most likely* octave register, and predicted
results of the tritone-paradox experiment are based on that register.

> "Pitch units" remains rather vague to me

It's analogous to loudness in phon. Phon is like equivalent dB, pitch-units is
like equivalent hertz. Terhardt (1974) and TSS82 use a pure comparison tone with
a constant SPL of 60 dB.

> Then you are saying that Shepard OCTs are not ambiguous in their
> relative pitch distances from one another?  (Of course, that's counter to the
> very point of Shepard's demonstration.)

The up/down relationship between two successive tritone-related OCTs is
ambiguous, but in general one direction predominates. Deutsch's and Repp's
experiments measure the *probability* that one direction will be chosen, for a
given listener. I would like to be able to calculate this probability for a
given pair of OCTs and a given listener.
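
One possible way to set up such a calculation is sketched below. It is a
guess at a procedure, not an existing part of the model: it assumes that each
OCT evokes one pitch per presentation, drawn independently with probability
proportional to its calculated salience. The function name prob_up() and the
pitch and salience values are invented for the example.

```python
# Hypothetical sketch: probability of hearing "up" between two tones, given
# candidate pitches (in semitones) and their saliences for each tone.

def prob_up(first, second):
    """first, second: lists of (pitch_in_semitones, salience) pairs."""
    total_first = sum(sal for _, sal in first)
    total_second = sum(sal for _, sal in second)
    p = 0.0
    for pitch_a, sal_a in first:
        for pitch_b, sal_b in second:
            if pitch_b > pitch_a:  # second tone heard as higher
                p += (sal_a / total_first) * (sal_b / total_second)
    return p

# Made-up candidates: each OCT is ambiguous between two octave registers.
first_oct = [(60, 0.7), (72, 0.3)]
second_oct = [(66, 0.6), (54, 0.4)]  # a tritone away from the first

p_up = prob_up(first_oct, second_oct)
```

Comparing such calculated probabilities with the measured ones, listener by
listener, would be one way to test the assumed saliences.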

In Shepard's demonstration your attention is attracted to the spectral pitches
because they are continually moving. You hear individual spectral pitches
disappear at the top and new ones appear at the bottom. In the tritone paradox
experiment, OCTs are presented individually and quickly, so you are more likely
to hear a single virtual pitch.

> Are you saying that an octave-scale of Shepard OCTs will map
> more consistently to one particular octave-run of twelve sine-tone
> match tones than the octave above or below it?

Yes, but: When you present OCTs in an octave run, there is a strong
context-serial-streaming effect. The octave register in which you hear pitch in
that case is not necessarily the same as when the tones are presented
individually in isolation. (I have ignored serial effects in my present
explanation of the tritone paradox, possibly at my peril.)

> Don't you expect that the makeup of a Shepard OCT is such that the
> profile will evince more ambiguity among octave-related candidates
> than tones with natural harmonic profiles (i.e. sloping)?

Yes. In both cases it is nevertheless still possible to hear a single virtual
pitch, and it is still possible (usually) to find out experimentally which of a
set of possible virtual pitches is the main one.


In conclusion, doomed@titanic.atlantic.** writes:

> Mind you, I think the whole
> tritone/nationality thing sounds utterly crackpot, but there's just
> enough foundation to the idea to make it worth considering.

billyJoel@myFanClub.nyc.edu replies:

> You may be right! I may be crazy.
> But it just may be a lunatic you're looking for.

Richard Parncutt


Parncutt, R. (1989). Harmony: A Psychoacoustical Approach. Springer-Verlag,

Parncutt, R. (1993). Pitch properties of chords of octave-spaced tones.
Contemporary Music Review (In press since 1991!)

Terhardt, E. (1974). Pitch, consonance, and harmony. Journal of the Acoustical
Society of America, 55, 1061-1069.

Terhardt, E., Stoll, G., & Seewann, M. (1982a). Pitch of complex tonal signals
according to virtual pitch theory: Tests, examples and predictions. Journal of
the Acoustical Society of America, 71, 671-678. [TSS82a]

Terhardt, E., Stoll, G., & Seewann, M. (1982b). Algorithm for extraction of
pitch and pitch salience from complex tonal signals. Journal of the Acoustical
Society of America, 71, 679-688. [TSS82b]