effect of phase on pitch
Pondering the evolutionary origins of the ear's "phase deafness" in most
naturally occurring sounds, I have come up with the following argument. Does
it make sense? Is there other literature on this subject that I have missed?
Richard Parncutt, Lecturer in Psychology of Music and Psychoacoustics,
Unit for the Study of Musical Skill and Development, Keele University.
Post: Dept of Psychology, Keele University, Staffordshire ST5 5BG, GB.
Tel: 01782 583392 (w),01782 719747 (h). Email: firstname.lastname@example.org.
Fax: +44 1782 583387. URL: http://www.keele.ac.uk/depts/ps/rpbiog.htm.
In everyday listening environments, phase relationships are typically
jumbled unrecognizably when sound is reflected off environmental objects;
that is, when reflected sounds of varying amplitudes (depending on the
specific configuration and physical properties of the reflecting materials)
are added onto sound traveling in a direct line from the source. Thus, phase
does not generally carry information that can reliably aid a
listener in identifying sound sources in a reverberant environment
(Terhardt, 1988; see also Terhardt, 1991, 1992). This is a matter of
particular concern in an ecological approach, as non-reverberant
environments are almost non-existent in the real world (the rare exceptions
include anechoic rooms and mountain tops). On the other hand, again in real
acoustic environments,
spectral frequencies (that is, the frequencies of isolated components of
complex sounds, or clear peaks in a running spectrum, forming frequency
trajectories in time-varying sounds) cannot be directly affected by
reflection off, or transmission through, environmental obstacles. They might
be indirectly affected as a byproduct of the effect that such manipulations
can have on amplitudes (e.g., a weakly defined peak could be pushed sideways
if amplitudes increased on one side and decreased on the other), but such
phenomena could hardly affect audible sound spectra.
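The point about reflections can be illustrated with a minimal sketch. It assumes a single reflection with a hypothetical gain and delay (the values below are made up, not taken from any of the cited studies) and steady-state sinusoidal components. Adding a delayed, scaled copy to a component multiplies it by the complex factor 1 + g*exp(-i*2*pi*f*tau), so its amplitude and phase change in a frequency-dependent way while its frequency stays exactly where it was.

```python
import cmath
import math


def reflection_effect(freq_hz, gain, delay_s):
    """Return (amplitude factor, phase shift in radians) for one sinusoidal
    component when a single reflection is added to the direct sound.
    gain and delay_s are hypothetical parameters of the reflecting path."""
    h = 1 + gain * cmath.exp(-2j * math.pi * freq_hz * delay_s)
    return abs(h), cmath.phase(h)


def direct_plus_reflection(freq_hz, phase, gain, delay_s, t):
    """Time-domain sum of the direct component and its delayed, scaled copy."""
    direct = math.cos(2 * math.pi * freq_hz * t + phase)
    reflected = gain * math.cos(2 * math.pi * freq_hz * (t - delay_s) + phase)
    return direct + reflected


GAIN, DELAY = 0.6, 0.003  # assumed values: 60% reflection arriving 3 ms late

for f in (200.0, 400.0, 600.0):
    amp, shift = reflection_effect(f, GAIN, DELAY)
    # Each harmonic keeps its frequency; only amplitude and phase are altered,
    # and by a different amount for each harmonic.
    print(f"{f:5.0f} Hz: amplitude x{amp:.3f}, phase shift {shift:+.3f} rad")
```

Because the phase shift differs from one harmonic to the next, the phase relationships among the components are scrambled by even one reflection, whereas the list of component frequencies is untouched.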
So for the auditory system to reliably identify sound sources, it needs to
ignore phase information, which is merely a constant distraction, and focus
as far as possible on a signal's spectral frequencies (and to a lesser
extent on the relative amplitudes of individual components, keeping in mind
that these, too, are affected by reflection and transmission). The ear's
phase deafness with regard to pitch perception is thus a positive attribute.
In fact, it may be regarded as an important phylogenetic achievement - the
result of a long evolutionary process in which animals whose ears allowed
phase relationships to interfere with the identification of dangerous or
otherwise important sound sources died before they could reproduce. If this
scenario is correct, then it is no surprise that we are highly sensitive to
small changes in frequency, and highly insensitive to phase relationships
within complex sounds.
Straightforward evidence of the ear's insensitivity to phase in the sounds
of the real human environment has been provided by Heinbach (1988). He
reduced natural sounds including speech (with or without background noise
and multiple speakers) and music to their spectral contours, which he called
the part-tone-time-pattern. In the process, he completely discarded all
phase information. The length of the spectrum analysis window was carefully
tuned to that of the ear, which depends on frequency. Finally, he
resynthesized the original sounds, using random or arbitrary phase
relationships. The resynthesized sounds were perceptually indistinguishable
from the originals, even though their phase relationships had been shuffled.
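The idea of discarding phase and resynthesizing can be shown with a toy example. This is not Heinbach's part-tone-time-pattern procedure, which uses a frequency-dependent analysis window on time-varying natural sounds; it is only a sketch for a stationary harmonic tone, and the sample rate, fundamental, and amplitudes are assumptions chosen for convenience. The tone is synthesized once with all phases at zero and once with random phases. The component amplitudes come out identical, although the waveforms differ.

```python
import math
import random

SAMPLE_RATE = 8000   # Hz (assumed)
N = 800              # 0.1 s window, so every harmonic of 100 Hz completes whole periods
F0 = 100.0           # fundamental frequency in Hz (assumed)
AMPLITUDES = [1.0, 0.5, 0.33, 0.25, 0.2]  # hypothetical harmonic amplitudes


def synthesize(phases):
    """Sum of cosine harmonics with the given starting phases (radians)."""
    return [
        sum(a * math.cos(2 * math.pi * F0 * (k + 1) * n / SAMPLE_RATE + p)
            for k, (a, p) in enumerate(zip(AMPLITUDES, phases)))
        for n in range(N)
    ]


def component_amplitude(signal, freq_hz):
    """Amplitude of one component from a single-bin DFT. This is exact here
    because the window contains a whole number of periods of each harmonic."""
    re = sum(x * math.cos(2 * math.pi * freq_hz * n / SAMPLE_RATE)
             for n, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
             for n, x in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)


rng = random.Random(0)  # fixed seed so the run is repeatable
original = synthesize([0.0] * len(AMPLITUDES))
shuffled = synthesize([rng.uniform(0, 2 * math.pi) for _ in AMPLITUDES])

for k in range(len(AMPLITUDES)):
    f = F0 * (k + 1)
    print(f, component_amplitude(original, f), component_amplitude(shuffled, f))

# The waveforms themselves are different, e.g. in their peak values.
print("peak:", max(map(abs, original)), max(map(abs, shuffled)))
```

Whether two such signals sound alike is an empirical question that the code cannot answer; it only shows that the amplitude spectrum is unaffected by the choice of phases.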
It is nevertheless possible to create artificial stimuli for which clear,
significant effects of phase relationships on perception can be
demonstrated. For example, Patterson (1973, 1987) demonstrated that
listeners can discriminate two harmonic complex tones on the basis of phase
relationships alone. Moore (1977) demonstrated that the relative phase of
the components affects the pitch of harmonic complex tones consisting of
three components; for each tone, there were several possible pitches, and
relative phase affected the probability of a listener hearing one of those
as 'the' pitch. Hartmann (1988) demonstrated that the audibility of a
partial within a harmonic complex tone depends on its phase relationship
with the other partials. Meddis & Hewitt (1991b) succeeded in modeling these
various phase effects, which (as Moore, 1977, explained) generally apply
only to partials falling within a single critical band or auditory filter.
In an ecological approach, the existence of phase sensitivity in such
stimuli (or such comparisons between stimuli) might be explained as follows.
These stimuli (or stimulus comparisons) do not normally occur in the human
environment. So the auditory system has not had a chance to 'learn' (e.g.,
through natural selection) to ignore the phase effects. As hard as the ear
might 'try' to be phase deaf in the above cases, some phase sensitivity will
always remain, for unavoidable physiological reasons.
There could, however, be some survival value associated with the ability to
use phase relationships to identify sound sources during the first few tens
of ms of a sound, before the arrival of interference from reflected waves in
typical sound environments. On this basis, we might expect phase
relationships at least to affect timbre, even in familiar sounds. Supporting
evidence for this idea in the case of synthesized musical instrument sounds
has recently been provided by Dubnov & Rodet (1997). In the case of speech
sounds, Summerfield & Assmann (1990) found that pitch-period asynchrony
aided in the separation of concurrent vowels; however, the effect was
greater for less familiar sounds (specifically, it was observed at
fundamental frequencies of 50 Hz but not 100 Hz). In both cases, phase
relationships affected timbre but not pitch.
The model of Meddis & Hewitt (1991a) is capable of accounting for known
phase dependencies in pitch perception (Meddis & Hewitt, 1991b). This raises
the question: why might it be necessary or worthwhile to model something
that does not have demonstrable survival value for humans (whereas music
apparently does have survival value, as evidenced by the universality of
music in human culture)? As Bregman (1981) pointed out, we need to "think
about the problems that the whole person faces in using the information
available to his or her sense organs in trying to understand an environment"
(p. 99). From this point of view, the human ear might be better off without
any phase sensitivity at all. Bregman goes on to say that "Because
intelligent machines are required actually to work and to achieve useful
results, their designers have been forced to adopt an approach that always
sees a smaller perceptual function in terms of its contribution to the
overall achievement of forming a coherent and useful description of the
environment." So if one were building a hearing robot, there would be no
point in incorporating effects of phase on pitch perception, if such effects
did not help the robot to identify sound sources.
Bregman, A.S. (1981). Asking the 'What for?' question in auditory
perception. In M. Kubovy & J. R. Pomerantz (Eds.), Perceptual organization
(pp. 99-118). Hillsdale, N.J.
Dubnov, S., & Rodet, X. (1997). Statistical modeling of sound
aperiodicities. Proceedings of the International Computer Music Conference,
Thessaloniki, Greece, (pp. 43-50).
Hartmann, W. (1988). Pitch perception and the segregation and integration of
auditory entities. In G. M. Edelman, W. E. Gall, & W. M. Cowan (Eds.),
Auditory function (pp. 623-645). New York: Wiley.
Heinbach, W. (1988). Aurally adequate signal representation: The
Part-Tone-Time-Pattern. Acustica, 67, 113-121.
Meddis, R., & Hewitt, M.J. (1991a). Virtual pitch and phase sensitivity of a
computer model of the auditory periphery. I: Pitch identification. Journal
of the Acoustical Society of America, 89, 2866-2882.
Meddis, R., & Hewitt, M.J. (1991b). Virtual pitch and phase sensitivity of a
computer model of the auditory periphery II: Phase sensitivity. Journal of
the Acoustical Society of America, 89, 2883-2894.
Moore, B.C.J. (1977). Effects of relative phase of the components on the
pitch of three-component complex tones. In E. F. Evans & J. P. Wilson
(Eds.), Psychophysics and physiology of hearing (2nd ed.) (pp. 349-362). New
Patterson, R.D. (1973). The effects of relative phase and the number of
components on residue pitch. Journal of the Acoustical Society of America,
Patterson, R.D. (1987). A pulse ribbon model of monaural phase perception.
Journal of the Acoustical Society of America, 82, 1560-1586.
Summerfield, Q., & Assmann, P. F. (1990). Perception of concurrent vowels:
Effects of harmonic misalignment and pitch-period asynchrony. Journal of the
Acoustical Society of America, 89, 1364-1377.
Terhardt, E. (1988). Psychoakustische Grundlagen der Beurteilung
musikalischer Klänge. In J. Meyer (Ed.), Qualitätsaspekte bei
Musikinstrumenten (pp. 9-22). Celle: Moeck.
Terhardt, E. (1991). Music perception and sensory information acquisition:
Relationships and low-level analogies. Music Perception, 8, 217-240.
Terhardt, E. (1992). From speech to language: On auditory information
processing. In M. E. H. Schouten (Ed.), The auditory processing of speech
(pp. 363-380). Berlin: Mouton de Gruyter.