Subject: effect of phase on pitch
From: "R. Parncutt" <psa03(at)CC.KEELE.AC.UK>
Date: Thu, 5 Feb 1998 10:49:54 +0000

Pondering the evolutionary origins of the ear's "phase deafness" in most naturally occurring sounds, I have come up with the following argument. Does it make sense? Is there other literature on this subject that I have missed?

_____________________________________________________________________
Richard Parncutt, Lecturer in Psychology of Music and Psychoacoustics,
Unit for the Study of Musical Skill and Development, Keele University.
Post: Dept of Psychology, Keele University, Staffordshire ST5 5BG, GB.
Tel: 01782 583392 (w), 01782 719747 (h). Email: r.parncutt(at)keele.ac.uk.
Fax: +44 1782 583387. URL: http://www.keele.ac.uk/depts/ps/rpbiog.htm.

In everyday listening environments, phase relationships are typically jumbled beyond recognition when sound is reflected off environmental objects; that is, when reflected sounds of varying amplitudes (depending on the specific configuration and physical properties of the reflecting materials) are added to sound traveling in a direct line from the source. Thus, phase does not generally carry information that can reliably aid a listener in identifying sound sources in a reverberant environment (Terhardt, 1988; see also Terhardt, 1991, 1992). This is a matter of particular concern in an ecological approach, as non-reverberant environments are almost non-existent in the real world (the rare exceptions include anechoic rooms and mountain tops).

On the other hand, again in real acoustic environments, spectral frequencies (that is, the frequencies of isolated components of complex sounds, or clear peaks in a running spectrum, forming frequency trajectories in time-varying sounds) cannot be directly affected by reflection off, or transmission through, environmental obstacles.
They might be indirectly affected as a byproduct of the effect that such manipulations can have on amplitudes (e.g., a weakly defined peak could be pushed sideways if amplitudes increased on one side and decreased on the other), but such phenomena could hardly affect audible sound spectra. So for the auditory system to reliably identify sound sources, it needs to ignore phase information, which is merely a constant distraction, and focus as far as possible on a signal's spectral frequencies (and to a lesser extent on the relative amplitudes of individual components, keeping in mind that these, too, are affected by reflection and transmission).

The ear's phase deafness with regard to pitch perception is thus a positive attribute. In fact, it may be regarded as an important phylogenetic achievement - the result of a long evolutionary process in which animals whose ears allowed phase relationships to interfere with the identification of dangerous or otherwise important sound sources died before they could reproduce. If this scenario is correct, then it is no surprise that we are highly sensitive to small changes in frequency, and highly insensitive to phase relationships within complex sounds.

Straightforward evidence of the ear's insensitivity to phase in the sounds of the real human environment has been provided by Heinbach (1988). He reduced natural sounds including speech (with or without background noise and multiple speakers) and music to their spectral contours, which he called the part-tone-time-pattern. In the process, he completely discarded all phase information. The length of the spectrum analysis window was carefully tuned to that of the ear, which depends on frequency. Finally, he resynthesized the original sounds, using random or arbitrary phase relationships. The resynthesized sounds were perceptually indistinguishable from the originals, even though their phase relationships had been shuffled.
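
The two physical points made so far can be checked numerically. What follows is only a minimal sketch under assumed values, not Heinbach's part-tone-time-pattern procedure: it uses a stationary ten-harmonic tone, and the sample rate, fundamental, delay, gain and all function names are chosen purely for illustration. It shuffles the component phases and then adds a single attenuated "reflection" to the direct sound.

```python
import cmath
import math
import random

FS = 8000              # sample rate in Hz (assumed value)
F0 = 200.0             # fundamental frequency in Hz (assumed value)
N = 800                # 0.1 s, a whole number of periods of every harmonic
HARMONICS = range(1, 11)


def complex_tone(phases):
    """Equal-amplitude harmonic complex tone with the given starting phases (radians)."""
    return [
        sum(math.cos(2 * math.pi * h * F0 * n / FS + phases[h]) for h in HARMONICS)
        for n in range(N)
    ]


def component(signal, freq):
    """Complex amplitude of the sinusoidal component of `signal` at `freq` Hz."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq * n / FS) for n, x in enumerate(signal))
    return 2 * acc / len(signal)


def magnitudes(signal):
    """Amplitude of each harmonic of F0 in `signal`."""
    return [abs(component(signal, h * F0)) for h in HARMONICS]


def crest_factor(signal):
    """Peak-to-RMS ratio, a crude descriptor of waveform shape."""
    rms = math.sqrt(sum(x * x for x in signal) / len(signal))
    return max(abs(x) for x in signal) / rms


def add_reflection(signal, delay_samples, gain):
    """Direct sound plus one attenuated, delayed copy (steady state of a periodic signal)."""
    total = len(signal)
    return [signal[n] + gain * signal[(n - delay_samples) % total] for n in range(total)]


rng = random.Random(0)
cosine_phase = {h: 0.0 for h in HARMONICS}
random_phase = {h: rng.uniform(0.0, 2 * math.pi) for h in HARMONICS}

tone_cos = complex_tone(cosine_phase)    # all components start in phase
tone_rand = complex_tone(random_phase)   # same components, shuffled phases
tone_refl = add_reflection(tone_cos, delay_samples=13, gain=0.6)

print("magnitudes, cosine phase:", [round(m, 3) for m in magnitudes(tone_cos)])
print("magnitudes, random phase:", [round(m, 3) for m in magnitudes(tone_rand)])
print("crest factors:", round(crest_factor(tone_cos), 2), round(crest_factor(tone_rand), 2))
print("magnitudes after reflection:", [round(m, 3) for m in magnitudes(tone_refl)])
print("amplitude at 300 Hz after reflection:", round(abs(component(tone_refl, 300.0)), 6))
```

In this sketch the shuffled-phase tone has the same component amplitudes as the cosine-phase tone but a different waveform shape (a lower crest factor). The reflection changes the amplitude and phase of each component but introduces nothing at frequencies that were not already present. Whether two such tones sound alike is of course a perceptual question that the code cannot answer.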

It is nevertheless possible to create artificial stimuli for which clear, significant effects of phase relationships on perception can be demonstrated. For example, Patterson (1973, 1987) demonstrated that listeners can discriminate two harmonic complex tones on the basis of phase relationships alone. Moore (1977) demonstrated that the relative phase of the components affects the pitch of harmonic complex tones consisting of three components; for each tone, there were several possible pitches, and relative phase affected the probability of a listener hearing one of those as 'the' pitch. Hartmann (1988) demonstrated that the audibility of a partial within a harmonic complex tone depends on its phase relationship with the other partials. Meddis & Hewitt (1991b) succeeded in modeling these various phase effects, which (as Moore, 1977, explained) generally apply only to partials falling within a single critical band or auditory filter.

In an ecological approach, the existence of phase sensitivity in such stimuli (or such comparisons between stimuli) might be explained as follows. These stimuli (or stimulus comparisons) do not normally occur in the human environment. So the auditory system has not had a chance to 'learn' (e.g., through natural selection) to ignore the phase effects. As hard as the ear might 'try' to be phase deaf in the above cases, some phase sensitivity will always remain, for unavoidable physiological reasons.

There could, however, be some survival value associated with the ability to use phase relationships to identify sound sources during the first few tens of ms of a sound, before the arrival of interference from reflected waves in typical sound environments. On this basis, we might expect phase relationships at least to affect timbre, even in familiar sounds. Supporting evidence for this idea in the case of synthesized musical instrument sounds has recently been provided by Dubnov & Rodet (1997).
In the case of speech sounds, Summerfield & Assmann (1990) found that pitch-period asynchrony aided in the separation of concurrent vowels; however, the effect was greater for less familiar sounds (specifically, it was observed at fundamental frequencies of 50 Hz but not 100 Hz). In both cases, phase relationships affected timbre but not pitch.

The model of Meddis & Hewitt (1991a) is capable of accounting for known phase dependencies in pitch perception (Meddis & Hewitt, 1991b). This raises the question: why might it be necessary or worthwhile to model something that does not have demonstrable survival value for humans (whereas music apparently does have survival value, as evidenced by the universality of music in human culture)? As Bregman (1981) pointed out, we need to "think about the problems that the whole person faces in using the information available to his or her sense organs in trying to understand an environment" (p. 99). From this point of view, the human ear might be better off without any phase sensitivity at all. Bregman goes on to say that "Because intelligent machines are required actually to work and to achieve useful results, their designers have been forced to adopt an approach that always sees a smaller perceptual function in terms of its contribution to the overall achievement of forming a coherent and useful description of the environment." So if one were building a hearing robot, there would be no point in incorporating effects of phase on pitch perception, if such effects did not help the robot to identify sound sources.

Bregman, A.S. (1981). Asking the 'What for?' question in auditory perception. In M. Kubovy & J. R. Pomerantz (Eds.), Perceptual organization (pp. 99-118). Hillsdale, N.J.

Dubnov, S., & Rodet, X. (1997). Statistical modeling of sound aperiodicities. Proceedings of the International Computer Music Conference, Thessaloniki, Greece (pp. 43-50).

Hartmann, W. (1988). Pitch perception and the segregation and integration of auditory entities. In G. M. Edelman, W. E. Gall, & W. M. Cowan (Eds.), Auditory function (pp. 623-645). New York: Wiley.

Heinbach, W. (1988). Aurally adequate signal representation: The Part-Tone-Time-Pattern. Acustica, 67, 113-121.

Meddis, R., & Hewitt, M.J. (1991a). Virtual pitch and phase sensitivity of a computer model of the auditory periphery. I: Pitch identification. Journal of the Acoustical Society of America, 89, 2866-2882.

Meddis, R., & Hewitt, M.J. (1991b). Virtual pitch and phase sensitivity of a computer model of the auditory periphery. II: Phase sensitivity. Journal of the Acoustical Society of America, 89, 2883-2894.

Moore, B.C.J. (1977). Effects of relative phase of the components on the pitch of three-component complex tones. In E. F. Evans & J. P. Wilson (Eds.), Psychophysics and physiology of hearing (2nd ed.) (pp. 349-362). New York: Academic.

Patterson, R.D. (1973). The effects of relative phase and the number of components on residue pitch. Journal of the Acoustical Society of America, 53, 1565-1572.

Patterson, R.D. (1987). A pulse ribbon model of monaural phase perception. Journal of the Acoustical Society of America, 82, 1560-1586.

Summerfield, Q., & Assmann, P. F. (1990). Perception of concurrent vowels: Effects of harmonic misalignment and pitch-period asynchrony. Journal of the Acoustical Society of America, 89, 1364-1377.

Terhardt, E. (1988). Psychoakustische Grundlagen der Beurteilung musikalischer Klänge. In J. Meyer (Ed.), Qualitätsaspekte bei Musikinstrumenten (pp. 9-22). Celle: Moeck.

Terhardt, E. (1991). Music perception and sensory information acquisition: Relationships and low-level analogies. Music Perception, 8, 217-240.

Terhardt, E. (1992). From speech to language: On auditory information processing. In M. E. H. Schouten (Ed.), The auditory processing of speech (pp. 363-380). Berlin: Mouton de Gruyter.