
Re: Importance of "phase" in sound recognition

Actually, the firing pattern of the auditory nerve fibers is phase-locked to the stimulus, at least for frequencies that do not exceed ~3 kHz. The reason is that the ion channels in the hair cells are widened when the associated point on the basilar membrane is at the crest of the wave (think of a buckled rod: the 'outer' part is decompressed and the inner part is compressed). When the ion channels are widened, ions flow more easily and the potential develops faster. At higher frequencies the capacitance of the cell, which acts as a low-pass filter, reduces the phase-locking.
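To make the low-pass argument concrete, here is a toy sketch that treats the cell as a first-order RC low-pass filter and prints how much of the cycle-by-cycle (AC) component would survive at a few stimulus frequencies. The corner frequency `fc` and the test frequencies are hypothetical values chosen only for illustration, not measured hair-cell parameters, and a first-order filter is an assumed simplification.

```python
import numpy as np

def lowpass_gain(f_hz, fc_hz):
    """Magnitude response of a first-order (RC) low-pass filter."""
    f_hz = np.asarray(f_hz, dtype=float)
    return 1.0 / np.sqrt(1.0 + (f_hz / fc_hz) ** 2)

fc = 1000.0  # hypothetical corner frequency in Hz (assumption, not a measured value)
freqs = np.array([250.0, 1000.0, 3000.0, 8000.0])  # illustrative stimulus frequencies
gains = lowpass_gain(freqs, fc)

# The AC part of the receptor potential carries the phase information;
# the filter attenuates it progressively as the stimulus frequency rises.
for f, g in zip(freqs, gains):
    print(f"{f:6.0f} Hz: relative AC component {g:.2f}")
```

In this simplified picture the AC component shrinks steadily above the corner frequency, which is the sense in which phase-locking degrades at high frequencies.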

So the ear performs analysis in both the spectral domain and the time domain. (Of course, the perception of phase is another question.)

On Sun, Oct 10, 2010 at 11:35 PM, John Bates <jkbates@xxxxxxxxxxx> wrote:



Here's something else to consider for your research.


Traditionally, it has been dogma that the cochlea responds only to a sound’s amplitude spectrum; therefore we should not hear changes caused by varying phase. Yet it has been shown repeatedly that we do hear changes in sounds as their phase spectrum is varied. How can this be?


Let's look at the problem: In terms of spectral analysis, we find that as we vary the phase the amplitude spectrum is invariant. Therefore, we conclude that the perceived changes in the sound are associated with changes in the phase spectrum. Somehow, the ear must be responding to a supposedly irrelevant phase spectrum. But where is the evidence?


Here’s an idea: If we look at the signal's waveform, we notice that its pattern also varies in accord with the phase variations. Thus, it would appear that in lieu of a phase analyzer, the ear "reads" waveforms. As absurd as this might seem, how else could the sound changes be heard? We are thus convinced that the cochlea must be processing a phase/waveform source. Now we ask, “What is the most available and usable expression of waveform?”
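The two observations above (an invariant amplitude spectrum, but a changing waveform) can be checked numerically. The sketch below builds two harmonic complexes that differ only in the phases of their components and compares their FFT magnitudes and crest factors. The sample rate, fundamental, number of harmonics, and random seed are all assumptions chosen for illustration.

```python
import numpy as np

fs = 16000        # sample rate in Hz (assumed)
f0 = 100          # fundamental frequency in Hz (assumed)
n_harm = 20       # number of equal-amplitude harmonics (assumed)
t = np.arange(fs) / fs   # exactly 1 s, so every harmonic completes whole cycles

def harmonic_complex(phases):
    """Sum of n_harm unit-amplitude cosines with the given starting phases."""
    k = np.arange(1, n_harm + 1)[:, None]            # harmonic numbers as a column
    return np.sum(np.cos(2 * np.pi * k * f0 * t[None, :] + phases[:, None]), axis=0)

def crest_factor(x):
    """Peak amplitude divided by RMS amplitude."""
    return np.max(np.abs(x)) / np.sqrt(np.mean(x ** 2))

rng = np.random.default_rng(0)
x_cos = harmonic_complex(np.zeros(n_harm))                      # all phases zero
x_rand = harmonic_complex(rng.uniform(0, 2 * np.pi, n_harm))    # random phases

mag_cos = np.abs(np.fft.rfft(x_cos))
mag_rand = np.abs(np.fft.rfft(x_rand))

print("amplitude spectra equal:", np.allclose(mag_cos, mag_rand, atol=1e-6))
print("waveforms equal:        ", np.allclose(x_cos, x_rand))
print("crest factor, zero phase:  ", crest_factor(x_cos))
print("crest factor, random phase:", crest_factor(x_rand))
```

The zero-phase complex is a sharply peaked pulse train, while the random-phase version has the same magnitude spectrum but a much flatter envelope.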


Spatial patterns can be described in terms of their inflection points, in our case, having time-space locations identified by sequences of real and complex zeros, readily obtained physically by finding the waveform derivatives (H. Voelker and A. Requicha). By using delay lines to preserve past events for present use (the cochlea?), meaningful temporal patterns in the stream of zeros (pitch?) can be recognized. Information such as amplitude and direction of arrival can be associated with patterns of events that are referenced to the zeros. In simple terms, the ear processes sound in the time domain, not the frequency domain. The trick is to find out how the ear does these things. And keep in mind that they are done in real time and are synchronized with the signal waveform.
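As a rough illustration of reading a waveform through its zeros, the sketch below locates the real zero crossings of a sampled signal and of its first derivative (the latter mark the waveform's peaks and troughs). It covers only real zeros found by sign changes and linear interpolation; the complex zeros mentioned above are not computed, and this is not a reconstruction of the method in the cited work. The test tone, sample rate, and the helper name `zero_crossing_times` are assumptions.

```python
import numpy as np

def zero_crossing_times(x, fs):
    """Times (s) where x changes sign, refined by linear interpolation."""
    neg = np.signbit(x)
    idx = np.nonzero(neg[:-1] != neg[1:])[0]      # sample just before each sign change
    frac = x[idx] / (x[idx] - x[idx + 1])         # fractional position of the crossing
    return (idx + frac) / fs

fs = 48000                      # sample rate in Hz (assumed)
f = 440.0                       # test tone frequency in Hz (assumed)
t = np.arange(int(0.05 * fs)) / fs
x = np.sin(2 * np.pi * f * t + 0.3)   # phase offset keeps zeros off the sample grid

dx = np.gradient(x) * fs        # finite-difference estimate of the derivative

zeros = zero_crossing_times(x, fs)      # real zeros of the waveform
extrema = zero_crossing_times(dx, fs)   # zeros of the derivative: peaks and troughs

print("mean interval between zeros (s):", np.mean(np.diff(zeros)))
print("half period 1/(2f) (s):         ", 1 / (2 * f))
```

For a pure tone the intervals between successive zeros equal half the period, so the pitch can be read directly from the zero sequence without any spectral analysis.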


So, there you are: The most likely answer for you, that I can see, is that the cochlea and its various parts must derive meaningful information from signal waveforms by recognizing patterns in the temporal sequences of their zeros.


John Bates

From: emad burke
Sent: Tuesday, October 05, 2010 11:23 AM
Subject: About importance of "phase" in sound recognition

Dear List,

I've been confused about the role of "phase" information of the sound (e.g. speech) signal in speech recognition and, more generally, in human perception of audio signals. I've been reading conflicting arguments and publications regarding how important phase information is. If there is a boundary between short-term and long-term phase information that clarifies this, can anybody please point me to a convincing reference on the subject? In summary, I just want to know what the consensus in the community is about the role of phase in speech recognition, if there is any consensus at all.