Re: About importance of "phase" in sound recognition

Hello Emad,

I just did an experiment on my Yamaha DX11 synthesizer. I programmed it to generate sine tones (i.e., "pure" tones). White key C1: 440.0 Hz; black key C#1: 440.0 Hz, too; white key D1: 441.6 Hz.

First case: first press key C1, keep it pressed, and then press key C#1 (giving the same frequency of 440.0 Hz). The loudness differs strongly from one try to the next, i.e., the loudness depends on the relative phase of the two sine tones. That result is plausible: if the phase difference happens to be zero or 2pi or 4pi, etc., then there is constructive interference. If the phase difference is pi or 3pi or 5pi, etc., then the interference is destructive.
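The first case can be checked numerically. The following is only a minimal sketch, assuming two ideal sine tones of equal amplitude (the DX11 output was not measured this way); the function name combined_amplitude is made up for this illustration. It uses the identity sin(x) + sin(x + phi) = 2 cos(phi/2) sin(x + phi/2), so the peak amplitude of the sum is 2|cos(phi/2)|.

```python
import math

def combined_amplitude(phase_difference, amplitude=1.0):
    """Peak amplitude of A*sin(wt) + A*sin(wt + phase_difference).

    Assumes two ideal sine tones of equal frequency and equal amplitude.
    """
    return 2.0 * amplitude * abs(math.cos(phase_difference / 2.0))

if __name__ == "__main__":
    # Step the relative phase from 0 to 2*pi in quarter-cycle steps.
    for k in range(5):
        phi = k * math.pi / 2.0
        print(f"phase difference = {k}*pi/2 -> peak amplitude = "
              f"{combined_amplitude(phi):.3f}")
```

The amplitude runs from twice the single-tone amplitude (phase difference 0 or 2pi) down to essentially zero (phase difference pi), which fits the strongly varying loudness from one try to the next.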

Second case: first press key C1, keep it pressed, and then press key D1 (441.6 Hz). What one hears does not differ from one try to the next. One hears 1.6 beats per second.
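The second case can be sketched in the same way, again assuming ideal, equal-amplitude sine tones; the names beat_frequency, envelope, and envelope_range are made up for this illustration. The sum sin(2 pi f1 t) + sin(2 pi f2 t + phi) equals 2 cos(pi (f2 - f1) t + phi/2) sin(pi (f1 + f2) t + phi/2), so the starting phase phi only shifts the beat pattern in time; it changes neither the beat rate nor the range of the envelope.

```python
import math

F1 = 440.0   # key C1 in the experiment, in Hz
F2 = 441.6   # key D1 in the experiment, in Hz

def beat_frequency(f1, f2):
    """Number of loudness maxima per second for two nearby sine tones."""
    return abs(f2 - f1)

def envelope(t, f1, f2, phase=0.0):
    """Slowly varying amplitude of sin(2*pi*f1*t) + sin(2*pi*f2*t + phase)."""
    return 2.0 * abs(math.cos(math.pi * (f2 - f1) * t + phase / 2.0))

def envelope_range(f1, f2, phase, steps=1000):
    """Sampled minimum and maximum of the envelope over one beat period."""
    period = 1.0 / beat_frequency(f1, f2)
    values = [envelope(i * period / steps, f1, f2, phase) for i in range(steps)]
    return min(values), max(values)

if __name__ == "__main__":
    print(f"beats per second: {beat_frequency(F1, F2):.1f}")
    # Different starting phases give the same envelope range.
    for phase in (0.0, 1.0, math.pi):
        low, high = envelope_range(F1, F2, phase)
        print(f"start phase {phase:.2f}: envelope from {low:.3f} to {high:.3f}")
```

Within one beat period the envelope always passes through both a maximum and a near-zero minimum, whatever the starting phase, so each try sounds the same.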


----Original Message----
From: emad.burke@xxxxxxxxx
Date: 05.10.2010 19:03
To: <AUDITORY@xxxxxxxxxxxxxxx>
Subject: Re: About importance of "phase" in sound recognition

Hi Kevin,

Thanks for the reply. The definition of phase that I'm talking about is closest to your third definition. I'm talking specifically about what is called "insensitivity to phase": the phase information that is discarded in the process of MFCC feature extraction, which has nevertheless proven to be a successful feature set for speech recognition. This "insensitivity to phase" implies that if you change the order (precedence) of the travelling waves in the cochlear channels relative to each other, perception is not affected, and that you can add random phases to different channels without affecting perception(?).

That was on the one hand. On the other hand, a couple of years ago there was a publication by a mathematician (Pete Casazza) that reinforced the argument for the phase insensitivity of speech recognition, but this time mathematically. Very briefly, it states that if you have a redundant set of magnitude coefficients, then phase doesn't matter at all, and, as they say in the paper, this mathematically confirms the belief held over the years in the speech recognition community about phase insensitivity, ...

There are also some papers on the opposing side. This is basically the source of my confusion.


Reinhart Frosch,
Dr. phil. nat.,
CH-5200 Brugg.
reinifrosch@xxxxxxxxxx .