
Re: About importance of "phase" in sound recognition

Hello (again) Emad,
Today I redid the synthesizer experiment described yesterday, not with sine tones, but with harmonic complex tones (voice C25, harmonica).

First case: press white key C1 (440.0 Hz), keep it pressed, then press black key C#1 (440.0 Hz, too). Now, from one try to the next, both the loudness and the timbre changed strongly. The period of a 440-Hz tone is T = 1/440 = 0.002272727... second. If, e.g., the delay between the two tones happens to amount to 238.5 T = 0.542045 second, or to 391.5 T = 0.889773 second, then the fundamental (also called first partial or first harmonic) is extinguished, but the first overtone (the second partial or harmonic) is enhanced. The reason is that a delay of an odd number of half-periods puts the two fundamentals in antiphase, whereas for the second partial the same delay is a whole number of its own periods.

Second case: press key C1 (440.0 Hz), keep it pressed, then press key D1 (441.6 Hz). Now there is no difference from one try to the next; one always hears the same 1.6 beats per second.
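The arithmetic of the first case can be checked with a short script. This is only a sketch, under the assumption that the two tones are identical harmonic complexes with equal partial amplitudes, so that adding a copy delayed by tau scales the n-th partial by |2 cos(pi n f0 tau)|. The function name partial_gain is hypothetical and chosen here for illustration.

```python
import math


def partial_gain(n, f0, delay):
    """Amplitude factor of the n-th partial when a harmonic tone with
    fundamental f0 (Hz) is added to an identical copy delayed by
    `delay` seconds. Follows from
    sin(x) + sin(x - p) = 2 cos(p/2) sin(x - p/2), with p = 2*pi*n*f0*delay.
    Assumption: both tones have the same partial amplitudes."""
    return abs(2.0 * math.cos(math.pi * n * f0 * delay))


f0 = 440.0        # fundamental frequency of both keys in the first case, Hz
T = 1.0 / f0      # period of the fundamental, seconds

# The two example delays quoted in the message, in units of T.
for periods in (238.5, 391.5):
    delay = periods * T
    g1 = partial_gain(1, f0, delay)   # fundamental
    g2 = partial_gain(2, f0, delay)   # second partial (first overtone)
    print(f"delay = {periods} T = {delay:.6f} s: "
          f"fundamental x{g1:.3f}, second partial x{g2:.3f}")

# Second case: two slightly different fundamentals produce beats at the
# difference frequency, independent of the moment the second key is pressed.
beat_rate = abs(441.6 - 440.0)
print(f"beat rate = {beat_rate:.1f} per second")
```

A factor near 0 means the partial is extinguished, and a factor near 2 means its amplitude is doubled relative to a single tone.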

----Original Message----
From: emad.burke@xxxxxxxxx
Date: 05.10.2010 19:03
To: <AUDITORY@xxxxxxxxxxxxxxx>
Subject: Re: About importance of "phase" in sound recognition

Hi Kevin,

Thanks for the reply. The phase definition that I'm talking about is closest to your third definition. I'm talking exactly about what is called "insensitivity to phase": the phase information that is discarded in the process of MFCC feature extraction, which has proven to be a successful feature set for speech recognition. This "insensitivity to phase" implies that if you change the order (precedence) of the travelling waves in each cochlear channel relative to each other, perception is not affected, and that you can add random phases to different channels without affecting perception(?).

That was on the one hand. On the other hand, a couple of years ago there was a publication by a mathematician (Pete Casazza) that more or less reinforced the argument for the phase insensitivity of speech recognition, but this time mathematically. Very briefly, it states that if you have a redundant set of magnitude coefficients, then phase doesn't matter at all, and, as they say in the paper, this mathematically confirms the belief held in the speech recognition community over the years about phase insensitivity, ...

There are also some papers on the opposing side. This is basically the source of my confusion.

Emad [...]


Reinhart Frosch,
Dr. phil. nat.,
CH-5200 Brugg.