
Re: About importance of "phase" in sound recognition

On 5 Oct 2010 at 19:03, emad burke wrote:

>  I'm exactly talking about what is called
> "insensitivity to phase". I'm talking about the phase information that is
> discarded in the process of MFCC feature extraction, which has proven
> to be a successful feature set for speech recognition. This "insensitivity to
> phase" implies that if you change the order (precedence) of
> travelling waves in each cochlear channel among each other, it will not
> affect the perception and you can add random phases to different channels
> without affecting the perception(?).

One classical way to demonstrate this insensitivity is to 
build up a wave from several component frequencies, and 
listen to the sum.  Then change only the relative phases 
and see if you can detect a difference.  It turns out that 
you can't, most of the time.  (This assumes that you turn 
the sound off while you are making the changes... it is 
easy to hear dynamic changes.)
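
For anyone who wants to see the numbers behind this test, here is a minimal sketch in plain Python. It is only an illustration: the sample rate, the 0.1 second duration, the component amplitudes and the particular phase offsets are all assumed values, and the helper names `synthesize` and `magnitude_at` are made up for this example. It builds the same four components twice with different relative phases, then shows that the magnitude at each component frequency is unchanged even though the waveforms differ.

```python
import math

SAMPLE_RATE = 44100   # assumed sample rate, Hz
N_SAMPLES = 4410      # 0.1 s, a whole number of cycles for every component

def synthesize(freqs, amps, phases):
    """Sum of sinusoids; phases are in radians."""
    return [
        sum(a * math.sin(2 * math.pi * f * n / SAMPLE_RATE + p)
            for f, a, p in zip(freqs, amps, phases))
        for n in range(N_SAMPLES)
    ]

def magnitude_at(signal, freq):
    """Amplitude of one frequency component from a single-bin DFT."""
    re = sum(x * math.cos(2 * math.pi * freq * n / SAMPLE_RATE)
             for n, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
             for n, x in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)

freqs = [440.0, 1320.0, 2200.0, 3080.0]
amps = [0.5, 0.5 / 3, 0.5 / 5, 0.5 / 7]

in_phase = synthesize(freqs, amps, [0.0, 0.0, 0.0, 0.0])
shifted = synthesize(freqs, amps, [0.0, 1.0, 2.5, 4.0])  # arbitrary offsets

for f in freqs:
    print(f"{f:6.0f} Hz: {magnitude_at(in_phase, f):.4f} "
          f"vs {magnitude_at(shifted, f):.4f}")

# The waveforms themselves are clearly not the same.
max_diff = max(abs(x - y) for x, y in zip(in_phase, shifted))
print(f"largest sample difference: {max_diff:.3f}")
```

The magnitude spectrum is what a phase-blind analysis (such as the power-spectrum step that precedes MFCCs) sees, so the two signals look identical to it.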

You do need to use a bit of caution:  Different phase 
relations can cause large differences in waveform peak 
heights, and the larger peaks can produce distortion due to 
nonlinearities in the speaker, the ear, or even the air 
itself. So you might hear a difference that isn't really 
due to phase as such, just added components due to 
distortion.  But this is not a problem for "reasonable" 
listening levels.
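
The peak-height effect is easy to check numerically. The sketch below is an illustration under assumed values (four equal-amplitude components at 25% each, a 44100 Hz sample rate, and a hypothetical helper named `peak_of`); it compares the waveform peak when all components are in sine phase against the peak when all are in cosine phase, with identical component amplitudes in both cases.

```python
import math

SAMPLE_RATE = 44100   # assumed sample rate, Hz
N_SAMPLES = 4410      # 0.1 s of signal

def peak_of(freqs, amps, phases):
    """Largest absolute sample value of a sum of sinusoids."""
    return max(
        abs(sum(a * math.sin(2 * math.pi * f * n / SAMPLE_RATE + p)
                for f, a, p in zip(freqs, amps, phases)))
        for n in range(N_SAMPLES)
    )

freqs = [440.0, 1320.0, 2200.0, 3080.0]
amps = [0.25, 0.25, 0.25, 0.25]          # four equal components, 25% each

peak_sine = peak_of(freqs, amps, [0.0] * 4)            # all sine phase
peak_cosine = peak_of(freqs, amps, [math.pi / 2] * 4)  # all cosine phase

print(f"sine-phase peak:   {peak_sine:.3f}")
print(f"cosine-phase peak: {peak_cosine:.3f}")
print(f"difference: {20 * math.log10(peak_cosine / peak_sine):.1f} dB")
```

In cosine phase every component reaches its maximum at the same instant, so the peak equals the full sum of the amplitudes; in sine phase the maxima never line up and the peak is noticeably lower. That extra headroom demand is what can push a speaker or amplifier into distortion.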

You can use my Daqarta software to demonstrate the 
insensitivity for yourself with any Windows system.  
Click the Generator button to get a default 440 Hz sine, 
and adjust the controls for a comfortable level. 

In the Generator dialog, click on the left Waveform 
Controls button (midway down the dialog) and you will get 
the control dialog for the left Stream 0.  (There are four 
streams per channel, labeled 0-3, which are summed together 
by default.)

Set the Level for Stream 0 to (say) 50%, since the total 
for all streams must be no more than 100%.  (If you want to 
use four equal-amplitude components, then set each to 25%. 
Here I assume you will set up the first four components of 
a square wave.)

Now at the top of this dialog click on '1' to change to 
Stream 1, and set its Tone Freq to 3 * 440 = 1320. Set its 
Level to 1/3 * 50 = 16.67%.  Now toggle Stream On near the 
top of the dialog to add it to the output.

Repeat as needed for Streams 2 and 3.
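
The settings for all four streams follow from the square-wave series (odd harmonics, each at 1/n of the fundamental level). A small sketch of that arithmetic is below; the function name `square_wave_streams` is just for illustration, and the values for Streams 2 and 3 are extrapolated from the same rule rather than spelled out above.

```python
def square_wave_streams(fundamental_hz=440.0, fundamental_level=50.0,
                        n_streams=4):
    """Return (stream index, Tone Freq in Hz, Level in %) for each stream."""
    streams = []
    for i in range(n_streams):
        harmonic = 2 * i + 1          # 1, 3, 5, 7: odd harmonics only
        streams.append((i, harmonic * fundamental_hz,
                        fundamental_level / harmonic))
    return streams

streams = square_wave_streams()
for index, freq, level in streams:
    print(f"Stream {index}: Tone Freq = {freq:.0f} Hz, Level = {level:.2f}%")

total = sum(level for _, _, level in streams)
print(f"Total level = {total:.2f}%")   # must not exceed 100%
```

This gives 440 Hz at 50%, 1320 Hz at 16.67%, 2200 Hz at 10%, and 3080 Hz at 7.14%, for a total of about 83.81%, safely under the 100% limit.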

At this point all components are in phase.  To set any 
component to an arbitrary phase, click on its Tone Freq 
button to bring up the control dialog, and adjust Main 
Phase as desired.  

Please let me know if there are any questions or problems.

Best regards,

Bob Masta
            D A Q A R T A
Data AcQuisition And Real-Time Analysis
Scope, Spectrum, Spectrogram, Signal Generator
    Science with your sound card!