Re: About importance of "phase" in sound recognition
On 5 Oct 2010 at 19:03, emad burke wrote:
> I'm exactly talking about what is called
> "in-sensitivity to phase". I'm talking about the phase information that is
> discarded in the process of MFCC feature extraction and it has been proven
> to be a successful feature set for speech recognition. The "insensitivity to
> phase" that implicitly implies that if you change the order (precedence) of
> travelling waves in each cochlear channel among each other, it will not
> affect the perception and you can add random phases to different channels
> without affecting the perception(?).
One classical way to demonstrate this insensitivity is to
build up a wave from several component frequencies, and
listen to the sum. Then change only the relative phases
and see if you can detect a difference. It turns out that
you can't, most of the time. (This assumes that you turn
the sound off while you are making the changes... it is
easy to hear dynamic changes.)
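
If you would rather check the idea numerically before listening, here is
a minimal Python sketch (not part of Daqarta; the harmonic set,
amplitudes, and sample count are my own illustrative choices). It sums
four components twice, once in phase and once with random phases, and
confirms that the component magnitudes are identical even though the
waveforms differ.

```python
import cmath
import math
import random

HARMONICS = [1, 3, 5, 7]        # illustrative: odd harmonics of one fundamental
SAMPLES_PER_PERIOD = 256        # one full period of the fundamental

def synthesize(phases):
    """Sum sin(2*pi*k*n/N + phase_k) / k over the harmonics for one period."""
    n_samples = SAMPLES_PER_PERIOD
    return [
        sum(math.sin(2 * math.pi * k * n / n_samples + p) / k
            for k, p in zip(HARMONICS, phases))
        for n in range(n_samples)
    ]

def harmonic_magnitudes(wave):
    """Amplitude of each harmonic, from a direct DFT evaluated at its bin."""
    n_samples = len(wave)
    mags = []
    for k in HARMONICS:
        bin_sum = sum(x * cmath.exp(-2j * math.pi * k * n / n_samples)
                      for n, x in enumerate(wave))
        mags.append(2 * abs(bin_sum) / n_samples)
    return mags

rng = random.Random(0)          # fixed seed so the run is repeatable
in_phase = synthesize([0.0] * len(HARMONICS))
scrambled = synthesize([rng.uniform(0, 2 * math.pi) for _ in HARMONICS])

mags_a = harmonic_magnitudes(in_phase)
mags_b = harmonic_magnitudes(scrambled)
max_waveform_diff = max(abs(a - b) for a, b in zip(in_phase, scrambled))

print("magnitudes, in phase: ", [round(m, 4) for m in mags_a])
print("magnitudes, scrambled:", [round(m, 4) for m in mags_b])
print("largest sample difference:", round(max_waveform_diff, 4))
```

A magnitude-only analysis (which is what the MFCC front end keeps) cannot
tell these two signals apart, although their waveforms are clearly
different.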
You do need to use a bit of caution: Different phase
relations can cause large differences in waveform peak
heights, and the larger peaks can produce distortion due to
nonlinearities in the speaker, the ear, or even the air
itself. So you might hear a difference that isn't really
due to phase as such, just added components due to
distortion. But this is not a problem for "reasonable"
You can use my Daqarta software to demonstrate the
insensitivity for yourself with any Windows system.
Click the Generator button to get a default 440 Hz sine,
and adjust the controls for a comfortable level.
In the Generator dialog, click on the left Waveform
Controls button (midway down the dialog) and you will get
the control dialog for the left Stream 0. (There are four
streams per channel, labeled 0-3, which are summed together
Set the Level for Stream 0 to (say) 50%, since the total
for all streams must be no more than 100%. (If you want to
use four equal-amplitude components, then set each to 25%.
Here I assume you will set up the first four components of
a square wave.)
Now at the top of this dialog click on '1' to change to
Stream 1, and set its Tone Freq to 3 * 440 = 1320. Set its
Level to 1/3 * 50 = 16.67%. Now toggle Stream On near the
top of the dialog to add it to the output.
Repeat as needed for Streams 2 and 3.
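
For reference, the settings for all four streams follow the same rule
(odd harmonic n at 1/n of the fundamental level). A small Python sketch
of the arithmetic, assuming the 440 Hz fundamental and 50% level used
above; the function and variable names are only for illustration:

```python
FUNDAMENTAL_HZ = 440.0          # default Generator tone from the steps above
FUNDAMENTAL_LEVEL_PCT = 50.0    # Level chosen for Stream 0

def square_wave_streams(count=4):
    """Return (stream, frequency in Hz, level in %) for the first odd harmonics."""
    streams = []
    for stream in range(count):
        harmonic = 2 * stream + 1            # 1, 3, 5, 7
        streams.append((stream,
                        harmonic * FUNDAMENTAL_HZ,
                        FUNDAMENTAL_LEVEL_PCT / harmonic))
    return streams

streams = square_wave_streams()
total_level = sum(level for _, _, level in streams)

for stream, freq, level in streams:
    print(f"Stream {stream}: {freq:.0f} Hz at {level:.2f}%")
print(f"Total: {total_level:.2f}%")
```

This gives 2200 Hz at 10.00% for Stream 2 and 3080 Hz at 7.14% for
Stream 3, for a total of 83.81%, which stays under the 100% limit.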
At this point all components are in phase. To set any
component to an arbitrary phase, click on its Tone Freq
button to bring up the control dialog, and adjust Main
Phase as desired.
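
To see how much the waveform peak can change with phase alone (the
caution mentioned earlier), here is a rough Python sketch. It assumes
that the initial "in phase" state means every component starts as a sine
at 0 degrees, and that phase is entered in degrees; both are assumptions
for the illustration, so check them against the actual dialog.

```python
import math

HARMONICS = [1, 3, 5, 7]
LEVELS_PCT = [50.0 / h for h in HARMONICS]   # 50%, 16.67%, 10%, 7.14%

def peak_level(phases_deg, samples=4096):
    """Largest absolute value (in % of full scale) over one fundamental period."""
    peak = 0.0
    for n in range(samples):
        theta = 2 * math.pi * n / samples
        value = sum(level * math.sin(h * theta + math.radians(p))
                    for h, level, p in zip(HARMONICS, LEVELS_PCT, phases_deg))
        peak = max(peak, abs(value))
    return peak

peak_sine = peak_level([0, 0, 0, 0])         # assumed default: all sine phase
peak_cosine = peak_level([90, 90, 90, 90])   # every component shifted 90 degrees

print(f"all at 0 degrees:  peak = {peak_sine:.1f}%")
print(f"all at 90 degrees: peak = {peak_cosine:.1f}%")
```

With the same four levels, the peak goes from about 46.5% of full scale
in sine phase to about 83.8% when every component is shifted by 90
degrees, because all four maxima then line up at the same instant.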
Please let me know if there are any questions or problems.
D A Q A R T A
Data AcQuisition And Real-Time Analysis
Scope, Spectrum, Spectrogram, Signal Generator
Science with your sound card!