
Fwd: [AUDITORY] About importance of "phase" in sound recognition

Thank you.

I had thought that phase was part of a model related to the interpretation of sound. In my classes I often speak of displacement in time, so that sound is only about amplitude -- which is what Hugh Le Caine once told me. Looking at the loudspeaker cone, or the tympanic membrane, this certainly seemed to make sense to me. I also learned that Fourier analysis was one way, but not the only way, of analyzing sound.

If I understand correctly, if the answer to a question is to "listen", then the response will be statistical in nature, i.e., psychoacoustic. And my reading of the subject line is that this was a question about psychoacoustics.

The original question was:
> In summary I just want to know what is the consensus in the community about phase role in speech recognition, of course if there is any at all.

My first response was to try to clarify whether the writer was talking about "time delay", or about the idea of some parts of some (ideal) signal being out of phase. My reading didn't get to the second part of the question / answer, which for me is found in the idea that the response will be statistical in nature. I may be able to recognize speech (itself under-defined) in conditions where others can't.

I hear speech in orchestral playing, and in drums. A short study of north Indian drumming may allow the listener to detect speech patterns in tabla playing. I hear it.

I think the "perceptual" issue arises because that was the question.


On 2010, Oct 9, at 4:15 PM, James Johnston wrote:

> To the below. I'm describing how to make a signal for which phase is audible. The fact I'm using an FFT to generate the signal is, frankly, not relevant to this discussion. I could as well just describe it as the sum of sines with different signs on the amplitude.
> Why the "perceptual" issue even arises here, except in that you LISTEN to the results, is beyond me.
> Given that I've been doing signal processing for some 35 years, I dare say that I'm well aware of the very basic properties of the various Fourier transforms. This is also irrelevant, 
> just
> That's what's relevant. I repeat, make two signals: one the sum of sines at 500-4, 500, and 500+4 Hz with amplitudes .25, 1, .25, and another with amplitudes .25, 1, -.25, or alternatively -.25, 1, .25. Make sure you don't clip this when you render it (i.e. apply proper gain scaling) and LISTEN.
> I repeat LISTEN.
> When you're done, you have listened to two signals with the same amplitude spectrum, and different phase spectra.
> That's the only point. If you hear a difference, phase is audible. If you don't, then maybe it's not.
> There's no "perceptual bias" here, only a signal generator that you apply to your own perception, and a signal generator that goes directly to the heart of the question "is phase audible". Phase has a definition. I am addressing, directly, the actual definition of phase, which is the only relevance of Fourier mathematics here. And phase, I gather, is the debate. I PRESUME we all use the standard definition of phase?
> MY GOODNESS. Are we to argue about the existence of the Dirac delta next?
> jj
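
For anyone who wants to try the experiment described above, here is a minimal sketch of the two test signals, using only the Python standard library. The 16 kHz sample rate, the 1 second duration, the 0.9 full-scale target and all function names are my own assumptions; the message only specifies the three frequencies and the signed amplitudes. The sketch also checks numerically that both signals have the same amplitudes at 496, 500 and 504 Hz, while the first has a strongly varying envelope (roughly 0.5 to 1.5) and the second a nearly flat one (roughly 1.0 to 1.12).

```python
import cmath
import math
import struct
import wave

FS = 16000           # sample rate in Hz (assumption; the message gives none)
DURATION = 1.0       # seconds (assumption); 1 s puts 496/500/504 Hz on exact DFT bins
F0, DF = 500.0, 4.0  # centre frequency and sideband offset, from the message


def three_tone(amps, fs=FS, duration=DURATION):
    """Sum of sines at F0-DF, F0, F0+DF with the given signed amplitudes."""
    freqs = (F0 - DF, F0, F0 + DF)
    n = int(fs * duration)
    return [sum(a * math.sin(2 * math.pi * f * i / fs) for a, f in zip(amps, freqs))
            for i in range(n)]


def bin_magnitude(x, freq, fs=FS):
    """Amplitude of the component at `freq`, from a single-bin DFT."""
    acc = sum(v * cmath.exp(-2j * math.pi * freq * i / fs) for i, v in enumerate(x))
    return 2 * abs(acc) / len(x)


def envelope_range(x, fs=FS):
    """(min, max) of the peak level per 500 Hz cycle: a crude envelope measure."""
    period = int(fs / F0)
    peaks = [max(abs(v) for v in x[i:i + period])
             for i in range(0, len(x) - period + 1, period)]
    return min(peaks), max(peaks)


def write_wav(path, x, gain, fs=FS):
    """Write x * gain as 16-bit mono PCM so the result can be auditioned."""
    frames = b"".join(struct.pack("<h", int(round(v * gain * 32767))) for v in x)
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(fs)
        w.writeframes(frames)


sig_a = three_tone((0.25, 1.0, 0.25))   # sidebands with the same sign
sig_b = three_tone((0.25, 1.0, -0.25))  # upper sideband sign flipped

# One shared gain keeps both signals below full scale, so neither clips.
peak = max(max(abs(v) for v in sig_a), max(abs(v) for v in sig_b))
gain = 0.9 / peak

tones = (F0 - DF, F0, F0 + DF)
mags_a = [bin_magnitude(sig_a, f) for f in tones]
mags_b = [bin_magnitude(sig_b, f) for f in tones]
env_a = envelope_range(sig_a)
env_b = envelope_range(sig_b)

print("amplitudes A:", [round(m, 6) for m in mags_a])
print("amplitudes B:", [round(m, 6) for m in mags_b])
print("envelope range A:", env_a)
print("envelope range B:", env_b)
```

To listen, call write_wav("phase_a.wav", sig_a, gain) and write_wav("phase_b.wav", sig_b, gain) (the file names are arbitrary) and play the two files back to back. The numbers only show that the waveforms differ; whether the difference is audible is still something each listener has to check.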
> ________________________________________
> But the Fourier transform as used here is a 1-1 transform, without
> redundancy.  All reconstruction from magnitude methods rely on
> redundancy - Griffin & Lim use FFT blocks that overlap fully, and the
> algorithms by Casazza et al. for polynomial-time inversion rely on N^2
> magnitude coefficients.
> The Fourier transform is a projection of a signal onto infinite-length
> sinusoids (or, in the case of the STFT, a circulant projection onto
> short-time sinusoids), which is not very perceptually based.
> Joe.
> --
> Joachim Thiemann :: http://www.tsp.ece.mcgill.ca/~jthiem
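
The point about a 1-1 transform without redundancy can be illustrated with a small sketch, again in plain Python. The 8-sample test block and the function names are arbitrary choices of mine, and this is not the Griffin & Lim procedure, only a demonstration of why magnitude alone is not enough for a single non-overlapping block: with magnitude and phase the inverse DFT returns the original samples, while magnitude with the phase set to zero gives a different block that has exactly the same magnitude spectrum.

```python
import cmath
import math


def dft(x):
    """Plain O(N^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse DFT, keeping the real part (inputs here give real signals)."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


# Arbitrary short test block (assumption, chosen only for illustration).
x = [0.0, 1.0, 0.5, -0.25, 0.0, 0.0, 0.0, 0.0]

X = dft(x)
x_back = idft(X)                            # magnitude and phase: recovers x
x_zero_phase = idft([abs(c) for c in X])    # magnitude only, phase set to zero

err_full = max(abs(a - b) for a, b in zip(x, x_back))
err_mag_only = max(abs(a - b) for a, b in zip(x, x_zero_phase))
mag_mismatch = max(abs(abs(a) - abs(b)) for a, b in zip(X, dft(x_zero_phase)))

print("round-trip error with phase:", err_full)
print("error when phase is discarded:", err_mag_only)
print("difference between the two magnitude spectra:", mag_mismatch)
```

The first and third numbers are at rounding-error level and the second is not, so the magnitudes of one block do not pin down its samples. Methods that reconstruct from magnitude need extra information, such as the overlapping blocks mentioned above.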