PITCH DETECTION PROBLEM SOLVED ("soundmathtech.com")


Subject: PITCH DETECTION PROBLEM SOLVED
From:    "soundmathtech.com"  <terez(at)SOUNDMATHTECH.COM>
Date:    Fri, 18 Jul 2003 11:04:36 -0700

Hello everybody,

I believe that the "hardy perennial" pitch determination problem of speech (and not only speech) signal processing now has its final solution. The solution is fundamental and general in nature (and also quite simple), and it does not rely on an FFT or on correlation of the signal. It is based on state-space embedding of a signal, a concept originally introduced more than 20 years ago for analyzing nonlinear and chaotic time series.

I presented the new method at ICASSP 2002 in Orlando, Florida. The ICASSP paper and the Matlab demo program are available from http://www.soundmathtech.com/pitch and were also posted to the comp.dsp and comp.speech.research groups on Google. To date, we have had quite a large number of downloads of the paper and demo from all over the world. I believe that some people in this group have read the paper and downloaded our demo too.

Our US patent application has a somewhat more detailed description of the new pitch determination method. It can be accessed online at http://www.uspto.gov/patft (Pub. No.: 20030088401). You need not worry about the patent if you use our method for non-commercial purposes, such as research or education.

As far as software is concerned, you will see it in the near future, possibly as open-source C code downloadable from our web site. We do not have the resources of AT&T or Microsoft, so our development process is slow. We also do not want to release an insufficiently tested or sub-optimal implementation of an otherwise extremely robust and general method, so you can wait for our implementation or implement it yourself (in its very basic form, our method can be programmed in, e.g., Matlab in 20 minutes or so).

I would like to reiterate that, in our opinion, all conventional methods of pitch detection (e.g. correlation-, spectrum- or cepstrum-based) are no longer viable: they cannot even approach the level of robustness and generality demonstrated by our new method. It is only a matter of time (and of the availability of quality software implementations) until all those algorithms (ESPS get_f0, etc.) are replaced with a standard algorithm implementing our method or, more likely, with several different implementations intended for different purposes (e.g. low-bit-rate vocoders, tools for speech analysis, etc.).

Best regards,
Dmitry Terez
SoundMath Technologies, LLC
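
For readers unfamiliar with state-space embedding, the general idea can be illustrated with a short Python sketch. This is not the method from the ICASSP paper or the patent application, whose details are not given in this message. It shows only generic time-delay embedding plus a naive period estimate based on how closely the embedded trajectory returns to its earlier points. The function names, the parameters (dim, tau, min_lag, max_lag) and the 10% threshold are all assumptions chosen for illustration.

```python
import numpy as np

def delay_embed(x, dim=3, tau=8):
    """Time-delay embedding of a 1-D signal.

    Row i is the vector (x[i], x[i + tau], ..., x[i + (dim - 1) * tau]).
    """
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * tau
    if n <= 0:
        raise ValueError("signal too short for this dim and tau")
    return np.stack([x[k * tau : k * tau + n] for k in range(dim)], axis=1)

def estimate_period(x, dim=3, tau=8, min_lag=20, max_lag=400):
    """Naive period estimate in samples (illustrative assumption only).

    For each candidate lag, measure the mean distance between embedded
    points that are `lag` samples apart. A periodic signal traces a closed
    orbit, so this distance dips sharply at the period.
    """
    pts = delay_embed(x, dim, tau)
    max_lag = min(max_lag, len(pts) - 1)
    if max_lag < min_lag:
        raise ValueError("signal too short for the requested lag range")
    lags = np.arange(min_lag, max_lag + 1)
    mean_dist = np.array([
        np.linalg.norm(pts[lag:] - pts[:-lag], axis=1).mean() for lag in lags
    ])
    # Take the first lag whose distance falls close to the global minimum
    # (avoids picking a multiple of the period), then walk down to the
    # bottom of that dip.
    lo = mean_dist.min()
    threshold = lo + 0.1 * (np.median(mean_dist) - lo)
    i = int(np.flatnonzero(mean_dist <= threshold)[0])
    while i + 1 < len(mean_dist) and mean_dist[i + 1] < mean_dist[i]:
        i += 1
    return int(lags[i])

if __name__ == "__main__":
    fs = 8000                      # sampling rate in Hz (assumed)
    t = np.arange(2000) / fs
    # Synthetic voiced-like test signal: 100 Hz fundamental plus two harmonics.
    x = (np.sin(2 * np.pi * 100 * t)
         + 0.5 * np.sin(2 * np.pi * 200 * t)
         + 0.3 * np.sin(2 * np.pi * 300 * t))
    period = estimate_period(x)
    print("period (samples):", period, " f0 (Hz):", fs / period)
```

The sketch only shows that periodicity appears as recurrence of the trajectory in the embedded space. It has not been tested on real speech or noisy signals and makes no claim about robustness.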


This message came from the mail archive
http://www.auditory.org/postings/2003/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University