Re: Detection of harmonics and rhythmic structure

To: AUDITORY@xxxxxxxxxxxxxxx
Subject: Re: Detection of harmonics and rhythmic structure
From: Paul Boersma <paul.boersma@xxxxxxxxxx>
Date: Fri, 12 May 2000 02:06:30 +0200
Comments: cc: Brian Gygi <bgygi@INDIANA.EDU>
In-reply-to: <Pine.GSO.3.96.1000511141153.28455w-100000@kate.ucs.indiana.edu>
References: <Pine.GSO.3.96.1000511141153.28455w-100000@kate.ucs.indiana.edu>
Reply-to: Paul Boersma <paul.boersma@xxxxxxxxxx>
Sender: AUDITORY Research in Auditory Perception <AUDITORY@xxxxxxxxxxxxxxx>

Brian Gygi wrote:
> Does anyone know some good algorithms for determining a) the presence or
> absence of harmonics in a signal (non-speech), and b) whether the signal
> is discrete or rhythmic (repetitive)?  I can imagine that these two
> questions are related, one is in the frequency domain and one in the time
> domain.  I have fooled around with autocorrelations, but want to be able
> to extract a number that would capture the amount of either harmonic or
> rhythmic structure in a signal.

The Praat program can perform a harmonicity (= harmonics-to-noise ratio) analysis,
which measures the degree of periodicity of a sampled signal. You specify
the minimum and maximum frequency, and the algorithm will
look for repetitive wave shapes in between. It works for
non-speech as well as for speech, because the thing measured
is mathematical shape similarity (autocorrelation-based
or crosscorrelation-based). The algorithm I use is by far the
most accurate (or sensitive, if we talk about detecting noise in periodicity)
of all the known algorithms, and is described in a 1993 paper,
downloadable from my home page. The accuracy derives from
regarding a sampled signal analytically as a sum of sinc functions,
from using a Gaussian analysis window, from dividing the
autocorrelation of the windowed signal by the autocorrelation
of the window itself, and from taking negative lags into account
so that the algorithm is accurate for repetition rates
up to 80% of the Nyquist frequency.
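
As a rough illustration of the window-correction idea only, here is a
Python sketch that divides the normalised autocorrelation of a
Gaussian-windowed frame by the normalised autocorrelation of the window.
It is not the Praat implementation: it searches integer lags only and
omits the sinc interpolation. The function names, the window-width
constant and the lag range are assumptions made for this example.

```python
import math

def normalised_autocorr(x, max_lag):
    """Autocorrelation for lags 0..max_lag, scaled so that lag 0 equals 1."""
    r0 = sum(v * v for v in x)
    return [sum(x[i] * x[i + lag] for i in range(len(x) - lag)) / r0
            for lag in range(max_lag + 1)]

def periodicity_strength(frame, min_lag, max_lag):
    """Return (best_lag, strength) for one frame of samples.

    strength is the window-corrected autocorrelation at the best lag;
    it is close to 1 for a perfectly periodic frame.
    """
    n = len(frame)
    mean = sum(frame) / n
    mid = (n - 1) / 2.0
    # Gaussian analysis window; the constant 12 is an illustrative choice.
    window = [math.exp(-12.0 * ((i - mid) / n) ** 2) for i in range(n)]
    windowed = [(s - mean) * w for s, w in zip(frame, window)]
    r_signal = normalised_autocorr(windowed, max_lag)
    r_window = normalised_autocorr(window, max_lag)
    # Dividing by the window's autocorrelation undoes the taper that the
    # window imposes on the signal's autocorrelation at larger lags.
    corrected = [rs / rw for rs, rw in zip(r_signal, r_window)]
    best_lag = max(range(min_lag, max_lag + 1), key=lambda k: corrected[k])
    return best_lag, corrected[best_lag]
```

The lag range plays the role of the minimum and maximum frequency:
min_lag corresponds to the sample rate divided by the maximum frequency,
and max_lag to the sample rate divided by the minimum frequency.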

The algorithm gives a number in dB, which is equivalent to
a relative periodicity power of 1/(1+10^(-dB/10)):
   90 dB   0.999999999 (i.e. almost perfectly periodic)
   60 dB   0.999999
   30 dB   0.999
   10 dB   0.91
    0 dB   0.5 (i.e. as much harmonic power as noise power)
  -10 dB   0.09
For voicing measurements in speech, this method is used for
the voiced/unvoiced decision, with the criterion usually near 0 dB.
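
The conversion in the table can be checked with a few lines of Python.
This is only a sketch of the formula above and its algebraic inverse;
the function names are made up for the example.

```python
import math

def periodicity_from_hnr_db(hnr_db):
    """Relative periodicity power 1/(1+10^(-dB/10)) for an HNR in dB."""
    return 1.0 / (1.0 + 10.0 ** (-hnr_db / 10.0))

def hnr_db_from_periodicity(r):
    """Inverse of the above: HNR in dB for a periodicity r with 0 < r < 1."""
    return 10.0 * math.log10(r / (1.0 - r))

if __name__ == "__main__":
    for db in (30, 10, 0, -10):
        print(db, periodicity_from_hnr_db(db))
```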

If you want to use the algorithm for analysing repetitive noises
(i.e. with different phase structure in each "period"),
you can take the following steps:
1. Square the signal, i.e. multiply it by itself,
   e.g. by using the formula "self*self" or "self^2" in Praat.
   Before you do that, however, make sure that the signal is band-limited
   to half the Nyquist frequency, i.e. one quarter of the sample rate,
   because squaring doubles the bandwidth and would otherwise cause aliasing!
2. Smooth the squared signal by convolving it with a Gaussian window
   (or multiply by a zero-centered Gaussian in the frequency domain,
    i.e. "To Spectrum", "Formula... self*exp(-(x/50)^2)", "To Sound").
3. Do the analysis ("To Harmonicity"). The result is an HNR as
   a function of time. You can draw or query it.
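
Outside Praat, steps 1 and 2 above can be sketched in Python roughly as
follows. This is a minimal illustration under several assumptions: the
input is taken to be band-limited already, the smoothing is done as a
time-domain convolution rather than in the frequency domain, and the
function names and the sigma_samples parameter are invented for the
example. The second function is a plain autocorrelation check, standing
in for the full harmonicity analysis of step 3.

```python
import math

def smoothed_power_envelope(signal, sigma_samples):
    """Square the signal (step 1) and smooth it with a Gaussian (step 2).

    The signal is assumed to be band-limited to a quarter of the sample
    rate beforehand. Samples beyond the edges are treated as zero.
    """
    squared = [s * s for s in signal]
    half = int(4 * sigma_samples)
    kernel = [math.exp(-0.5 * (k / sigma_samples) ** 2)
              for k in range(-half, half + 1)]
    total = sum(kernel)
    kernel = [k / total for k in kernel]  # unit area, so the power scale is kept
    n = len(squared)
    envelope = []
    for i in range(n):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < n:
                acc += weight * squared[idx]
        envelope.append(acc)
    return envelope

def normalised_autocorr_at(x, lag):
    """Autocorrelation of the mean-removed sequence at one lag, relative to lag 0."""
    mean = sum(x) / len(x)
    centred = [v - mean for v in x]
    r0 = sum(v * v for v in centred)
    return sum(centred[i] * centred[i + lag]
               for i in range(len(centred) - lag)) / r0
```

For noise bursts that repeat at a fixed interval, the envelope should
correlate strongly with itself at a lag of one interval, even though the
raw waveform does not. If x in the Praat formula is in Hz, multiplying the
spectrum by exp(-(x/50)^2) corresponds to a time-domain Gaussian with a
standard deviation of about 4.5 ms, which gives a starting point for
choosing sigma_samples.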

Best wishes,
   Paul
--

Paul Boersma
Institute of Phonetic Sciences, University of Amsterdam
Herengracht 338, 1016CG Amsterdam, The Netherlands
http://www.fon.hum.uva.nl/paul/
