
Re: Intermediate representation for music analysis

To: AUDITORY@xxxxxxxxxxxxxxx
Subject: Re: Intermediate representation for music analysis
From: Hugh McDERMOTT <hughm@xxxxxxxxxxxxxx>
Date: Tue, 18 Jul 2006 17:18:38 +1000
Delivery-date: Tue Jul 18 03:31:21 2006
In-reply-to: A<44BB51DE.9058.2F652D@localhost>
Reply-to: Hugh McDERMOTT <hughm@xxxxxxxxxxxxxx>
Sender: AUDITORY Research in Auditory Perception <AUDITORY@xxxxxxxxxxxxxxx>
Thread-topic: Intermediate representation for music analysis

I would add to this that, using an FFT, it is quite easy to measure the
component frequencies of a complex signal with precision finer than the
bin spacing. One just needs to estimate the rate of change of the phase
of a component within a bin, for example from the phase difference
between successive overlapping frames. This technique, which has been
described in the context of the so-called phase vocoder algorithm, works
for each signal component that the FFT resolves into its own bin.
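
As a rough illustration of the phase-based estimate, here is a minimal
Python sketch. It is not taken from any particular phase vocoder
implementation; the function names, the Hann window, and the quarter-frame
hop are all assumptions made for the example. To stay dependency-free it
computes DFT bins directly, which gives the same bin values an FFT would.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def dft_bin(frame, k):
    """Value of DFT bin k for one frame (what an FFT would return for that bin)."""
    n = len(frame)
    return sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(frame))


def peak_bin(frame):
    """Index of the largest-magnitude bin strictly between DC and Nyquist."""
    n = len(frame)
    return max(range(1, n // 2), key=lambda k: abs(dft_bin(frame, k)))


def refine_frequency(signal, fs, n, hop, k):
    """Estimate the frequency (Hz) of the component in bin k.

    Two Hann-windowed frames of length n, offset by hop samples, are
    compared. The phase advance in bin k beyond what the bin-centre
    frequency would produce gives the offset from the bin centre.
    """
    w = hann(n)
    frame0 = [signal[i] * w[i] for i in range(n)]
    frame1 = [signal[i + hop] * w[i] for i in range(n)]
    measured = cmath.phase(dft_bin(frame1, k)) - cmath.phase(dft_bin(frame0, k))
    expected = 2.0 * math.pi * k * hop / n  # advance of a tone at the bin centre
    # Wrap the deviation into [-pi, pi) so it maps to a unique frequency offset.
    deviation = (measured - expected + math.pi) % (2.0 * math.pi) - math.pi
    return (k / n + deviation / (2.0 * math.pi * hop)) * fs


if __name__ == "__main__":
    fs, n, hop = 8000.0, 256, 64           # bin spacing is fs / n = 31.25 Hz
    tone = [math.sin(2.0 * math.pi * 1010.0 * i / fs) for i in range(n + hop)]
    k = peak_bin([s * w for s, w in zip(tone, hann(n))])
    print("peak bin centre:", k * fs / n, "Hz")
    print("phase-based estimate:", refine_frequency(tone, fs, n, hop, k), "Hz")
```

The estimate is only unambiguous while the true frequency lies within
n / (2 * hop) bins of the chosen bin centre, which is one reason for
using overlapping frames, and it assumes a single steady component
dominates that bin.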

Best regards,

Hugh McDermott, PhD
Principal Research Fellow
Department of Otolaryngology
The University of Melbourne
384 - 388 Albert Street,
East Melbourne.  3002
Australia.
Phone: +61 3 9929 8665
Fax: +61 3 9663 6086
E-mail: hughm@xxxxxxxxxxxxxx
Web page: http://www.medoto.unimelb.edu.au/people/mcdermoh/


-----Original Message-----
From: AUDITORY Research in Auditory Perception
[mailto:AUDITORY@xxxxxxxxxxxxxxx] On Behalf Of Bob Masta
Sent: Monday, 17 July 2006 11:01 PM
To: AUDITORY@xxxxxxxxxxxxxxx
Subject: Re: Intermediate representation for music analysis

Note that no matter what sort of analysis you do, the frequency
resolution is determined by the reciprocal of the analysis window
duration.  So if you want fine resolution for the low frequencies, you
need a long sample set, even if you only need much coarser resolution at
the high frequencies (due to the log nature of hearing).
So, why not just take a long FFT?  Even though they have linear
frequency spacing, FFTs have been heavily optimized for efficient
computation.  I wonder if it might be better using a conventional FFT
and lumping some upper bins together to form quasi-log bands, rather
than using a less-efficient log-spaced filter bank.
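
A minimal Python sketch of the bin-lumping idea, assuming a power
spectrum (one value per FFT bin from DC to Nyquist) has already been
computed. The function names and the bands_per_octave parameter are
illustrative choices, not part of any library.

```python
import math


def quasi_log_bands(fs, n_fft, f_low, bands_per_octave=3):
    """Group FFT bins into bands with edges spaced by 2**(1/bands_per_octave).

    Returns a list of (first_bin, last_bin) pairs, inclusive. Bands too
    narrow to contain any FFT bin are skipped.
    """
    df = fs / n_fft                         # linear bin spacing in Hz
    nyquist_bin = n_fft // 2
    bands = []
    i = 0
    while True:
        lo = f_low * 2.0 ** (i / bands_per_octave)
        hi = f_low * 2.0 ** ((i + 1) / bands_per_octave)
        if lo >= fs / 2.0:
            break
        first = math.ceil(lo / df)          # first bin at or above the lower edge
        last = min(math.ceil(hi / df) - 1, nyquist_bin)  # last bin below the upper edge
        if last >= first:
            bands.append((first, last))
        i += 1
    return bands


def band_power(power_spectrum, bands):
    """Sum the per-bin power inside each band."""
    return [sum(power_spectrum[a:b + 1]) for a, b in bands]


if __name__ == "__main__":
    # Octave-wide bands starting at 62.5 Hz, for a 1024-point FFT at 8 kHz.
    for first, last in quasi_log_bands(8000.0, 1024, 62.5, bands_per_octave=1):
        print(first, last, last - first + 1, "bins")
```

Each band an octave higher collects twice as many bins, so the upper
bands are averaged over many lines while the lowest bands rest on only
a few.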

There is one weakness to that approach, however: if you set the
overall FFT length so that the lowest band you want to handle is just
exactly matched by the lowest FFT spectral line width, then the next
spectral line will be at *twice* that frequency, a full octave higher,
so there will be no nice fractional-octave alignment.  If you really
need that, a log filter bank may be best.
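
To put numbers on the alignment problem, this small Python sketch
measures how far, in octaves, the nearest FFT line sits from a wanted
frequency. The 50 Hz line spacing and the third-octave targets are
assumed values chosen only for illustration.

```python
import math


def nearest_line_error(f_target, df):
    """Return (nearest FFT line in Hz, its offset from f_target in octaves).

    df is the FFT line spacing in Hz. Line 0 (DC) is excluded.
    """
    k = max(1, round(f_target / df))
    line = k * df
    return line, math.log2(line / f_target)


if __name__ == "__main__":
    df = 50.0                                # lowest line at 50 Hz
    for base in (50.0, 800.0):               # bottom octave, then four octaves up
        target = base * 2.0 ** (1.0 / 3.0)   # one third of an octave above base
        line, err = nearest_line_error(target, df)
        print(f"target {target:.1f} Hz, nearest line {line:.1f} Hz, "
              f"offset {err:+.3f} octave")
```

In the bottom octave the nearest line is a full third of an octave away
from the target, whereas four octaves higher the same linear spacing
puts a line within about a hundredth of an octave.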

However, the way I have seen this handled is to assume (hope?) that
there will be plenty of upper harmonics in the signal, many of which
will fall into regions of the FFT where the resolution (considered on an
octave basis) is much higher.  By looking at a few of these upper
harmonics, it is possible to work out the actual fundamental frequency
to similarly high resolution.
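
One way the harmonic trick could be sketched in Python is shown below.
This is an assumed method (a least-squares fit of f_h = h * f0), not
necessarily the one used in the systems referred to above, and it
presumes the harmonic numbers are already known and the signal is
exactly harmonic. The 10 Hz line spacing and 55.3 Hz fundamental are
made-up test values.

```python
def quantize_to_line(f, df):
    """Snap a frequency to the nearest FFT line, as a plain peak pick would."""
    return round(f / df) * df


def fundamental_from_harmonics(peaks):
    """Least-squares estimate of f0 from (harmonic_number, frequency_hz) pairs.

    Minimises the sum over h of (f_h - h * f0)**2, which gives
    f0 = sum(h * f_h) / sum(h * h).
    """
    num = sum(h * f for h, f in peaks)
    den = sum(h * h for h, _ in peaks)
    return num / den


if __name__ == "__main__":
    df = 10.0            # FFT line spacing in Hz
    true_f0 = 55.3
    # Peak frequencies of harmonics 8, 9 and 10, each known only to the nearest line.
    peaks = [(h, quantize_to_line(h * true_f0, df)) for h in (8, 9, 10)]
    print("direct reading of the fundamental:", quantize_to_line(true_f0, df), "Hz")
    print("estimate from upper harmonics:", fundamental_from_harmonics(peaks), "Hz")
```

Because each harmonic's quantization error is divided by its harmonic
number, the fundamental can be recovered to a small fraction of the
line spacing even though it falls between lines itself.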

Best regards,

Bob Masta

audioATdaqartaDOTcom
