Re: Intermediate representation for music analysis

Note that no matter what sort of analysis you do, the frequency
resolution is determined by the reciprocal of the analysis window
duration.  So if you want fine resolution for the low frequencies,
you need a long sample set, even if you only need much coarser
resolution at the high frequencies (due to the log nature of hearing).
So, why not just take a long FFT?  Even though they have linear
frequency spacing, FFTs have been heavily optimized for efficient
computation.  I wonder if it might be better to use a conventional
FFT and lump some of the upper bins together to form quasi-log bands,
rather than using a less-efficient log-spaced filter bank.
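
To make the bin-lumping idea concrete, here is a minimal Python sketch.
All of the names (samples_needed, log_band_edges, lump_bins) and the
numbers in the usage lines are my own illustrative assumptions, and the
power spectrum is just a flat made-up list standing in for real FFT
output.  It only shows the bookkeeping: pick the FFT length from the
resolution you need, then sum linear bins into fractional-octave bands.

```python
import math

def samples_needed(sample_rate_hz, resolution_hz):
    """FFT length whose line spacing (sample_rate / N) is no coarser than resolution_hz."""
    return math.ceil(sample_rate_hz / resolution_hz)

def log_band_edges(f_low_hz, f_high_hz, bands_per_octave):
    """Band edges spaced 1/bands_per_octave octave apart, starting at f_low_hz."""
    n_bands = int(math.floor(bands_per_octave * math.log2(f_high_hz / f_low_hz) + 1e-9))
    return [f_low_hz * 2.0 ** (k / bands_per_octave) for k in range(n_bands + 1)]

def lump_bins(power, sample_rate_hz, n_fft, edges_hz):
    """Sum the power of each linear FFT bin into the log band that contains it.

    power[k] is assumed to be the power of the bin centred at k * sample_rate / n_fft.
    Returns (band_power, bins_per_band).
    """
    bin_hz = sample_rate_hz / n_fft
    band_power = [0.0] * (len(edges_hz) - 1)
    bins_per_band = [0] * (len(edges_hz) - 1)
    for k, p in enumerate(power):
        f = k * bin_hz
        for b in range(len(band_power)):
            if edges_hz[b] <= f < edges_hz[b + 1]:
                band_power[b] += p
                bins_per_band[b] += 1
                break
    return band_power, bins_per_band

# Hypothetical usage: 100 Hz line spacing, octave-wide bands from 100 to 800 Hz.
edges = log_band_edges(100.0, 800.0, 1)
flat_power = [1.0] * 9            # bins at 0, 100, ..., 800 Hz
print(lump_bins(flat_power, 1600.0, 16, edges))
```

The upper bands collect more and more bins, while the lowest band gets
only a single line.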

There is one weakness to that approach, however: if you set the
overall FFT length so that the lowest band you want to handle is
just exactly matched by the lowest FFT spectral line, then the
next spectral line will be at *twice* that frequency, a full
octave up... there will be no nice fractional-octave alignment.
If you really need that, a log filter bank may be best.
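
A small Python sketch of that misalignment, under my own assumptions
(the function names are made up, and "third-octave" is just an example
band width): FFT line k sits log2(k) octaves above line 1, so counting
lines per fractional-octave band shows the gaps at the bottom.

```python
import math

def octaves_above_lowest(k):
    """Position of FFT line k (k >= 1), in octaves above line 1."""
    return math.log2(k)

def lines_per_band(n_lines, bands_per_octave):
    """Count how many of FFT lines 1..n_lines land in each fractional-octave band.

    Band 0 starts at line 1; each band is 1/bands_per_octave octave wide.
    """
    top_band = int(math.floor(bands_per_octave * math.log2(n_lines)))
    counts = [0] * (top_band + 1)
    for k in range(1, n_lines + 1):
        band = int(math.floor(bands_per_octave * math.log2(k)))
        counts[band] += 1
    return counts

# Hypothetical usage: first 8 FFT lines sorted into third-octave bands.
print(lines_per_band(8, 3))
```

Any zero in the returned list is a fractional-octave band with no FFT
line in it at all, and those cluster in the lowest octaves.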

However, the way I have seen this handled is to assume (hope?)
that there will be plenty of upper harmonics in the signal, many
of which will fall into regions of the FFT where the resolution
(considered on an octave basis) is much higher.  By looking at
a few of these upper harmonics, it was possible to figure out
what the actual fundamental frequency was to similarly-high resolution.
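
As a rough illustration of the harmonic trick (not the actual method
used in the systems I saw, just a sketch under stated assumptions):
suppose each harmonic peak is only known to the nearest FFT line, and
suppose the harmonic numbers are already known.  Dividing a peak at
harmonic h by h shrinks the quantization error by the same factor.
The names, the 10 Hz line spacing and the 52 Hz test tone are all
hypothetical.

```python
def quantize_to_bin(freq_hz, bin_hz):
    """Frequency of the FFT line nearest to freq_hz (simulates a coarse peak pick)."""
    return round(freq_hz / bin_hz) * bin_hz

def estimate_f0(peak_freqs_hz, harmonic_numbers):
    """Least-squares fit of f0 in the model peak ~= h * f0.

    Assumes the harmonic number h of each peak is already known.
    """
    num = sum(h * f for h, f in zip(harmonic_numbers, peak_freqs_hz))
    den = sum(h * h for h in harmonic_numbers)
    return num / den

# Hypothetical usage: 10 Hz line spacing, true fundamental at 52 Hz.
bin_hz = 10.0
true_f0 = 52.0
harmonics = [8, 9, 10]
peaks = [quantize_to_bin(h * true_f0, bin_hz) for h in harmonics]

direct = quantize_to_bin(true_f0, bin_hz)      # fundamental read straight off the FFT
from_harmonics = estimate_f0(peaks, harmonics)  # fundamental inferred from harmonics
print(direct, from_harmonics)
```

In a real signal you would also have to work out which harmonic each
peak is, and cope with missing or inharmonic partials; the sketch
skips all of that.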

Best regards,

Bob Masta