
Re: Acoustical similarity



There may be something of interest to you in Ville Pulkki's approach at
HUT.Fi. He algorithmically separates source and reflected sound to good
effect.
regards
ppl

>>> "Bruno L. Giordano" <bruno.giordano@xxxxxxxxxxxxxxx> 05/02/2007 15:45 >>>
Hello James,

I use features for the quantification of the acoustical correlates of 
behavioral data.

However, the very process of defining the features requires making
guesses (i.e., assumptions) about what is perceptually relevant.

While this problem might not be especially pressing for the researcher
working on simple, synthetic stimuli, it can become painful when the
perception of complex everyday sounds is of interest. Indeed, given the
informational richness of the latter, it is possible that the
researcher, in the feature-definition process, does not capture all of
what is used by a listener.

Hence the idea of general metrics.

	Bruno



James McDermott wrote:
>> From:    "Bruno L. Giordano"
>>
>> I am looking for "general" metrics of the acoustical (not perceived)
>> similarity between mono signals independent of a features extraction
>> stage (e.g., peak level, harmonicity etc.).
>>
>> Ideally, this metric would operate on a low-level representation of
>> the signal (ideally the waveform).
>>
> 
> Hi Bruno,
> 
> I am doing work which involves measuring similarity for machine
> learning applications. One standard method (e.g. in evolutionary
> computation) is to take a mean square error over the magnitude or
> power spectrum: i.e. for two signals x and y of length N, window them,
> take the DFT of each window, and then take the magnitude of each
> bin, to produce two sequences of spectra, X_i and Y_i. The distance
> is then
> 
> d(x, y) = sum_i sum_j (X_i[j] - Y_i[j])^2
> 
> where i indexes the windows and j indexes the frequency bins (divide
> by the number of terms to get a true mean).
> 
> You can indeed define a purely time-domain distance measure:
> 
> d(x, y) = sum_n (x[n] - y[n])^2 / N
> 
> but it seems to be pretty useless: eg if we construct y by
> phase-inverting x, we get a very large distance between them, even
> though they sound exactly the same.
> 
> As you know, in other applications (such as automatic classification),
> the extraction of features is more common.
> 
> I'd be interested to hear more about your application and why you
> don't want to extract features?
> 
> James
> 
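
The two distances James describes can be sketched in a few lines of
Python. This is only an illustration, not code from the thread: the
function names, the Hann window, the window size and the hop size are
all assumptions, and the time-domain measure is taken to be a mean of
squared differences. The test signal is a sine and its phase-inverted
copy, which is the case James mentions.

```python
import cmath
import math


def magnitude_spectra(signal, window_size, hop):
    """Return DFT magnitudes of each Hann-windowed frame (hypothetical helper)."""
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / window_size)
            for n in range(window_size)]
    spectra = []
    for start in range(0, len(signal) - window_size + 1, hop):
        frame = [signal[start + n] * hann[n] for n in range(window_size)]
        # Naive O(N^2) DFT; keep only the non-redundant bins 0..N/2.
        mags = []
        for k in range(window_size // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / window_size)
                      for n in range(window_size))
            mags.append(abs(acc))
        spectra.append(mags)
    return spectra


def spectral_distance(x, y, window_size=64, hop=32):
    """Sum over windows i and bins j of (X_i[j] - Y_i[j])^2."""
    X = magnitude_spectra(x, window_size, hop)
    Y = magnitude_spectra(y, window_size, hop)
    return sum((a - b) ** 2
               for Xi, Yi in zip(X, Y)
               for a, b in zip(Xi, Yi))


def time_domain_distance(x, y):
    """Mean squared sample difference (assumes len(x) == len(y))."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


# Assumed test signal: 5 cycles of a sine over 256 samples, and its inverse.
x = [math.sin(2 * math.pi * 5 * n / 256) for n in range(256)]
y = [-v for v in x]

print("time-domain distance:", time_domain_distance(x, y))  # about 2.0
print("spectral distance:   ", spectral_distance(x, y))     # effectively zero
```

The magnitude spectrum discards phase, so the inverted copy is at
essentially zero spectral distance from the original, while the
time-domain measure reports four times the signal's mean power.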
