
Re: multidimensional scaling of timbre

One way to test for timbre dimensions beyond the commonly found
MDS correlates, spectral centroid and rise time, is to generate a
group of sounds with the same centroid and rise time and apply
MDS on these stimuli. 

This was originally done in the following two studies:

Lakatos, S. and Beauchamp, J. (2000). "Extended perceptual spaces 
for pitched and percussive timbres" (A), J. Acoust. Soc. Am., Vol. 
107, No. 5, Pt. 2, p. 2882.


Beauchamp, J. and Lakatos, S. (2002). "New Spectro-Temporal Measures 
of Musical Instrument Sounds Used for a Study of Timbral Similarity 
of Rise-Time- and Centroid-Normalized Musical Sounds", Proc. 7th Int. 
Conf. on Music Perception & Cognition, Univ. of New South Wales, 
Sydney, Australia, pp. 592-595.

However, the results, although interesting, were not very conclusive.

In 2006 some colleagues and I did an experiment on a different set
of tones (all pitched), and I presented the results at the fall 2006 
ASA meeting.

Both static and dynamic musical tones were generated. All tones
were equalized for pitch, loudness, duration, rise time, decay
time, and average spectral centroid. The dynamic tones (10 of
them) were derived from orchestral instrument tones. Even though
they were equalized for the parameters mentioned above, they
retained significant spectral variation (flux). The static tones
were derived from the dynamic ones but had no spectral variation
or amplitude variation (other than onset and offset ramps).

We used two different MDS programs: SPSS and MATLAB.

The spectral correlates we tested were: 1) ratio of even harmonic
rms amplitude to odd harmonic rms amplitude; 2) spectral irregularity;
3) spectral centroid variation; and 4) spectral incoherence.
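As a rough illustration of correlates 1) and 3), here is a minimal Python
sketch that works on harmonic amplitudes. The exact definitions used in the
study are not given in this message, so the formulas below are assumptions:
the even/odd ratio is taken as sqrt(sum of squared even-harmonic amplitudes)
over the same quantity for the odd harmonics (fundamental counted as odd),
and centroid variation is taken as the standard deviation of the per-frame
centroid divided by its mean. The function names are hypothetical.

```python
import numpy as np

def even_odd_ratio(amps):
    """Even-to-odd harmonic rms amplitude ratio (assumed definition).

    amps[k-1] is the amplitude of harmonic k; the fundamental (k = 1)
    is counted among the odd harmonics.
    """
    amps = np.asarray(amps, dtype=float)
    k = np.arange(1, amps.size + 1)
    even_rms = np.sqrt(np.sum(amps[k % 2 == 0] ** 2))
    odd_rms = np.sqrt(np.sum(amps[k % 2 == 1] ** 2))
    return even_rms / odd_rms

def spectral_centroid(amps):
    """Amplitude-weighted mean harmonic number of one spectrum."""
    amps = np.asarray(amps, dtype=float)
    k = np.arange(1, amps.size + 1)
    return np.sum(k * amps) / np.sum(amps)

def centroid_variation(frames):
    """Relative variation of the centroid over time (assumed definition).

    frames has shape (n_frames, n_harmonics); returns std / mean of the
    per-frame centroid, which is 0 for a static tone.
    """
    frames = np.asarray(frames, dtype=float)
    centroids = np.array([spectral_centroid(f) for f in frames])
    return np.std(centroids) / np.mean(centroids)

# Hypothetical example: a clarinet-like spectrum with weak even harmonics.
clarinet_like = [1.0, 0.05, 0.5, 0.05, 0.3, 0.02]
ratio = even_odd_ratio(clarinet_like)
```

A tone with only odd harmonics gives a ratio of 0, and a static tone (all
frames identical) gives a centroid variation of 0, which matches the way
the static tones are described above.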

The results we found were:

1) MATLAB and SPSS gave similar results in terms of the placement of
   the instruments in 2D and 3D MDS spaces.
2) Even/odd ratio correlated consistently well with a dimension of
   2D and 3D solutions (0.68 <= R <= .82).
3) Spectral irregularity correlated moderately well (.69 <= R <= .82)
   except for the dynamic tone 2D solutions (.39 <= R <= .40).
4) For dynamic tones, spectral centroid variation correlated better 
   (.68 <= R <= .83) than spectral incoherence (.53 <= R <= .83).
5) The 2D solution was adequate for the static tones but a 3D solution
   was necessary for the dynamic tones.
6) Instrument clusters resulting from x-y plots of the first two 
   components of a PCA solution agree well with the 2D MDS results, and 
   the results for static and dynamic tones are nearly identical.
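
For readers who want to try this kind of analysis, the general procedure
(fit an MDS space to a dissimilarity matrix, then correlate each dimension
with an acoustic measure) can be sketched in Python as follows. This is not
the procedure we ran: it uses classical (Torgerson) metric MDS written
directly in NumPy rather than the SPSS or MATLAB routines, and the function
names and the toy data are hypothetical.

```python
import numpy as np

def classical_mds(dissim, n_dims=2):
    """Classical (Torgerson) MDS of a symmetric dissimilarity matrix."""
    d = np.asarray(dissim, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-center the squared dissimilarities to get a Gram matrix.
    gram = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:n_dims]
    # Clip small negative eigenvalues caused by noise or non-Euclidean data.
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale

def dimension_correlations(coords, feature):
    """Pearson R between each MDS dimension and one acoustic measure."""
    feature = np.asarray(feature, dtype=float)
    return np.array([np.corrcoef(coords[:, i], feature)[0, 1]
                     for i in range(coords.shape[1])])

# Toy data (hypothetical): five "tones" placed in a known 2D space.
true_xy = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 1.0],
                    [4.0, 1.0], [2.0, 0.5]])
dissim = np.linalg.norm(true_xy[:, None, :] - true_xy[None, :, :], axis=2)

coords = classical_mds(dissim, n_dims=2)
r = dimension_correlations(coords, true_xy[:, 0])
```

The sign of each recovered dimension is arbitrary, so it is the magnitude
of R that matters when comparing a dimension with a spectral measure.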

So far the paper in question exists only as an abstract:

Beauchamp, J. W.; Horner, A. B.; Koehn, H.-F.; Bay, M. (2006). 
"Multidimensional scaling analysis of centroid- and attack/decay-
normalized musical instrument sounds" (A) J. Acoust. Soc. Am., 
Vol. 120, No. 5, Pt. 2, p. 3276.

although I can supply a PDF of the PowerPoint talk if anyone is
interested.


James W. Beauchamp                                                
Professor Emeritus of Music and Electrical & Computer Engineering
University of Illinois at Urbana-Champaign
email: jwbeauch@xxxxxxxx (also: jwbeauch@xxxxxxxxxxxxxxxxxxxxxx)
phone: +1-217-344-3307 (also: 217-333-3691)
WWW:  http://ems.music.uiuc.edu/beaucham

Original message:
>From: Paul Iverson <p.iverson@xxxxxxxxx>
>Date: Mon, 20 Oct 2008 10:26:54 +0100
>To: AUDITORY@xxxxxxxxxxxxxxx
>Subject: Re: [AUDITORY] multidimensional scaling of timbre
>I don't know of a paper on this topic, but here are some impressions.
>It is clear that people change their attention to dimensions depending  
>on the set. For example, if there are large pitch variations among the  
>stimuli, listeners' ratings are dominated by that dimension,  whereas  
>they attend more specifically to timbre dimensions if that pitch  
>variation is removed. On the face of it, it thus seems very plausible  
>that listeners can only attend to a small number of dimensions at a  
>time. It is certainly the case that higher dimensions in MDS solutions  
>become progressively less interpretable, which suggests that they may  
>just be modeling noise.
>That being said, it is hard to say whether this reveals an attentional  
>limitation or is a measurement issue with rating scales and the MDS  
>procedure. For example, I find that there is always much more  
>unaccounted variance when I am using real recordings than when I am  
>using a set (usually speech stimuli) that have been synthesized to  
>vary on only a small number of dimensions. For example, a set that has  
>been synthesized to be two dimensional will fit into a two-dimensional  
>solution far better than a set of natural recordings will fit into a  
>two-dimensional solution. I think that this indicates that listeners  
>are indeed sensitive to higher dimensions in the natural stimuli; the  
>unaccounted variance in such MDS experiments is not just noise.  
>However, the relative contribution of those dimensions begins to be  
>small enough to merge with the level of noise in the data, such that  
>they can no longer be modeled very well by MDS. That is, there is  
>usually enough gain to measure only a few of the most influential  
>dimensions that drove the rating-scale judgements.
>Best regards,
>Paul Iverson, Ph.D.
>UCL Division of Psychology & Language Sciences
>Chandler House
>2 Wakefield Street
>London WC1N 1PF