
Re: [AUDITORY] Measuring perceptual similarity



Dear Pat,

In addition to what was pointed out by Daniel, I personally favour pairwise dissimilarity ratings over sorting unless the number of stimuli is so large that a full dissimilarity matrix cannot be collected in one humane experimental session (<= ~40 stimuli). As you might have read, dissimilarity ratings produce estimates of the distances that are much more reliable (because of the larger number of stimulus playbacks involved, I suppose), much less distorted (cf. the binarization of dissimilarities in free sorting and the skewed distribution of dissimilarities from hierarchical sorting), and much more indicative of stimulus features than the more efficient sorting methods.
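As a rough back-of-the-envelope check of that session limit (a sketch only; the ~6 s per trial figure is my assumption, not a measured value):

    # Number of trials and approximate session time for a full
    # pairwise dissimilarity matrix over n stimuli.
    def pairwise_session_cost(n_stimuli, seconds_per_trial=6.0):
        n_pairs = n_stimuli * (n_stimuli - 1) // 2   # all unordered pairs
        minutes = n_pairs * seconds_per_trial / 60.0
        return n_pairs, minutes

    for n in (20, 30, 40, 60):
        pairs, minutes = pairwise_session_cost(n)
        print(f"{n:3d} stimuli -> {pairs:4d} pairs, ~{minutes:.0f} min")

With 40 stimuli you already need 780 ratings (roughly 80 minutes at that pace), which is about where a single humane session ends.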

Alternative methods come to mind that rely on placing the stimuli in a visual space and take the inter-stimulus distances as estimates of the perceptual dissimilarities (e.g., Harbke, 2003; Kriegeskorte and Mur, 2012). Importantly and unsurprisingly, these "direct MDS" methods bias the perceptual space towards a 2D representation (see Harbke, 2003) and are for this reason a suboptimal choice for the discovery of perceptually relevant stimulus features.
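For concreteness, a minimal sketch of how such arrangement data are reduced to dissimilarities (the screen coordinates and variable names below are invented):

    import numpy as np
    from scipy.spatial.distance import pdist, squareform

    # (x, y) positions at which a listener placed five stimuli on the screen
    positions = np.array([[0.1, 0.2],
                          [0.4, 0.1],
                          [0.9, 0.8],
                          [0.2, 0.7],
                          [0.5, 0.5]])

    # inter-stimulus screen distances taken as dissimilarity estimates;
    # note that only 2D structure can be expressed this way
    dissim = squareform(pdist(positions, metric='euclidean'))
    print(np.round(dissim, 2))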

In short, there is no free lunch in the behavioural estimation of distances: if your goal is accuracy, methods that are less efficient in terms of time allocation are, in my opinion, still the best option.

Best,

     Bruno

@MastersThesis{harbke2003evaluation,
  author = {C. R. Harbke},
  title  = {Evaluation of data collection techniques for multidimensional scaling with large stimulus sets},
  school = {Washington State University, Department of Psychology},
  year   = {2003},
}


@article{kriegeskorte2012inverse,
  title={Inverse MDS: Inferring dissimilarity structure from multiple item arrangements},
  author={Kriegeskorte, Nikolaus and Mur, Marieke},
  journal={Frontiers in Psychology},
  volume={3},
  pages={245},
  year={2012},
  publisher={Frontiers}
}


On 20 March 2018 at 12:29, Oberfeld-Twistel, Daniel <oberfeld@xxxxxxxxxxxx> wrote:

Thanks for sharing the references!

 

In my view, MUSHRA cannot be recommended for studying musical similarity.

 

The method is designed to identify differences between stimuli on a defined dimension (which is audio quality in the MUSHRA recommendation, although this rating method could also be used for evaluating other perceptual dimensions).

 

In the MUSHRA method, however, listeners are NOT asked to rate the similarity of the stimuli. While in principle information about similarity could be deduced indirectly from the ratings obtained with MUSHRA (similar mean ratings = high similarity), this would require that you can specify the perceptual dimension on which your stimuli differ or are similar (say, rhythm, tempo, consonance/dissonance, mood, etc.).
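To make that indirect route concrete, a toy sketch (the ratings are invented, and the result only reflects the one specified dimension):

    import numpy as np

    # mean MUSHRA-style ratings of four stimuli on a single, pre-specified
    # dimension (e.g., perceived tempo); values are made up
    mean_ratings = np.array([85.0, 80.0, 40.0, 35.0])

    # similar means -> small "dissimilarity", but only on that dimension
    dissim = np.abs(mean_ratings[:, None] - mean_ratings[None, :])
    print(dissim)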

 

If that is not possible, the other approaches that were suggested, like triadic tests or MDS, can be used *without* having to specify which exact dimension the similarity judgments should refer to, and can identify structures in the (dis)similarity ratings.
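As an illustration, a minimal sketch of feeding a full dissimilarity matrix to non-metric MDS without naming any dimension in advance (scikit-learn; the matrix is fabricated):

    import numpy as np
    from sklearn.manifold import MDS

    # symmetric dissimilarity matrix collected from listeners (fabricated)
    dissim = np.array([[0., 1., 4., 5.],
                       [1., 0., 3., 4.],
                       [4., 3., 0., 1.],
                       [5., 4., 1., 0.]])

    mds = MDS(n_components=2, metric=False,
              dissimilarity='precomputed', random_state=0)
    coords = mds.fit_transform(dissim)   # stimulus positions in a 2D space
    print(np.round(coords, 2))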

 

In addition, I could imagine that the MUSHRA concepts of a high-quality “reference” and a low-quality “anchor” do not easily apply to the experiments you have in mind.

 

Best

 

Daniel

 

---------------------------------

Dr. Daniel Oberfeld-Twistel

Associate Professor

Johannes Gutenberg - Universitaet Mainz

Institute of Psychology

Experimental Psychology

Wallstrasse 3

55122 Mainz

Germany

 

Phone ++49 (0) 6131 39 39274

Fax   ++49 (0) 6131 39 39268

http://www.staff.uni-mainz.de/oberfeld/

https://www.facebook.com/WahrnehmungUndPsychophysikUniMainz

 

From: AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxxxxxxxxx> On Behalf Of Pat Savage
Sent: Tuesday, March 20, 2018 6:19 AM
To: AUDITORY@xxxxxxxxxxxxxxx
Subject: Re: Measuring perceptual similarity

 

Dear list,

 

Thanks very much for all of your responses. I’m summarizing below all the reference recommendations I received. 

 

I still want to read some of these more fully, but so far my impression is that the Giordano et al. (2011) paper gives a good review of the benefits and drawbacks of previous methods; since it was published, however, MUSHRA seems to have become the standard method for these types of subjective perceptual similarity ratings.

 

Please let me know if I seem to be misunderstanding anything here.

 

Cheers,

Pat 

--

Flexer, A., & Grill, T. (2016). The Problem of Limited Inter-rater Agreement in Modelling Music Similarity. Journal of New Music Research45(3), 1–13.

 

Susini, P., McAdams, S., & Winsberg, S. (1999). A multidimensional technique for sound quality assessment. Acta Acustica united with Acustica, 85, 650–656.

 

Novello, A., McKinney, M. F., & Kohlrausch, A. (2006). Perceptual evaluation of music similarity. In Proceedings of the 7th International Conference on Music Information Retrieval. Retrieved from http://ismir2006.ismir.net/PAPERS/ISMIR06148_Paper.pdf

Michaud, P. Y., Meunier, S., Herzog, P., Lavandier, M., & D’Aubigny, G. D. (2013). Perceptual evaluation of dissimilarity between auditory stimuli: An alternative to the paired comparison. Acta Acustica united with Acustica, 99(5), 806–815.

 

Wolff, D., & Weyde, T. (2011). Adapting Metrics for Music Similarity Using Comparative Ratings. 12th International Society for Music Information Retrieval Conference (ISMIR’11), Proc., (Ismir), 73–78. 

 

Giordano, B. L., Guastavino, C., Murphy, E., Ogg, M., Smith, B. K., & McAdams, S. (2011). Comparison of methods for collecting and modeling dissimilarity data: Applications to complex sound stimuli. Multivariate Behavioral Research, 46, 779–811.

 


Collett, E., Marx, M., Gaillard, P., Roby, B., Fraysse, B., & Deguine, O. (2016). Categorization of common sounds by cochlear implanted and normal hearing adults. Hearing Research, 335, 207–219.

 

International Telecommunication Union. (2015). Recommendation ITU-R BS.1534-3: Method for the subjective assessment of intermediate quality level of audio systems. Retrieved from https://www.itu.int/dms_pubrec/itu-r/rec/bs/R-REC-BS.1534-3-201510-I!!PDF-E.pdf

 

Lavandier, M., Meunier, S., & Herzog, P. (2008). Identification of some perceptual dimensions underlying loudspeaker dissimilarities. Journal of the Acoustical Society of America, 123(6), 4186–4198.

 

Dzhafarov, E. N., & Colonius, H. (2006). Reconstructing distances among objects from their discriminability. Psychometrika, 71(2), 365–386.

 

Software: 

 

Various:

 

MUSHRA:

 

Free-sorting:

---
Dr. Patrick Savage
Project Associate Professor

Faculty of Environment and Information Studies

Keio University SFC (Shonan Fujisawa Campus)
http://PatrickESavage.com




--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Bruno L. Giordano, PhD – CR1
Institut de Neurosciences de la Timone (INT)
UMR 7289, CNRS and Aix Marseille Université
http://www.int.univ-amu.fr/
Campus santé Timone
27, boulevard Jean Moulin
13385 Marseille cedex 5
Www: http://www.brunolgiordano.net