Re: identification test procedure (Pierre Divenyi)


Subject: Re: identification test procedure
From:    Pierre Divenyi  <pdivenyi(at)MARVA4.NCSC.MED.VA.GOV>
Date:    Fri, 9 Oct 1998 10:30:33 -0700

At 10:43 AM 10/9/98 -0500, James W. Beauchamp wrote:
>Folks,
>
>A colleague of mine is using a particular procedure to test how well
>synthetic piano tones resemble "original" piano tones. The "original" tones
>are actually analyzed and resynthesized using a phase vocoder method (in
>order to eliminate background noise) and the synthetic tones are
>resynthesized with a fair amount of data reduction. Pitches between E1 and
>D7 of equal numbers of original and synthetic tones are presented in random
>order to the listener, and the listener's task is to choose which tones are
>synthetic. My colleague's contention is that if the listener were to simply
>guess, the score would be 50%, and that any score less than 50% indicates
>poor ability to distinguish, thus proving the efficacy of the data
>reduction method. Indeed, the scores ranged from 15% to 45%.
>
>My question is -- Does this really work? If the listener tries to maximize
>his/her score, then he/she might tend to guess, moving the score towards
>50%. On the other hand, if the subject is truly honest and he/she cannot
>distinguish, he/she would choose no tones and would score 0%. I guess it
>would depend on how the subjects are instructed to make their decisions,
>but then can you trust them to always follow instructions? Shouldn't the
>false positives also be reported?
>
>Is there a way to properly evaluate the results of this test?
>
>What is the best paradigm to test the efficacy of a data reduction method?
>
>Jim Beauchamp
>University of Illinois at Urbana-Champaign
>j-beauch(at)uiuc.edu
>
>McGill is running a new version of LISTSERV (1.8d on Windows NT).
>Information is available on the WEB at http://www.mcgill.ca/cc/listserv
>

Jim,

The biggest problem with this method is that it does not distinguish
between discrimination and response bias. Try to use a confidence
rating-scale ROC method (described at length in the Green and Swets
Signal Detection Theory and Psychophysics book [1966], among other
places). Actually, you are trying to evaluate the null hypothesis that
the d' for discriminating the two pianos will be negative, which is
considered hairy by many psychophysicists. (A negative d' is as good an
indicator of discriminability as the positive, except that the listener
is standing on his/her head.) Why don't you just tell them that they
will hear two pianos, A and B, and ask them later whether they thought
A or B to be the "true" instrument? Psychophysically speaking, this
would lead to a cleaner experiment.

Pierre

****************************************************************************
Pierre Divenyi          Experimental Audiology Research (151)
                        V.A. Medical Center, Martinez, CA 94553, USA
Phone:  (925) 370-6745
Fax:    (925) 228-5738
E-mail: pdivenyi(at)marva4.ebire.org
****************************************************************************
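
The reply's point about separating discrimination from response bias can
be illustrated with a minimal sketch. It is not taken from the thread: the
function names, the trial counts, and the choice of a log-linear (add 0.5)
correction are assumptions made for illustration, and the model is the
standard equal-variance Gaussian one. A "hit" here means a synthetic tone
labelled synthetic; a "false alarm" means an original tone labelled
synthetic.

```python
from statistics import NormalDist

# Inverse of the standard normal CDF (the "z-transform" of a proportion).
_z = NormalDist().inv_cdf


def sdt_indices(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) for a yes/no task.

    Assumes the equal-variance Gaussian model. A log-linear correction
    (add 0.5 to each cell) keeps the rates away from 0 and 1, where the
    z-transform is undefined; this correction is an illustrative choice.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    d_prime = _z(hit_rate) - _z(fa_rate)              # discriminability
    criterion = -0.5 * (_z(hit_rate) + _z(fa_rate))   # response bias
    return d_prime, criterion


def rating_roc(signal_counts, noise_counts):
    """Return cumulative (fa_rate, hit_rate) ROC points from rating counts.

    Both lists hold response counts per confidence category, ordered from
    "surely synthetic" to "surely original". Each category boundary gives
    one ROC point; the trivial final point (1, 1) is omitted.
    """
    n_signal, n_noise = sum(signal_counts), sum(noise_counts)
    points, cum_s, cum_n = [], 0, 0
    for s, n in zip(signal_counts[:-1], noise_counts[:-1]):
        cum_s += s
        cum_n += n
        points.append((cum_n / n_noise, cum_s / n_signal))
    return points


# Hypothetical listeners, 50 synthetic and 50 original tones each.
# Both label 30% of the synthetic tones as synthetic.
listener_a = sdt_indices(15, 35, 15, 35)  # also labels 30% of originals
listener_b = sdt_indices(15, 35, 5, 45)   # labels only 10% of originals

print("A: d' = %.2f, c = %.2f" % listener_a)
print("B: d' = %.2f, c = %.2f" % listener_b)
print(rating_roc([10, 10, 10, 20], [5, 5, 10, 30]))
```

In this sketch both hypothetical listeners would report the same 30%
score, yet only the second one discriminates the tones at all; the first
is simply reluctant to answer "synthetic". That is why the false-alarm
rate has to be reported alongside the hits, and a rating-scale ROC gives
several such (false alarm, hit) points per listener instead of one.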


This message came from the mail archive
http://www.auditory.org/postings/1998/
maintained by:
Dan Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University