Re: identification test procedure (William L Martens)

Subject: Re: identification test procedure
From:    William L Martens  <wlm(at)U-AIZU.AC.JP>
Date:    Sun, 11 Oct 1998 19:27:50 +0900

To Jim Beauchamp:

I do that kind of experiment quite often, and I always use receiver operating characteristic (ROC) analysis. If you need Matlab scripts or C programs that do such analyses (separating sensitivity and bias), I would be happy to send them to you.

Pierre Divenyi wrote:
> The biggest problem with this method is that it does not distinguish
> between discrimination and response bias.

Yes, this is the crux of the matter. However, I do not agree that changing from the identification task to the discrimination task is a good answer, unless you want to know about discrimination rather than identification. In musical applications of synthetic timbre, you are rarely interested in whether the listener can discriminate between a real and a synthetic tone; rather, you would like to know whether they will identify what they are hearing as a bona fide piano or not.

Using the confidence rating-scale ROC method is good, but maybe not necessary. You can reduce the problem with correct guesses by including more tones in a given trial (though this begins to feel like discrimination rather than identification!). Five tones would reduce the guessing rate to 20%.

Green and Swets (1966), Signal Detection Theory and Psychophysics, is a good introduction, but more practical examples of SDT application can be found in Swets (1964). For example, the chapter by Clark (see ref below) seems helpful for the piano tone study: Two tones followed by a judgment of which is most realistic might work, but you could glean more info from each trial if you played five and then asked them to give their second choice as well. If you do want to include more than two intervals in your identification test, you might play five tones and ask the listener to identify which of the five sounded most synthetic and which sounded most realistic.

Clark, F. R. (1964) Confidence Ratings, Second-Choice Responses, and Confusion Matrices in Intelligibility Tests. In: J. A. Swets (Ed.), Signal Detection and Recognition by Human Observers (pp. 620-648). New York: John Wiley & Sons.

Pierre Divenyi also wrote:
> Actually, you are trying to evaluate the null hypothesis

I agree also that it is somewhat problematic when proving the null hypothesis is the proof of the success of your data reduction scheme. The problem is one of how good people are at discriminating subtle differences like those between different types of pianos. So if your synthetic piano tone is close to one of five different pianos that you could let them hear, would it satisfy you to know that it is as good a candidate for the concept "piano" as any of the others? Why should you regard the failure to identify the difference between one real and one synthetic set of tones as the primary indicator of success?

I think that the discussion of this issue is quite useful, especially if you have practical applications in mind.

For your information: Swets (1964) and Green and Swets (1966) are available for relatively quick delivery:

"Signal Detection and Recognition by Human Observers"
John A. Swets; Hardcover; @ $54.95 each (Usually available in 4-6 weeks)

"Signal Detection Theory and Psychophysics"
David M. Green, John A. Swets; Hardcover; @ $54.95 each (Usually available in 4-6 weeks)

Unfortunately, another good one may not be so easy:

"Detection Theory: A User's Guide"
Neil A. Macmillan, C. Douglas Creelman; price currently unknown (Out of print; availability varies)

Regards,

--
William L. Martens, Ph.D.
Human Interface Lab
University of Aizu
Aizu-Wakamatsu 965-8580, Japan
EMAIL: wlm(at)
TEL: [+81](242)37-2762
FAX: [+81](242)37-2549
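
P.S. For anyone who wants to see what "separating sensitivity and bias" amounts to in the simplest yes/no case, here is a minimal Python sketch. It is not one of the Matlab or C programs offered above, and it covers only a single operating point rather than a full rating-scale ROC. It assumes the equal-variance Gaussian model, treats "real piano" as the "yes" response, and applies a log-linear correction (adding 0.5 to each cell) so that rates of exactly 0 or 1 do not blow up. The function names and the correction are my own choices for illustration.

```python
from statistics import NormalDist


def dprime_and_criterion(hits, misses, false_alarms, correct_rejections):
    """Return (d', c) for yes/no counts under equal-variance Gaussian SDT.

    hits: real tones judged "real"; false_alarms: synthetic tones judged "real".
    """
    # Assumed log-linear correction: keeps both rates strictly inside (0, 1).
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)

    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)            # sensitivity
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # response bias
    return d_prime, criterion


def chance_rate(n_alternatives):
    """Probability of a correct pure guess among n equally likely alternatives."""
    return 1.0 / n_alternatives


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(dprime_and_criterion(40, 10, 10, 40))
    print(chance_rate(5))
```

A listener who answers "real" on every trial ends up with d' of zero and a negative (liberal) criterion, which is exactly the case that percent correct alone cannot tell apart from genuine identification. The second function is just the guessing-rate arithmetic: five alternatives give 1/5, i.e. 20%.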

This message came from the mail archive
maintained by:
DAn Ellis
Electrical Engineering Dept., Columbia University