
Back to the piano ...


I want to thank everyone who responded to my original post; your comments
were most useful.

After reviewing my friend's description of his project, I have to report
that I misunderstood one aspect of how he calculated his reported data.

To review, N original piano tones performed at various pitches together
with N synthetic tones, which were "imitations" of the original tones, were
arranged in random order and presented to each subject. On each trial the
subject had to report whether he believed the sound to be an original tone
or a synthetic tone (no uncertain choice allowed). Then the percentages of
correctly identified synthetics ("hits") and incorrectly categorized
originals ("false alarms") were tabulated. What was reported was the
difference between these two scores, called the "discrimination score".
(The idea was to "penalize" the subject for guessing; I think this is an
old trick, frequently used on multiple choice exams.)

If H is the percentage of hits and F is the percentage of false alarms,
then the discrimination score would be

       D = H - F

If the subject is guessing with a consistent bias, theoretically D = 0.

If the subject perfectly discriminates between the original and synthetic
tones (or does perfect categorization), we have H = 100% and F = 0%, so
the discrimination score is D = 100%.

Otherwise, for partial discrimination, we would ordinarily have
0 < D = H - F < 100%.
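
As a minimal sketch of the scoring described above, the calculation could be
written in Python as follows. The function name and the trial counts are
hypothetical, chosen only for illustration; they are not taken from the
study.

```python
def discrimination_score(hits, n_synthetic, false_alarms, n_original):
    """Return (H, F, D) in percent.

    hits         -- synthetic tones correctly called "synthetic"
    false_alarms -- original tones incorrectly called "synthetic"
    All counts here are hypothetical illustration values.
    """
    if n_synthetic <= 0 or n_original <= 0:
        raise ValueError("each stimulus class needs at least one trial")
    h = 100.0 * hits / n_synthetic          # percent hits
    f = 100.0 * false_alarms / n_original   # percent false alarms
    return h, f, h - f                      # D = H - F


# Perfect discrimination: every synthetic caught, no original misjudged.
print(discrimination_score(20, 20, 0, 20))    # (100.0, 0.0, 100.0)

# Biased guesser: answers "synthetic" on 70% of trials regardless of stimulus.
print(discrimination_score(14, 20, 14, 20))   # (70.0, 70.0, 0.0)
```

The second call illustrates the point about guessing: a consistent bias
raises H and F by the same amount, so the difference D stays at 0.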

In my friend's case the results were 15% < D < 45%, depending on the pitch
of the tones. (My own belief is that this dependence is due to the complexities
of the tones rather than some pitch-related perceptual phenomenon. Indeed,
the higher notes, having fewer harmonics, had the lowest scores, whereas
the lower notes had the highest scores.)

What threw me off was my friend's assertion that D = 50% could be taken as
the "indistinguishability threshold", and since the results were below that
threshold, the synthetic tones were therefore considered to be
"indistinguishable" from the originals. However, it seems to me that D = 50%
is rather arbitrary, and my friend's scores do in fact indicate that some
degree of discrimination is going on. Would you agree?

Now, getting back to the comments on my previous post, which are still
germane: A lot of people proposed using ROC curves. For example,
Timothy Justus remarked:

>By knowing both the hit rate and the false alarm rate (or the miss and
>correct rejection rates, since these are 1 minus the other two,
>respectively) you can plot a receiver operating characteristic (ROC) which
>can separate effects of sensitivity and decision criterion. The probability
>of hits is plotted against the probability of false alarms. A point falling
>near the diagonal would represent chance performance - they can't tell the
>difference - whereas points away from this line show discrimination.

I assume that the different points on the curve must come from data points
generated by different subjects, who would be assumed to have different
biases. Is that correct?

Anyway, doesn't presenting the D scores (as I defined them above) accomplish
the same thing? If a point is above the diagonal, then D = H - F > 0.
The advantage of the D scores is that they are probably easy for the typical
reader of a sound synthesis article to understand.

Another way to present the data would be percent correct, P. In this case,
with H, F, and D expressed as proportions rather than percentages,

      P = .5*(H + (1-F)) = .5 + .5*D

So we see that this is just a rescaling of the D data where, in this case,
100% corresponds to perfect discrimination (or categorization) and 50% is
the guessing threshold.
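
The rescaling can be checked numerically with a short Python sketch. It
assumes H, F, and D are given as proportions between 0 and 1 and that there
are equal numbers of original and synthetic trials (N of each, as in the
experiment described above). The function names are hypothetical.

```python
def percent_correct(h, f):
    """Proportion correct from hit rate h and false alarm rate f.

    Assumes h and f are proportions in [0, 1] and that original and
    synthetic tones were presented equally often.
    """
    return 0.5 * (h + (1.0 - f))


def percent_correct_from_d(d):
    """Same quantity computed from the discrimination score d = h - f."""
    return 0.5 + 0.5 * d


# Guessing, the two ends of the reported range, and perfect discrimination.
for d in (0.0, 0.15, 0.45, 1.0):
    print(f"D = {d:.0%} -> P = {percent_correct_from_d(d):.1%}")
```

Under this formula, the reported range of 15% < D < 45% corresponds to
57.5% < P < 72.5%.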

So it seems to me that D and P are valid ways to report these results. The
single-interval forced-choice task is simple, and the math is simple and
easily understood. But does it pass muster in the psycho-acoustic world?

Jim Beauchamp
University of Illinois at Urbana-Champaign
