Re: within subject comparisons

I encountered these types of difficulties in my dissertation work a few
years ago. We (Chuck Watson and I) had used Levitt staircase techniques
to measure spectral-shape discrimination thresholds for a large group of
listeners and four different spectral profiles. We used long tracking
histories, normally about 2,000 trials. Certainly, if one listener's
tracking history was entirely higher or lower than another listener's
tracking history for 2,000 trials, these listeners had different
thresholds. The same applies within a single listener on two types of
spectral profiles. But we only had one estimate of threshold from this
long history: the mean of the last 140 reversals. What Chris says is
true: these are not independent observations, since the level visited on
any trial depends on the level visited on the previous trial. Have
we lost all the statistical power in 2,000 observations?

To get quasi-independent estimates, we invented a method called the
'mean of multiple samples' (MMS) method. We calculated the mean of 10
reversals (this is one estimate), then skipped over 20 reversals and
calculated the mean of the next 10 (a second estimate), and so on. From the
last 1/3 of the history we could get 5 threshold estimates, giving a
reasonable estimate of within-listener variance and some
independent basis for an ANOVA. This is not ideal, as we had to
discard some data, but the threshold estimates were nearly the same as
the mean of 140 reversals (r=0.96). Note that we didn't selectively
discard data; we only took an estimate of threshold for all the
listeners and spectral profiles at the 5 pre-selected times during
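
For anyone who wants to try the MMS sampling scheme described above, here is a
minimal sketch in Python. The function name, the parameter names, and the
choice to start at the first reversal of the list passed in are assumptions
made for illustration; this is not the code used in the dissertation.

```python
from statistics import mean


def mms_estimates(reversals, block=10, skip=20, n_estimates=5):
    """Mean-of-multiple-samples sketch (hypothetical helper).

    Averages `block` consecutive reversal levels, skips the next `skip`
    reversals, and repeats until `n_estimates` block means are collected.
    Starting at index 0 of `reversals` is an assumption of this sketch.
    """
    step = block + skip
    needed = (n_estimates - 1) * step + block  # reversals actually spanned
    if len(reversals) < needed:
        raise ValueError(f"need at least {needed} reversals, got {len(reversals)}")
    return [
        mean(reversals[i * step : i * step + block])
        for i in range(n_estimates)
    ]
```

With the defaults, five estimates span 130 reversals (5 blocks of 10 plus 4
gaps of 20), so the list passed in would be the reversals from the last part
of the tracking history. The block positions are fixed in advance, which is
what keeps the discarding of data non-selective.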

Ward Drennan
MRC Institute of Hearing Research, Scottish Section

> Al points out that if you provide the reader
> with error bars and descriptive statistics then common sense should do
> the rest. I think this may be a reasonable approach, but I think it could
> only be reasonable if the estimates of the variance that are used to
> generate standard error are accurate (which requires independence). If
> the data are highly autocorrelated then the variance will be
> underestimated, the error bars will be misleading, and any conclusions
> that the reader might draw from them based on common sense may be
> incorrect.
> A good example of this can be found in adaptive staircase
> A threshold point is often taken as the mean of a prescribed number of
> reversals. The variance of this mean is not very useful as an estimate
> of the standard error of the threshold because of the high
> autocorrelation between the stimulus level at each turnaround.
> Typically, the variance within a staircase will underestimate the
> standard error of the threshold across staircases. If, for some reason,
> we were unaware of this correlation and reported the standard error of
> the threshold based on this variance we would be overstating the
> magnitude of the effect and the reader, equipped with only the
> descriptive statistics, will be misled.
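
Chris's comparison can be checked directly whenever repeated staircases are
available for the same condition. The sketch below computes the two
quantities he contrasts: the naive standard error from the reversals within
one staircase (which treats them as independent) and the spread of threshold
estimates across staircases. The function names are hypothetical, and the
numbers in the usage note are made up purely to show the calls.

```python
from math import sqrt
from statistics import mean, stdev


def within_track_sem(reversals):
    """Naive standard error of one threshold estimate.

    Treats the reversal levels within a single staircase as if they were
    independent observations, which is the assumption being questioned.
    """
    return stdev(reversals) / sqrt(len(reversals))


def across_track_sd(tracks):
    """Standard deviation of threshold estimates across repeated staircases.

    Each staircase contributes one threshold: the mean of its reversals.
    """
    thresholds = [mean(track) for track in tracks]
    return stdev(thresholds)
```

For example, with three invented tracks `[[4, 6, 4, 6], [7, 9, 7, 9],
[1, 3, 1, 3]]`, `across_track_sd` compares the three track means, while
`within_track_sem` looks only at the scatter inside one track. If the
within-track figure is much smaller than the across-track figure in real
data, that is the underestimate described above.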