Re: within subject comparisons (Ward Drennan)

Subject: Re: within subject comparisons
From:    Ward Drennan  <ward(at)IHR.GLA.AC.UK>
Date:    Mon, 30 Oct 2000 09:50:19 -0000

I encountered these types of difficulties in my dissertation work a few years ago. We (Chuck Watson and I) had used Levitt staircase techniques to measure spectral-shape discrimination thresholds for a large group of listeners and four different spectral profiles. We used long tracking histories, normally about 2,000 trials. Certainly, if one listener's tracking history was entirely higher or lower than another listener's tracking history for 2,000 trials, these listeners had different thresholds. The same applies within a single listener on two types of spectral profiles. But we had only one estimate of threshold from this long history: the mean of the last 140 reversals.

What Chris says is true: these are not independent observations, since the level visited on any trial depends on the level visited on the previous trial. Have we lost all the statistical power in 2,000 observations?

To get 'quasi-independent' estimates, we invented a method called the 'mean of multiple samples' (MMS method). We calculated the mean of 10 reversals (this is one estimate), then skipped over 20 reversals and calculated the mean of the next 10 (a second estimate), etc. From the last 1/3 of the history we could get 5 threshold estimates, giving a reasonable estimate of within-listener variance and some independent basis for an ANOVA. This is not ideal, as we had to discard some data, but the threshold estimates were nearly the same as the mean of 140 reversals (r=0.96). Note that we didn't selectively discard data; we only took an estimate of threshold for all the listeners and spectral profiles at the 5 pre-selected times during training.

Ward Drennan
MRC Institute of Hearing Research, Scottish Section

> Al points out that if you provide the reader
> with error bars and descriptive statistics then common sense should do
> the rest. I think this may be a reasonable approach, but I think it could
> only be reasonable if the estimates of the variance that are used to
> generate standard error are accurate (which requires independence). If
> the data are highly autocorrelated then the variance will be
> underestimated, the error bars will be misleading, and any conclusions
> that the reader might draw from them based on common sense may be
> incorrect.
>
> A good example of this can be found in adaptive staircase techniques.
> A threshold point is often taken as the mean of a prescribed number of
> reversals. The variance of this mean is not very useful as an estimate
> of the standard error of the threshold because of the high
> autocorrelation between the stimulus level at each turnaround.
> Typically, the variance within a staircase will underestimate the
> standard error of the threshold across staircases. If, for some reason,
> we were unaware of this correlation and reported the standard error of
> the threshold based on this variance we would be overstating the
> magnitude of the effect and the reader, equipped with only the
> descriptive statistics, will be misled.
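
For readers who want to try the 'mean of multiple samples' idea described in the reply above, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the code used in the dissertation work: the function name `mms_estimates`, its parameters, and the choice to align the sampling windows with the end of the reversal history are all hypothetical, and the demo data are synthetic. With blocks of 10 reversals separated by 20 skipped reversals, 5 estimates span 5*10 + 4*20 = 130 reversals.

```python
from statistics import mean, stdev


def mms_estimates(reversals, block=10, skip=20, n_estimates=5):
    """Average `block` reversals, skip `skip`, and repeat `n_estimates` times.

    Assumption (not stated in the message): the windows are taken from the
    final part of the history, ending at the last recorded reversal.
    """
    span = n_estimates * block + (n_estimates - 1) * skip
    if len(reversals) < span:
        raise ValueError(f"need at least {span} reversals, got {len(reversals)}")
    tail = reversals[-span:]   # last `span` reversal levels
    stride = block + skip      # distance between the starts of two blocks
    return [
        mean(tail[i * stride : i * stride + block])
        for i in range(n_estimates)
    ]


if __name__ == "__main__":
    import random

    # Synthetic reversal levels, for illustration only (not real data).
    rng = random.Random(0)
    levels = [10.0 + rng.gauss(0.0, 1.0) for _ in range(140)]

    estimates = mms_estimates(levels)
    print("quasi-independent threshold estimates:", estimates)
    print("mean of estimates:", mean(estimates))
    print("SD across estimates:", stdev(estimates))
```

The spread across the returned estimates, rather than the spread of individual reversals within one block, is what would serve as the within-listener variability in a later analysis. Whether a gap of 20 reversals is enough to make the estimates effectively independent depends on the staircase and would need to be checked for the data at hand.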

This message came from the mail archive
maintained by:
DAn Ellis
Electrical Engineering Dept., Columbia University