Re: Time and Space (Neil Todd )


Subject: Re: Time and Space
From:    Neil Todd  <todd(at)HERA.PSY.MAN.AC.UK>
Date:    Tue, 10 Jun 1997 14:04:15 +0100

Dear Peter and Alain

On Tue, 10 Jun 97 02:07:26 EDT Peter Cariani wrote

> I looked up your papers again in the IMPC proceedings. I do
> remember thinking that the IC modulation tunings were way
> sharper than anything I had seen in the literature, looking at
> the first paper (fig. 2) there are multiple, sharp peaks
> for Fm's of 200, 400, and 800 Hz.

Note that this figure is showing the output of a population of
modulation filters summed across cochlear channels - a summary
modulation spectrum (SMS). Figure 2g shows the SMS for a complex
tone with a fundamental of 200 Hz. It would be extremely worrying
if it *didn't* show peaks at 400, 600 and 800!

> Looking at Fig. 2 in the second
> paper -- the one on pitch shift -- you get an estimated
> pitch of 195 Hz for the perfectly harmonic case (800,1000,
> 1200). For the shifted case, (860, 1060, 1260), de Boer's
> rule would predict pitches at 215 and 172 Hz, whereas
> your plot shows a global maximum at 205 Hz. The peak at
> 185 Hz is yet smaller than one at 195 Hz. What psycho-
> physical results were these estimated pitches compared
> against?

According to de Boer's data (1956) for *five*-component inharmonic
complexes (taken from Plomp (1976) Aspects of Tone Sensation.
Academic Press)

Centre Component    Lower pitch    Upper pitch
      1060               -             215
      1100              182            220
      1160              195             -

Now remember I used a very crude quantization of 5 Hz steps, so
there was always going to be some error. Nevertheless, if you look
at figure 2 in terms of the centroids of the peaks, then my results
for *three* components are approximately

Centre Component    Lower pitch    Upper pitch
      1060               -             210
      1100              185            215
      1160              195             -

So the picture really is not as bad as you make out. I generated
these results in one attempt, with one particular training set of
10 harmonic complexes with equal-weight components. I could have
messed around with training sets of different timbres, different
numbers of harmonics, etc.
I could have tweaked a whole load of different parameters, which
would have given me slightly different numbers. But I chose not to
do so because it seemed clear to me that there was a definite effect
which was reasonably close to the data. What I had demonstrated was
that harmonic series pattern matching of a *time-place*
representation could show the right kind of pitch shift.

> But for this stimulus (900, 1100, 1300) I would expect
> v. weak low pitches: a pitch at the true fundamental
> (100 Hz), and fainter ambiguous pitches around 185 Hz and 222 Hz.
> Your model doesn't predict any of these discrete pitches--
> the estimates have a broad, high-pass character starting
> with about 190 Hz. The model doesn't seem to work......

In figure 3, which you are referring to, I did *not* show the output
of the pattern matching component, but the SMS. What this showed was
that although the modulation spectrum only responds strongly to
first order intervals, it does in fact also show some weak
sensitivity to fine structure. In this case I focussed on the 200 Hz
area and showed that there was a shift upwards in the SMS alone. If
I had in fact combined this with the harmonic series pattern
matcher, it would have shifted the pitch estimates upwards, closer
to the experimental data.

The reason that you did not see anything around 100 Hz is that I did
not *look* at this range. However, it may interest you to note the
following.

"It is remarkable that the complex 900+1100+1300+1500+1700 Hz,
consisting of odd harmonics of a fundamental of 100 Hz, did *not*
[my italics] have a pitch corresponding to that frequency, though
de Boer explicitly searched for it." [Plomp. (1976) Aspects. p 119.]

To return to the main point though, my conclusion was that a full
account of pitch shift required a hybrid of two mechanisms: (a) a
low-level temporal mechanism and (b) a central pattern matching
mechanism. According to Plomp on p.
120

"De Boer proved mathematically that both approaches are equivalent.
He explained that small deviations of the data points from the
theoretical lines by introducing a weighting factor in favour of the
lower partials. De Boer suggested, although this has been overlooked
by later investigators, that *both* [his italics] mechanisms may
play a part in pitch sensation, the *spectral* [his italics] one for
low harmonic numbers and the *temporal* [his italics] one for high
harmonic numbers."

I agree with de Boer. My only difference is that the central
matching is carried out on a spectro-temporal representation, rather
than simply a spectral one.

> All I am suggesting, as gently as possible, is that you should
> pay more attention to the nature of the elements that are
> supposed to be carrying out your informational operations --
> just check to see if they in any way resemble what is seen
> physiologically.

On p.177 of the ICMPC96 procs. I wrote

"There are many aspects of the model which are unrealistic. Perhaps
the most obvious is the use of linear filters for cell modulation
response properties in contrast to the known non-linear behaviour of
auditory neurones (Hewitt et al, 1992). The only reason to favour
the linear approach over a full Hodgkin-Huxley model is one of
computational tractability. Even as it stands, the model typically
uses 32 cochlear channels, about 1000 ICC cells, about 1000 MGB
cells and 30,000 cortical cells. This is approximately the correct
ratio for the auditory system, although the actual number is out by
a factor of 1000. The model thus runs at the limits of conventional
computing. Despite these and many other limitations, the following
three papers (Todd, these proceedings, a,b,c) examine how the
interaction of the central processes may provide an account for some
psychophysical phenomena of pitch, time and auditory grouping."

I think it is clear from the quote above that I am aware of the
physiological limitations of my model.
However, I am also aware of the computational impossibility of
constructing a model of the auditory system, including the cortex,
without making some modelling approximations. Perhaps you should
have a go at modelling yourself sometime? One of the basic
principles of any kind of theoretical work is that you start out
with your models as simple as possible, to at least get them up and
running, so that you can make some comparisons with the data. Then
when the model breaks down you learn something. That's how
theoretical science makes progress. These are basic principles
which, as an ex-theoretical physicist, are second nature to me, but
I can understand why experimentalists have a problem with modelling.

> so it's time to either go back and
> recheck for errors of implementation or time to rethink your
> basic assumptions.

No, I don't think so. In the spirit of de Boer I will continue to
model the interaction of low-level and central cortical mechanisms.

On Tue, 10 Jun 1997 08:40:33 Alain de Cheveigne wrote

> >It is known that there are some small
> >discrepancies between current timing models and experimental
> >data. e.g. the slope of the pitch shift in the case of Meddis
> >and Hewitt (1991) and the predicted mistuning of the
> >fundamental in the case of Hartmann and Doty (1996). It may be
> >that the hybrid model proposed here may account for these small
> >discrepancies of the timing models, but this requires testing."
> The phenomenon reported by Hartmann and Doty (JASA, 1996) had nothing to do with a virtual pitch shift.....

By the above I did not mean to imply that the Hartmann and Doty
(1996) phenomenon was a shift of virtual pitch, but rather to give
another example of pure timing models which are not in perfect
agreement with the data, and, in agreement with de Boer, to suggest
that pitch phenomena require a central pattern mechanism as well as
a timing mechanism.

Neil
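
As a rough check on the comparison made earlier in this message, the
sketch below subtracts de Boer's quoted pitches from the model
estimates at each centre component. The numbers are copied from the
two tables above; the names (de_boer, model, STEP_HZ,
pitch_differences) are hypothetical and only illustrative. Note the
assumption that the five-component data and the three-component
estimates are directly comparable, which is at best approximate.

```python
# Pitches in Hz as (lower, upper); None marks an entry shown as "-".
# de_boer: five-component data (1956) as quoted from Plomp (1976).
# model:   approximate three-component estimates quoted in the message.
de_boer = {1060: (None, 215), 1100: (182, 220), 1160: (195, None)}
model = {1060: (None, 210), 1100: (185, 215), 1160: (195, None)}

STEP_HZ = 5  # quantization step of the model's pitch estimates


def pitch_differences(data, estimates):
    """Return model-minus-data differences for entries present in both."""
    diffs = {}
    for centre, (d_low, d_up) in data.items():
        e_low, e_up = estimates[centre]
        for label, d, e in (("lower", d_low, e_low), ("upper", d_up, e_up)):
            if d is not None and e is not None:
                diffs[(centre, label)] = e - d
    return diffs


diffs = pitch_differences(de_boer, model)
max_abs = max(abs(v) for v in diffs.values())

if __name__ == "__main__":
    for (centre, label), diff in sorted(diffs.items()):
        print(f"centre {centre} Hz, {label} pitch: {diff:+d} Hz")
    print(f"largest discrepancy: {max_abs} Hz (step: {STEP_HZ} Hz)")
```

On these quoted figures no estimate differs from the data by more
than one 5 Hz quantization step.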


This message came from the mail archive
http://www.auditory.org/postings/1997/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University