Re: Time and space (Neil Todd)

Subject: Re: Time and space
From: Neil Todd <todd(at)HERA.PSY.MAN.AC.UK>
Date: Mon, 9 Jun 1997 15:00:33 +0100

Dear Pierre

On Fri Jun 6 20:39 BST 1997 Pierre Divenyi wrote:

> Dear Neil
> Indeed, it would be nice if the state-of-the-brain were as you describe it:
> low-level time and frequency analysis represented orthogonally in the
> cortex. While it is true that Gerald Langner has found single units in
> his animals, and that the human MEG data look at least consistent,
> he would be the first to say loud that your generalization of the data
> are scarcely more than wishful thinking.

Yes indeed, and he did say something like that to me on a recent visit to Darmstadt.

> In particular, the time ranges
> some of us have been talking about during the present exchange of views
> go down to very-very low frequencies (=periods as long as 100-200 ms)
> which, as far as I know, have not been found to be well represented at
> the CN or the IC -- but correct me if I am wrong.

If you look back at all my previous messages you will see that I have consistently argued that, as far as temporal frequency is concerned, there is a need for three temporal frequency dimensions: (i) cochleotopic (approx. 30 - 10,000 Hz), (ii) periodotopic (approx. 10 Hz - 1000 Hz) and (iii) a time-scale dimension (approx. 0.5 - 20 Hz). I would argue that it is this third temporal frequency dimension that takes care of your long periods, and also of discrimination (see Todd and Brown, 1996) and streaming (see Todd, 1996). In fact, in the most recent version of my model (to appear in Todd and Lee, in Steve Greenberg's edited book) a cortical unit requires 9 parameters to describe its receptive field.

If we assume that the cortex does have access to some kind of 2-D cochleotopic/periodotopic array (whether it is sub-cortical in origin or not) then clearly this will be changing over time. A unit may be labeled according to its CF and BMF, and clearly its activity is time dependent.
So, in terms of modelling we may consider the cortex to have 3 input dimensions: (i) fc, (ii) fm and (iii) time t. If we assume that, like the visual cortex, this input is decomposed by populations of 3-D spatio-temporal filters, then effectively the flow of information can be represented by appropriately tuning and orienting these filters in a 3-D scale-space. This requires 6 other parameters:

(iv) cochleotopic spatial freq. (cycles per oct.)
(v) periodotopic spatial freq. (cycles per oct.)
(vi) time-scale freq. (Hz - the same dimension as above)

which give the centroid in scale-space, and

(vii) cochleotopic space-constant
(viii) periodotopic space-constant
(ix) time-constant

which give the spatio-temporal window.

This model generates a number of different response types including AM and FM, low-pass and band-pass, spatial and temporal, which can be used to describe primitive RFs for stationary and moving pitch and timbral acoustic features. As I said in the previous message, recent physiology (both Schreiner and Shamma) has demonstrated dynamic spatio-temporal RFs.

> Furthermore, even if all you say about auditory time/frequency analysis
> in the cortex, there is still the phenomenon Al and I were referring to
> to explain: temporal (i.e., envelope-) patterns marked by signals in
> different frequency bands tend to divide into two streams and suffer
> a loss of discriminability.

In Todd and Brown (1996) we showed that one could certainly account for the shape of the psychophysical law for time interval discrimination in terms of a population of assumed cortical band-pass AM sensitive cells. In the streaming model [Todd, N.P.McAngus (1996) An auditory cortical theory of auditory stream segregation. Network: Computation in Neural Systems. 7, 349-356.] this population would be bifurcated, thus reducing sensitivity since there would be fewer neurones in each stream with which to make a discrimination.
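
As a rough illustration of the nine-parameter receptive field listed earlier in this reply, here is a minimal sketch in Python. It assumes a Gabor-like form (a Gaussian window multiplied by a cosine carrier) on log-frequency axes. That functional form, the class name CorticalUnit, the field names and the example values are all assumptions made for illustration; they are not taken from the model described in the message.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class CorticalUnit:
    """Hypothetical 9-parameter unit; the Gabor-like shape is an assumption."""
    cf_hz: float          # (i)    centre on the cochleotopic axis (CF)
    bmf_hz: float         # (ii)   centre on the periodotopic axis (BMF)
    t0_s: float           # (iii)  centre in time
    k_cochleo: float      # (iv)   cochleotopic spatial freq., cycles per octave
    k_periodo: float      # (v)    periodotopic spatial freq., cycles per octave
    f_time_hz: float      # (vi)   time-scale freq., Hz
    s_cochleo_oct: float  # (vii)  cochleotopic space-constant, octaves
    s_periodo_oct: float  # (viii) periodotopic space-constant, octaves
    tau_s: float          # (ix)   time-constant, seconds

    def weight(self, fc_hz, fm_hz, t_s):
        """Receptive-field weight at input coordinates (fc, fm, t)."""
        # Distances from the unit's centre, in octaves and seconds.
        x = np.log2(np.asarray(fc_hz, dtype=float) / self.cf_hz)
        y = np.log2(np.asarray(fm_hz, dtype=float) / self.bmf_hz)
        t = np.asarray(t_s, dtype=float) - self.t0_s
        # Gaussian spatio-temporal window set by parameters (vii)-(ix).
        window = np.exp(-0.5 * ((x / self.s_cochleo_oct) ** 2
                                + (y / self.s_periodo_oct) ** 2
                                + (t / self.tau_s) ** 2))
        # Carrier whose scale-space centroid is set by parameters (iv)-(vi).
        phase = 2.0 * np.pi * (self.k_cochleo * x
                               + self.k_periodo * y
                               + self.f_time_hz * t)
        return window * np.cos(phase)


# Example values are arbitrary: a unit tuned to 4 Hz temporal modulation
# with no spatial modulation along either frequency axis.
unit = CorticalUnit(cf_hz=1000.0, bmf_hz=100.0, t0_s=0.0,
                    k_cochleo=0.0, k_periodo=0.0, f_time_hz=4.0,
                    s_cochleo_oct=0.5, s_periodo_oct=0.5, tau_s=0.1)
peak = float(unit.weight(1000.0, 100.0, 0.0))
one_octave_up = float(unit.weight(2000.0, 100.0, 0.0))
```

In this sketch, setting the two spatial frequencies to zero gives a purely temporal band-pass unit, while non-zero values tilt the carrier in the (fc, fm, t) space, which is one way to picture the "orienting" of filters mentioned above.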
> Having done lots of experiments on the latter,
> and having tried to model the situation, the phenomenon in question
> looks as if what we hear (=the perceived temporal patterns) were mediated
> by an extra stage whenever the markers do not activate the same
> pool of neurons. Again, I would not object to the view that this extra
> stage is also located at a subcortical level, but you must admit that
> the data are not there to support the view (at least I haven't seen
> them in the time range we are talking about). Thus, a more parsimonious
> explanation, to my mind at least, would be to make the cortex responsible
> for keeping track of envelope timing information altogether.

Further, I have shown that one can account for the shape of the psychophysical law of pure tone AM detection, which has two points of maximum sensitivity, one at about 3 Hz and another at about 300 Hz. I have suggested that this severe departure from Weber's Law arises because AM detection is mediated by two separate populations, one cortical and the other subcortical [Todd, N.P.McAngus (1996) Time discrimination and AM detection. J. Acoust. Soc. Am. 100(4), Pt. 2, 2752.]. As far as I am aware, there is no other model which accounts for the hump in the middle - unless I am also wrong?

Neil

Dear Peter

On Thu Jun 5 22:42 BST 1997 Peter Cariani wrote:

> Lastly, in Bertrand's and my work in the auditory nerve
> it also became apparent to me that what is needed for
> periodicity pitch is an autocorrelation-like analysis
> of periodicity pitch, .....
> My impression is that the
> modulation detector idea will not work for the pitch
> shift of AM tones (first effect of pitch shift, or
> de Boer's rule)......
> I think these are critical issues for models based on
> time-to-place that I tried to bring up in Montreal last
> summer (I didn't do it to give you a hard time, I promise).
Yes, it is very clear that you are a strong proponent of this class of model, and certainly it is true that power spectrum type models only respond strongly to first order intervals. However, if you had actually read the proceedings (and listened to what I said then) you would have noticed that I did indeed address these issues. Autocorrelation models are good, but they are not perfect. To save you looking up your copy of the Proc. ICMPC96 I quote below the relevant section.

"The next phenomenon we consider is that of virtual pitch shift. Virtual pitch shift is described as the shift in perceived pitch of a complex in which all the partials have been shifted by a constant amount, so that they are no longer harmonic. In the literature there are generally two distinct accounts of pitch shift (Hartmann and Doty, 1996). Proponents of temporal theories argue that pitch shift can be only accounted for in the fine temporal structure of the AN response, since the envelope remains invariant. Earlier place-pattern accounts are that pitch shift results from the disparity between a central pattern and the excitation pattern, i.e. a best fitting harmonic series. It is of interest to see if pitch shift can be predicted by [place-time] pattern matching against the sensory memory traces without the spiking component. The signals for this example were obtained from Example 21 of the ASA Auditory Demonstrations CD and consist of the 4th, 5th and 6th harmonics of a 200 Hz complex. As would be expected (without spiking) as the three partials shift upwards their interaction terms remain relatively invariant at 200 Hz. However, the pitch strength as measured by response of the recognition space (Figure 2) does indeed show the correct type of pitch shift including pitch ambiguity. The strongest response is obtained for the harmonic complex 800, 1000, 1200 Hz, although the estimated pitch is a little less than 200 Hz.
The 860, 1060, 1260 Hz complex produces a pitch estimate at about 210 Hz. The 900, 1100, 1300 Hz complex produces an apparent bimodal response with one estimate at about 215 Hz and another at about 185 Hz. The 960, 1160, 1360 Hz complex also gives a bimodal response with estimates of about 230 Hz and 190 Hz. Clearly this simple pattern matching model does seem to give an account of pitch shift without fine structure. How then may this be reconciled with the fine structure account? In order to investigate the effect of fine structure in the model, a shifted complex (900, 1100, 1300 Hz) was presented to the model including the spiking component (see Figure 3). It is clear from Figure 3 that although fine structure appears to be only weakly represented in the model it does have the effect of shifting the interaction term to about 215 Hz. It appears then that both fine structure and central matching contribute to pitch shift. It may be that the central auditory system combines envelope and neural timing information (Cooke and Brown, 1994) into a single image which is available for learning and recognition. It is known that there are some small discrepancies between current timing models and experimental data, e.g. the slope of the pitch shift in the case of Meddis and Hewitt (1991) and the predicted mistuning of the fundamental in the case of Hartmann and Doty (1996). It may be that the hybrid model proposed here may account for these small discrepancies of the timing models, but this requires testing." [Proc. ICMPC 96. p. 180-181.]

To be quite honest, I am quite agnostic as to what mechanism is responsible at a sub-cortical level for the time to place mapping. Both power spectra and autocorrelation (which are Fourier twins, don't forget) do this. Actually, the most neurologically plausible model I have seen is Gerald Langner's (neither autocorrelation nor power spectrum) since it includes both DCN and VCN components.
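
As a rough cross-check on the pitch estimates in the passage quoted from the proceedings, the "first effect of pitch shift" (de Boer's rule) mentioned in Peter's message can be worked through numerically. The sketch below assumes the usual textbook statement of the rule: the candidate pitches are the centre frequency of the complex divided by the integers on either side of centre/spacing. The function name and the use of the mean partial frequency as the centre are illustrative assumptions; this is not the pattern-matching model described in the quotation.

```python
import math


def first_effect_pitches(partials_hz, spacing_hz):
    """Candidate pitches under de Boer's rule (first effect only).

    Assumption: the centre frequency is the mean of the partials, and the
    candidates are centre / n for the integers n that bracket centre / spacing.
    """
    centre = sum(partials_hz) / len(partials_hz)
    ratio = centre / spacing_hz
    # floor and ceil coincide for a truly harmonic complex, giving one pitch.
    ranks = sorted({math.floor(ratio), math.ceil(ratio)})
    return [centre / n for n in ranks]


# The four complexes discussed in the quoted passage, spacing 200 Hz.
complexes = [
    [800, 1000, 1200],
    [860, 1060, 1260],
    [900, 1100, 1300],
    [960, 1160, 1360],
]
candidates = {tuple(c): first_effect_pitches(c, 200.0) for c in complexes}

for partials, pitches in candidates.items():
    print(partials, [round(p, 1) for p in pitches])
```

For the 900, 1100, 1300 Hz complex this gives candidates of 220 Hz and about 183 Hz, which sit in the same region as the bimodal estimates of about 215 Hz and 185 Hz quoted above. The rule says nothing about which candidate dominates, nor about the smaller residual discrepancies.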
Either way, it does not alter the cortical model I have proposed.

Neil
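
The remark above that power spectra and autocorrelation are "Fourier twins" is the Wiener-Khinchin relation: the autocorrelation of a signal is the inverse Fourier transform of its power spectrum. A minimal numerical check in Python follows, assuming circular (periodic) autocorrelation and an arbitrary random test signal; neither choice comes from the message.

```python
import numpy as np

# Arbitrary test signal (an assumption for illustration only).
rng = np.random.default_rng(0)
x = rng.standard_normal(256)
n = len(x)

# Circular autocorrelation computed directly in the time domain:
# r[lag] = sum over i of x[i] * x[(i + lag) mod n].
acf_direct = np.array([np.dot(x, np.roll(x, -lag)) for lag in range(n)])

# The same quantity obtained from the power spectrum.
power_spectrum = np.abs(np.fft.fft(x)) ** 2
acf_from_power = np.fft.ifft(power_spectrum).real

# Largest disagreement between the two routes (floating-point noise only).
max_err = float(np.max(np.abs(acf_direct - acf_from_power)))
print(max_err)
```

Both routes carry the same information about inter-component intervals, which is the sense in which either can serve for a time-to-place mapping in the argument above.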

This message came from the mail archive
http://www.auditory.org/postings/1997/
maintained by:

Dan Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University