Subject: Re: Rhythm
From: Neil Todd <TODD(at)FS4.PSY.MAN.AC.UK>
Date: Sat, 22 Nov 1997 13:34:22 GMT

Dear Bill and List,

You may be interested in a neuroanatomically and neurophysiologically plausible account of rhythm perception, but first it is important to clear up a few points of terminology in this area which, historically, has been rather muddled. The following are related, but quite distinct, phenomena.

(1) recognition of metrical categories
(2) temporal grouping
(3) discrimination of filled time durations
(4) discrimination of single or multiple empty intervals (event rate)
(5) tempo or beat rate discrimination

The distinction between (1) and (2) is very important because it is quite possible to have ametrical and nonperiodic rhythms, e.g. bird song or plainchant. That is, we may hear events as being grouped in time without associating a well-defined beat or quantising into categories (1:2, 1:3, etc).

The distinction between (3) and (4) is very important because (a) music has articulation, i.e. staccato-legato variation, and (b) filled and empty durations are perceptually quite different, e.g. the filled duration illusion.

The distinction between (4) and (5) is essential because it is possible to discriminate, say, an 8 Hz and a 10 Hz click train without "hearing" a beat.

For the last few years I have been promoting a sensory-motor theory of rhythm, time and beat perception. This theory accounts for the complex of phenomena above by the interaction of: (A) sensory systems; (B) the motor system; and (C) an interpretation, planning and control system.

The first tenet of this theory is that what unites all our senses is that they code for change in the physical environment. The existence of multi-modal cortex and multi-sensory binding (e.g. audio-visual speech) implies that our senses share a common form of temporal coding. This common code, it is suggested, is manifest in the cerebral cortex in the form of dynamic spatio-temporal receptive fields (RFs), i.e. receptive fields that code for motion.
One way of viewing a dynamic RF is as a filter which has both a temporal and a spatial tuning. The cortex is populated by many millions of neurones possessing these dynamic RFs, with tunings distributed over a range of spatial and temporal scales.

The second tenet is that our sensory systems and the neurodynamics of the motor system are adapted, or tuned in, to the temporal properties of the external environment and the biodynamics of the body. This implies both that the temporal sensitivity of any sensory system will reflect the temporal properties of its environment, and that motor expectations will reflect the biodynamics of the body.

The temporal sensitivity of any sensory system depends on the distribution of temporal tunings of the dynamic RFs in the cortex. In the case of the motor system, two important structures are the cerebellum, which receives inputs from multi-modal sensory cortex, and the basal ganglia, which receive inputs from the entire neocortex. Multi-modal sensory cortex in turn also receives reciprocal inputs from the premotor cortex, thus completing the sensory-motor loop, so that the motor system influences the interpretation of sensory information.

Although the functions of the cerebellum and basal ganglia are still very much a mystery, both are known to be implicated in motor timing. One well-established role that the cerebro-cerebellum is thought to play is that of a feedforward control mechanism, representing an internalisation of the biodynamics of the body. An implication of the motor aspect of the second tenet is that the natural propensity of the body to rhythmicity will also be reflected in any internal feedforward model. Three important evolutionarily primitive cyclical behaviours which correspond with the biodynamics of the body are: mastication, locomotion and respiration.
In addition to cerebellar control, these primitive actions are also controlled by low-level spinal circuits referred to as central pattern generators (CPGs). Although CPGs may be viewed as clock-like, they are in fact controlled *tonically* (as opposed to phasically) from the brain-stem, and do not reflect any inherent neural oscillatory mechanism (i.e. it's not a tick-tock).

One important consequence of a sensory-motor perspective is that the internalised rhythmicity of the body will have an effect on perception, via the sensory-motor loops. To put this crudely, the way we hear or see may be determined, at least in part, by the biomechanical properties of our own bodies. Although this proposition may seem at first sight to be verging on the absurd, it is possible to show that the three most important phonological levels of speech rhythm, the syllable, foot and intonational phrase, can all be identified with the three evolutionarily primitive rhythmic behaviours above. Also, I have some recent data which show that an individual's preferred tempo correlates with simple biomechanical measures.

However, to return to the specifics.

(1) Metrical categories arise naturally from a wavelet representation: the ratios 1:2, 1:3, 1:4 are formed by the first four terms of the harmonic series which emerges from a scale-space representation. Recognition can be modelled as the correlation between a template for each of the categories and any test rhythm. Event rate dependency is a consequence of two things: (a) the distribution of temporal frequencies of the RFs (each of the category templates will look a little different depending on the rate, i.e. there is a set of rate-dependent templates for each category); (b) since the RFs are causal, they effectively form a memory.
(2) Temporal grouping also emerges naturally from a scale-space representation, most evident when the temporal component of the RFs is low-pass, and does not require any prior categories.
(3) The filled interval illusion may be explained by the fact that the transform of a square pulse has a sin(x)/x form, whereas that of two or more short events is a harmonic series (see http://www.psy.man.ac.uk/ResearchFolder/PostFold/ToddPoster).
(4) Empty interval and event rate discrimination may also be explained by the distribution of temporal frequencies of the cortical RFs, i.e. we are maximally sensitive when the event rate falls in the centre of the cortical distribution. The distribution spans approximately 0.5 Hz to 32 Hz (log distribution), but may extend a little further either side. (Incidentally, this same mechanism also accounts for AM and FM detection/discrimination at low modulation frequencies.)
(5) A beat is an entirely different beast. It is not something we perceive, but something we impose onto a periodic signal. As has been pointed out, there is a well-defined existence region which corresponds approximately to locomotor rates. For this reason many have suggested that this is the origin of beat induction. According to the sensory-motor perspective a beat is literally an imagined movement. The most likely location in the brain for such an imagined movement is the sensory-motor loop mentioned above (main structures are posterior parietal lobe, pre-motor cortex, cerebro-cerebellum and basal ganglia). Once a beat has been induced it has a certain inertia of its own and influences the way the brain interprets the auditory image flow.

If you are at all interested in reading more, the following may be of interest.

Cheers
Neil Todd

On expressive timing.
Shaffer, L.H., Clarke, E.F. and Todd, N.P.McAngus (1985) Meter and rhythm in piano playing. Cognition 20, 61-77.
Shaffer, L.H. and Todd, N.P.McAngus (1994) The interpretative component in musical performance. In R. Aiello and J. Sloboda (Eds) Musical Perceptions. OUP. pp. 258-270.
Todd, N.P.McAngus (1985) A model of expressive timing in tonal music. Music Perception 3, 33-58.
Todd, N.P.McAngus (1989) A computational model of rubato. Contemporary Music Review 3, 69-88.
Todd, N.P.McAngus (1992) The dynamics of dynamics: a model of musical expression. J. Acoust. Soc. Am. 91(6), 3540-3550.

On rhythm perception and time discrimination.
Todd, N.P.McAngus (1994) The auditory primal sketch: A multi-scale model of rhythmic grouping. J. New Music Research 23(1), 25-70.
Todd, N.P.McAngus (1995) The kinematics of musical expression. J. Acoust. Soc. Am. 97(3), 1940-1949.
Todd, N.P.McAngus (1996) An auditory cortical theory of auditory stream segregation. Network: Computation in Neural Systems 7, 349-356.
Todd, N.P.McAngus (1996) Towards a theory of the principal monaural pathway: pitch, time and auditory grouping. In W. Ainsworth and S. Greenberg (Eds) Proceedings of the International Workshop on the auditory basis of speech perception. Keele, July, 1996. pp. 216-221.
Todd, N.P.McAngus (1996) Towards a theory of the central auditory system III: Time. In B. Pennycook and E. Costa-Giomi (Eds) Proceedings of the Fourth International Conference on Music Perception and Cognition. Montreal, August, 1996. pp. 185-190.
Todd, N.P.McAngus (to appear) A model of auditory image flow I: Architecture. British Journal of Audiology.
Todd, N.P.McAngus (to appear) A model of auditory image flow II: Detection of amplitude and frequency modulation. British Journal of Audiology.
Todd, N.P.McAngus and Brown, G.J. (1996) Visualization of rhythm, time and metre. Artificial Intelligence Review 10, 253-273.
Todd, N.P.McAngus and Clarke, E.F. (1995) The perception of rhythmic structure in expressive musical performance. Proceedings of the 15th International Congress of Acoustics. Volume III. pp. 459-462.
Todd, N.P.McAngus and Lee, C.S. (to appear) A sensory-motor theory of speech perception: Implications for learning, organisation and recognition. To appear in W. Ainsworth and S. Greenberg (Eds) Listening to Speech. OUP.

General reviews
Clarke, E.F. (to appear) Rhythm and Tempo(?) In D. Deutsch. Psychology of Music(?) Second Edition.
O'Boyle, D.J. (1997) On the human neuropsychology of timing of simple repetitive movements. In C.M. Bradshaw and E. Szabadi (Eds) Time and behaviour. Psychological and neurobehavioural analyses (pp. 459-515). Amsterdam: Elsevier Press (in press).
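
P.S. As a rough numerical illustration of point (3) above, the contrast between a filled and an empty interval can be checked directly. The sketch below is only an assumption-laden toy, not the model described in the papers: it takes "transform" to mean a plain Fourier transform, and the sample rate, the 0.5 s interval and the helper name relative_magnitude are all hypothetical choices made for the example.

```python
import cmath
import math

SAMPLE_RATE = 1000   # samples per second (assumed for illustration)
INTERVAL_S = 0.5     # length of the interval in seconds (assumed)
N = int(SAMPLE_RATE * INTERVAL_S)

# Filled interval: a square pulse lasting INTERVAL_S.
filled = [1.0] * N

# Empty interval: two brief clicks marking the start and end of INTERVAL_S.
empty = [0.0] * (N + 1)
empty[0] = 1.0
empty[N] = 1.0


def relative_magnitude(signal, freq_hz):
    """Fourier magnitude of `signal` at freq_hz, relative to its value at 0 Hz."""
    total = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * n / SAMPLE_RATE)
        for n, x in enumerate(signal)
    )
    return abs(total) / abs(sum(signal))


if __name__ == "__main__":
    print(" f(Hz)  filled   empty")
    for freq in [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]:
        print(f"{freq:6.1f}  {relative_magnitude(filled, freq):6.3f}  "
              f"{relative_magnitude(empty, freq):6.3f}")
```

With these assumed values the filled column should fall away in the sin(x)/x manner, with nulls at multiples of 1/INTERVAL_S (2 Hz, 4 Hz, ...), whereas the empty column should return to full strength at those same frequencies, i.e. a harmonic series on the interval rate.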