Neil P. McAngus Todd
Dept. of Music, City Univ., Northampton Square, London EC1V 0HB, England
Research on musical performance has shown that expressive variations in tempo, dynamics, etc., not explicit in a score can be accounted for by structural factors such as phrasing and rhythm [N. P. McAngus Todd, J. Acoust. Soc. Am. 91, 3540--3550 (1992)]. A number of models now exist that attempt to recover rhythmic structure from a performed sequence [H. C. Longuet-Higgins, Nature 263, 646--653 (1976); P. Desain, Music Percept. 9(4), 439--454 (1992)]. However, they do not address the question of how the human auditory system might detect temporal events before such higher-level processing. A low-level representation is proposed, analogous to the primal sketch in vision [D. Marr and E. Hildreth, Proc. R. Soc. London Ser. B 207, 187--217 (1980)], encoding intensity changes at different time scales via an array of filters tuned to very low infrasonic frequencies. This representation could act as a front end to perceptual models. Various outputs from an analog implementation of the filter array are demonstrated, including music, speech, poetry, and bird song.
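The abstract gives no implementation detail for the filter array. As an illustration only, the general idea of a multiscale intensity-change encoding in the spirit of the Marr--Hildreth primal sketch can be sketched by convolving an intensity envelope with derivative-of-Gaussian kernels at several time scales; every name, parameter, and signal below is a hypothetical stand-in, not the author's analog implementation.

```python
import math

def gauss_deriv_kernel(sigma, radius=None):
    """First derivative of a Gaussian, sampled at integer lags.

    Convolving an intensity envelope with this kernel responds to
    intensity *changes* at the time scale set by sigma (in samples).
    Illustrative stand-in for one channel of a low-frequency filter array.
    """
    if radius is None:
        radius = int(3 * sigma)
    return [-t / (sigma ** 2) * math.exp(-t * t / (2 * sigma ** 2))
            for t in range(-radius, radius + 1)]

def convolve(signal, kernel):
    """Same-length convolution with zero padding outside the signal."""
    r = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + r - k  # true convolution: kernel is time-reversed
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out

# Synthetic intensity envelope: silence, a loud "note", a quiet gap,
# then a second "note" (purely invented test input).
env = [0.0] * 50 + [1.0] * 50 + [0.2] * 20 + [1.0] * 50

# One channel per time scale; sigma is in samples, so low sigma values
# correspond to fast (higher-frequency) intensity fluctuations and
# large sigma values to slow, infrasonic-rate ones.
scales = [2.0, 4.0, 8.0]
channels = {s: convolve(env, gauss_deriv_kernel(s)) for s in scales}

# Each channel peaks where intensity rises at its own time scale;
# the peak position marks a candidate temporal event (note onset).
for s, ch in channels.items():
    onset = max(range(len(ch)), key=lambda i: ch[i])
    print(f"sigma={s}: strongest intensity rise near sample {onset}")
```

In this toy example every channel localizes the largest onset (the 0-to-1 step) regardless of scale, while slower channels blur the closely spaced later events together; a higher-level rhythm model could then read events off whichever scales respond.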