Daniel P. W. Ellis
MIT Media Lab., Music & Cognition Group, E15-491, Cambridge, MA 02139
A speculative model of the ``internal representation'' of a sound in the auditory system stores only the peak energy trajectories from a cochlear filterbank [D. P. W. Ellis and B. L. Vercoe, J. Acoust. Soc. Am. 91, 2334(A) (1992)]. This representation appears to code the perceptually relevant features, as verified by resynthesis, and can be used for signal separation. One marked shortcoming is the absence of accommodation: a loud but steady stimulus has equal prominence throughout its duration, whereas the subjective experience is that such sounds are rapidly relegated to ``background.'' A related problem is that high-energy components distort and mask nearby low-energy features in the model, even when the former are static and the latter varying; on listening to such a sound, the ear resolves the quieter details far better than the representation suggests. A simple subtractive feedback scheme can remove the influence of stable components, regardless of their energy. Differentiation in time achieves this, but more interesting results are obtained by feeding back a higher-level analysis. Examples of sonic details separated by such methods will be shown and played.
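The two suppression ideas mentioned above can be illustrated on toy data. The following is only a minimal sketch under stated assumptions, not the model described in the abstract: the per-channel energy envelopes are synthetic, the function names `time_difference` and `subtract_background` and the smoothing parameter `alpha` are hypothetical, and a leaky running average stands in for the unspecified ``higher-level analysis'' that is fed back.

```python
import numpy as np

def time_difference(env):
    """First difference along time; a channel with constant energy maps to zero."""
    return np.diff(env, axis=1, prepend=env[:, :1])

def subtract_background(env, alpha=0.9):
    """Subtract a fed-back background estimate from each incoming frame.

    Assumption: the background is a leaky running average per channel,
    used here as a simple stand-in for a higher-level analysis.
    """
    background = np.zeros(env.shape[0])
    out = np.empty_like(env, dtype=float)
    for t in range(env.shape[1]):
        out[:, t] = env[:, t] - background          # subtractive feedback
        background = alpha * background + (1 - alpha) * env[:, t]
    return out

# Toy envelopes (channels x frames): channel 0 is loud and steady,
# channel 1 is quiet and slowly varying.
frames = np.arange(200)
env = np.vstack([
    np.full(frames.shape, 100.0),
    1.0 + 0.5 * np.sin(2 * np.pi * frames / 20),
])

diff = time_difference(env)
resid = subtract_background(env)
```

In this toy case both outputs drive the loud steady channel toward zero while the quiet varying channel keeps its modulation, which is the behavior the abstract attributes to subtractive feedback regardless of energy.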