THE EAR VS THE EYE: Summary of a survey. ("Marc-Andre Decoste ou MAD... My Audio Dream" )

Subject: THE EAR VS THE EYE: Summary of a survey.
From:    "Marc-Andre Decoste ou MAD... My Audio Dream" <decostem(at)IRO.UMONTREAL.CA>
Date:    Wed, 21 Jul 1993 09:40:31 EDT

Hi there listeners,

A couple of weeks ago I asked for info on the advantages of our audio perception system over our visual perception system. I would like to personally thank everyone who kindly replied with very useful information. Since the info could be useful for others, I have included all the answers at the end of this message.

Some people showed interest in the results of my search for a Computer Science project using sounds (which was the goal of my request for info). After reading some articles on the subject, browsing through the biographical database of this distribution list and, of course, the replies to my request included in this message, I thought we could use sounds to give us an insight into what's beyond the screen when we work with semantic nets.

Only a certain amount of information can be seen on a computer monitor. When we are browsing through a semantic net (or any other organization of knowledge relations) we need to know which way to go when we're looking for info beyond the screen. It's like being in a room where you see many closed doors and can hear what's beyond them. We want to use semantic nets because we can assign sound characteristics to nodes' attributes and sound effects (or filters) to arcs' attributes. We're not sure how much info can be acquired by a general user this way, but that's the fun part of doing Ph.D. research... isn't it?

BYE
MAD

>>>>>>> Here are the replies I received to my request... Enjoy!
-----------------------------------------------------------------------------
From jfolsen(at) Tue Jun 22 16:18:27 1993
From: "John F. Olsen" <jfolsen(at)>
>>>>>>>>
There is a recent human study [Perrott et al.,1993, JASA 93:2134-2138] that shows that the auditory modality is as good, if not better than the visual in determining the relative spatial directions of stimuli presented sequentially. Other work by this group shows that the auditory modality is superior to the visual in redirecting the center of gaze.
John Olsen
NIMH Lab. Neurophysiology
<<<<<<<<<
-----------------------------------------------------------------------------
From levitin(at) Tue Jun 22 16:18:43 1993
>>>>>>>>
Well, I've just done some work that rather convincingly shows that the auditory system is very good at making absolute judgments of perceptual stimuli. And if you compare this to work done in vision, the auditory system is far better at absolute judgments, using memory (that is, stored perceptual stimuli). Specifically, I found that non-musicians are very very good at remembering the absolute pitches of familiar songs. In contrast, people are very poor at remembering the absolute color (hue), brightness or saturation of colored objects. If you're interested, I'll send you a paper.

Dan Levitin
<<<<<<<<<
-----------------------------------------------------------------------------
From mcfadden(at) Tue Jun 22 16:18:59 1993
>>>>>>>>
The auditory system works even in the dark. The auditory system works even when the ears are not aimed at the stimulus. There are no lids covering the ears. The auditory system has its entire 12 log units of dynamic range available to it at all times; there is none of that pansy slow-acting light- and dark-adaptation. All the richness of auditory experience originates from about 4000 inner hair cells as compared to 6 million cones and 120 million rods. Cochleas can emit sounds; eyes emit no light. The development of true language ability requires an intact, functional auditory system. The auditory system works essentially in real time while the visual system is always a couple of hundred milliseconds behind. Auditory scientists never engage in hyperbole; vision scientists are forced to. Vision is an amusing minor sense given us solely to occupy the time of lesser minds.
D.McFadden
<<<<<<<<<
-----------------------------------------------------------------------------
From port(at) Wed Jun 23 08:36:23 1993
From: Robert Port <port(at)>
>>>>>>>>
One major difference between the auditory and visual channels is sensitivity to rapid events in time. Audition beats vision by a long shot -- a factor of 30-40 -- presumably because the chemical response to light takes much longer than the mechanical response to sound pressure. The auditory nerve apparently transmits a phase-locked representation of sound events all the way up to nearly 1000 Hz. (Of course the ear detects frequencies much higher than that, but I'm talking about time-synchronous response to individual events.) This means that individual periods are resolved almost down to a single msec. The visual system, on the other hand, fails to detect flicker in movie images that flash only 24-30 times a second.

Now whether this can be exploited for the purpose of facilitating data interpretation, I don't know. It will take some imagination.

Bob Port
Linguistics/Comp Sci/Cog Sci
Indiana University
<<<<<<<<<
-----------------------------------------------------------------------------
From saberi(at) Wed Jun 23 08:36:33 1993
>>>>>>>>
Dear Dr. MAD,

You might want to contact David R. Perrott. He has recently been comparing the spatial acuity of the visual and auditory systems under similar conditions, such as when the stimuli are presented successively in both modalities, instead of simultaneously in vision and successively in audition. I'm not sure if these data are published yet. He may also have other insights for you. His address is:

David R. Perrott
Department of Psychology
California State University
Los Angeles, CA 90032
(213) 343-2266

-Kourosh Saberi
Dept. of Psych
U.C.
Berkeley
<<<<<<<<<
-----------------------------------------------------------------------------
PART II
-----------------------------------------------------------------------------
From tec(at) Fri Jun 25 09:03:10 1993
From: Tecumseh Fitch <tec(at)>
>>>>>>>>>
I and my collaborator Greg Kramer recently reported on some work on just this topic; it will appear in the proceedings of the ICAD (International Conference on Auditory Display) conference which happened in Santa Fe in October '92. The proceedings volume is in press at Addison-Wesley; it should be out by September.

In brief, we studied a complex task requiring simultaneous monitoring of many continuously-changing variables, and found an auditory display to be superior to a standard visual display. College undergrads were briefly trained as anesthesiologists: they learned to monitor 8 physiological variables in a computer-simulated human body, and to respond appropriately to medical emergencies like overdose, blood loss, etc. Information from the "digital patient" was presented through either a standard visual display (a strip chart) or an auditory display which we created.

The auditory display used two "base streams" which sounded like a heart beating and a person breathing, respectively. These conveyed information (not surprisingly) on heart rate and breathing rate. Other, more abstract, variables were "piggy-backed" onto these base streams: for example, blood pressure controlled the pitch of the heart sound, and body temperature controlled the center frequency of the band-pass filter used to make the breathing sound.

We found that subjects responded faster AND more accurately when using the auditory display than with the visual display. The results suggested that subjects formed a gestalt representation of each medical problem when using the auditory display, and were less able to do this with the visual display.
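
The "piggy-backing" described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the display used in the study: the function name sonify_patient, the use of systolic pressure, the input ranges, and the output frequency ranges are all hypothetical, and linear scaling is assumed for simplicity.

```python
def sonify_patient(heart_rate_bpm, breaths_per_min, systolic_mmhg, body_temp_c):
    """Map four simulated vital signs onto two sound streams.

    All ranges below are illustrative assumptions, not values from the study.
    """
    def scale(value, lo, hi, out_lo, out_hi):
        # Clamp the input to [lo, hi], then map it linearly onto [out_lo, out_hi].
        value = min(max(value, lo), hi)
        return out_lo + (value - lo) * (out_hi - out_lo) / (hi - lo)

    return {
        # Base stream 1: a heartbeat sound. Its rate carries heart rate,
        # and its pitch carries blood pressure.
        "heart": {
            "beat_interval_s": 60.0 / heart_rate_bpm,
            "pitch_hz": scale(systolic_mmhg, 60, 200, 80.0, 320.0),
        },
        # Base stream 2: a breathing sound. Its rate carries breathing rate,
        # and its band-pass center frequency carries body temperature.
        "breath": {
            "cycle_interval_s": 60.0 / breaths_per_min,
            "bandpass_center_hz": scale(body_temp_c, 34, 42, 400.0, 2000.0),
        },
    }
```

With these assumed ranges, sonify_patient(60, 12, 130, 38) gives a heart sound once per second at 200 Hz and a breathing sound every 5 seconds filtered around 1200 Hz. Two streams carry four variables, which is the point of the design: the listener follows two familiar sounds rather than four separate traces.
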
These results may not surprise people working in audition: it seems intuitively obvious that we are able to process multiple information sources simultaneously in audition (why else would people enjoy chamber music and symphonies?). In contrast, the primate visual system is adapted for foveating individual objects serially. As intuitive as these results may be, they are apparently surprising to some (see the writeup of our study in this month's American Scientist, 81(3) p 229).

Other examples (in simpler tasks) are Tzelgov et al. (1987: Human Factors 29(1): 87-95), who found auditory superiority in a Geiger counter task, and Lewandowski and Kobus (1989: Human Perf. 2(1): 73-84) who got faster (but less accurate) performance with audition in a simple sonar target ID task.

> There is a recent human study [Perrott et al.,1993, JASA 93:2134-2138]
> that shows that the auditory modality is as good, if not better than
> the visual in determining the relative spatial directions of stimuli
> presented sequentially.

Perrott et al. found no significant difference between minimum audible angle (MAA) and minimum visible angle (MVA) under their experimental conditions, but it is important to realize that their MVAs were orders of magnitude higher than the generally-accepted best values for vision (less than 10 sec of arc, vs. their 27 minutes of arc!). Thus the result, while theoretically quite interesting, may have limited practical importance.

Tecumseh Fitch (tec(at)
Dept. of Cognitive and Linguistic Sciences
Brown University, Box 1978
Providence, RI 02912
<<<<<<<<<
-----------------------------------------------------------------------------
From 70312.265(at) Fri Jun 25 09:04:16 1993
From: Gregory Kramer <70312.265(at)>
>>>>>>>>>
Marc,

You should seriously consider two aspects of the auditory system that could guide your choice of an auditory display project:

1. Temporal resolution of audition is far superior to vision.
So, monitoring and analysis tasks that need good temporal resolution, such as synchronization, regularity of repeats, etc.

2. Parallel capability of audition is superior to vision. So, high-dimensional systems, multi-task monitoring, and the like.

Please keep me posted. By the way, try to get a hold of the proceedings. They will give you some good thoughts. (But not until October.)

Be well.
Greg
<<<<<<<<<
-----------------------------------------------------------------------------
From VOS(at) Fri Jun 25 09:04:27 1993
From: "Piet G. Vos" <VOS(at)>
>>>>>>>>>
Bob Port's RE to M-A Decoste's Q is a substantial one: Indeed, also in tracking tasks, e.g. requiring to tap in synchrony to a metronome, response variability is systematically larger when the metronome is a visual one (light flashes) instead of the auditory standard. In further agreement with the explanation in terms of different types of physiological processing systems for higher speed and accuracy with auditory reactions is the well established fact that a LED-bound temporal interval is subjectively longer than an objectively equally long click-bound interval. More generally speaking, auditory perception is systematically superior in the domain of time perception and production, and this conclusion is supported by a large number of convergent comparative studies on this issue (cf. e.g. Walker & Scott, JEP-HPP '81).

Piet G. Vos
<<<<<<<<<
-----------------------------------------------------------------------------
From jfolsen(at) Fri Jun 25 09:04:49 1993
From: "John F. Olsen" <jfolsen(at)>
>>>>>>>>>
Fitch and Kramer's multichannel auditory monitor is an intriguing device which takes advantage of an auditory capability that is familiar to many neurophysiologists. In my experiments, I find auditory monitoring of EKG to be invaluable. Auditory monitoring conveys instantly changes in heart rate, rhythm, and beat amplitude without the constant attention that a chart recorder or an oscilloscope requires.
I also find it easy to correlate stimulus and response through auditory monitoring of the stimulus and action potentials. Best of all, both tasks can be done apparently simultaneously and in conjunction with visually based tasks, such as typing instructions on a keyboard. The auditory mode seems to be capable of dividing and prioritizing attention to a greater extent than the visual mode. In addition, I have found that spatially separating the audiomonitor that broadcasts the neural response from the one for the EKG makes both signals easier to follow. Perhaps Fitch and Kramer could incorporate the auditory spatial dimension in a stereo physiological monitor and thereby add another channel of information, e.g., CO2 level coded by elevation, O2 by azimuth.

John Olsen
Lab. Neurophysiology
NIMH
<<<<<<<<<
-----------------------------------------------------------------------------
From malcolm(at) Fri Jun 25 09:05:43 1993
From: Malcolm Slaney <malcolm(at)>
>>>>>>>>>
Our group at Apple has released a couple of cochlear models that other auditory researchers might find useful. These services are free.... we find these tools useful for our own work, but we can't support them as you would expect normal products. If you find them useful as they are, great!

We use two different models in our daily work.

1) The Patterson-Holdsworth Auditory Filter Bank - Based on a critical band auditory model. This is a linear model with fairly modest computational requirements. A Mathematica technical report is now available that describes a new more efficient implementation. This technical report includes C, Fortran, and Matlab code to design the filters. Matlab code to implement the filters is also shown (it is only three lines long.) To get this report use anonymous FTP to Look for the Gammatone.* files in the /pub/malcolm directory.
Or send me a note with your postal address and we can send you the notebook (includes a paper copy and a Macintosh disk with the electronic copy ready for you to modify.)

2) The Lyon Passive Shortwave model - Is a nonlinear model of cochlear behaviour based on a passive non-linearity and a shortwave model of cochlear hydrodynamics. We're working on a better model, but this is all we have to offer for now. Both a technical report and running code were released a couple of years ago. The code still works and runs on just about any Unix machine or a Macintosh. Other people have ported it to DOS.

MAIL ORDER EARS....

Finally, we're also offering cochlear modeling by mail. We've established a service where you can send an audio file to a special email address, we compute the cochleagram or correlogram on our Cray, and send you back the picture or the movie by return email. The interfaces are a bit shaky yet, but we're working to make it more stable. (I used the service to settle an argument while I was at ARO in Florida a few months ago.) Both the Lyon and the Patterson cochlear models are available. Let me know if you would find this useful or would like to give it a try.

Send me email if you want the Gammatone Tech Report, have any questions, or want more information.

Malcolm Slaney
Apple ATG Perception Group
malcolm(at)
<<<<<<<<
---------------------------------------------------------------------------
From IN09(at) Mon Jun 28 08:27:39 1993
Return-Path: <(at)>
Date: Fri, 25 Jun 93 16:58:43 EDT
From: "Albert Bregman, Tel: 514-398-6103" <IN09(at)>
To: <decostem(at)IRO.UMontreal.CA>
Cc: ""Auditory" Distribution List" <auditory(at)>
Subject: Audio characteristics better than visual.
In-Reply-To: In reply to your message of TUE 22 JUN 1993 12:32:15 EDT

> Hi there,
>
>
> We're looking for any kind of work on human perception to find
> characteristics of the auditory system that are better than the visual
> system.
>

We typically encode messages sequentially in sound (e.g., a sentence), and simultaneously in visual displays (e.g., a printed page). For this reason, sound maintains its message as it bounces around corners. Light doesn't. So sound is good for warning systems.

- Al Bregman
<<<<<<<<<<<<<<

Marc-Andre Decoste
e-mail: decostem(at)
D.I.R.O., Universite de Montreal
Tel: (514) 343-6111 ext 3513
C.P. 6128, Succ. "A", Montreal, (Qc), CA, H3C 3J7, FX (514) 343-5834

When your eyes are blinding your soul... listen to your heart. (MAD)

This message came from the mail archive
maintained by:
DAn Ellis <>
Electrical Engineering Dept., Columbia University