
THE EAR VS THE EYE: Summary of a survey.

Hi there listeners,

  a couple of weeks ago I asked for info on the advantages of our audio
perception system over our visual perception system. I would like to
personally thank everyone who kindly replied with very useful information.
Since the info could be useful for others, I included all the answers at
the end of this message.

  Some people showed interest in the results of my search for a Computer
Science project using sounds (which was the goal of my request for info).
After reading some articles on the subject, browsing through the
biographical database of this distribution list and, of course, the replies
to my request included in this message, I thought we could use sounds to
give us insight into what's beyond the screen when we work with semantic
nets.  There is a certain amount of information that can be seen on a
computer monitor.  When we are browsing through a semantic net (or any
other knowledge relations organization) we need to know which way to go
when we're looking for info beyond the screen. It is like being in a room
with many closed doors and hearing what lies behind each of them. We
want to use semantic nets because we can assign sound characteristics to
nodes' attributes and sound effects (or filters) to arcs' attributes. We're
not sure how much info can be acquired by a general user this way, but
that's the fun part of doing Ph.D. research... isn't it?
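
A minimal sketch of what such a mapping might look like, in Python. Nothing
here comes from an existing system: the node attribute ("importance"), the
arc attribute ("strength"), and the pitch and filter ranges are all
assumptions chosen only to illustrate the idea of "hearing" the nodes that
lie just off screen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    pitch_hz: float    # sound characteristic taken from a node attribute
    cutoff_hz: float   # low-pass filter setting taken from an arc attribute


def offscreen_cues(importance, arcs, visible):
    """Return {(src, dst): Cue} for each arc that leads from a visible
    node to a node that is off screen.

    importance: {node: value in [0, 1]}   (hypothetical node attribute)
    arcs: iterable of (src, dst, strength), strength in [0, 1]
    visible: set of nodes currently on screen
    """
    cues = {}
    for src, dst, strength in arcs:
        if src in visible and dst not in visible:
            # Assumed mapping: importance 0..1 -> pitch 220..880 Hz.
            pitch = 220.0 * 2.0 ** (2.0 * importance[dst])
            # Assumed mapping: weak arcs sound muffled (low cutoff).
            cutoff = 500.0 + 7500.0 * strength
            cues[(src, dst)] = Cue(pitch, cutoff)
    return cues


net = {"cat": 0.0, "mammal": 0.5, "animal": 1.0}
links = [("cat", "mammal", 1.0), ("cat", "animal", 0.0),
         ("mammal", "animal", 0.5)]
print(offscreen_cues(net, links, visible={"cat"}))
```

Only arcs that cross the edge of the screen produce a cue, so the number of
simultaneous sounds stays tied to the number of "doors" leading out of the
current view.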



   Here are the replies I received to my request... Enjoy!

From jfolsen@helix.nih.gov Tue Jun 22 16:18:27 1993
From: "John F. Olsen" <jfolsen@helix.nih.gov>
There is a recent human study [Perrott et al.,1993, JASA 93:2134-2138] that
shows that the auditory modality is as good as, if not better than, the visual in
determining the relative spatial directions of stimuli presented sequentially.
Other work by this group shows that the auditory modality is superior to the
visual in redirecting the center of gaze.
John Olsen
Lab. Neurophysiology
From levitin@darkwing.uoregon.edu Tue Jun 22 16:18:43 1993
Well, I've just done some work that rather convincingly shows that the
auditory system is very good at making absolute judgments of
perceptual stimuli.  And if you compare this to work done in vision,
the auditory system is far better at absolute judgments, using memory
(that is stored perceptual stimuli).  Specifically, I found that
non-musicians are very very good at remembering the absolute pitches
of familiar songs.  In contrast, people are very poor at remembering
the absolute color (hue), brightness, or saturation of colored objects.

If you're interested, I'll send you a paper.

Dan Levitin
From mcfadden@psyvax.psy.utexas.edu Tue Jun 22 16:18:59 1993
The auditory system works even in the dark.
The auditory system works even when the ears are not aimed at the stimulus.
There are no lids covering the ears.
The auditory system has its entire 12 log units of dynamic range available
to it at all times; there is none of that pansy slow-acting light- and
All the richness of auditory experience originates from about 4000 inner
hair cells as compared to 6 million cones and 120 million rods.
Cochleas can emit sounds; eyes emit no light.
The development of true language ability requires an intact, functional
auditory system.
The auditory system works essentially in real time while the visual system
is always a couple of hundred milliseconds behind.
Auditory scientists never engage in hyperbole; vision scientists are forced to.

Vision is an amusing minor sense given us solely to occupy the time of
lesser minds.

From port@cs.indiana.edu Wed Jun 23 08:36:23 1993
From: Robert Port <port@cs.indiana.edu>
One major difference between the auditory and visual channels is
sensitivity to rapid events in time.  Audition beats vision by a long
shot -- a factor of 30-40 -- presumably because the chemical response
to light takes much longer than the mechanical response to sound
pressure.  The auditory nerve apparently transmits a phase-locked
representation of sound events all the way up to nearly 1000 Hz. (Of
course the ear detects frequencies much higher than that, but I'm
talking about time-synchronous response to individual events.)  This
means that individual periods are resolved almost down to a single
msec.  The visual system, on the other hand, fails to detect flicker
in movie images that flash only 24-30 times a second.
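
The arithmetic behind these figures can be checked in a few lines of Python.
This is only a conversion of the rates quoted above into periods, not a
model of either sensory system; the function name is made up for the
example.

```python
def period_ms(rate_hz):
    """Duration of one cycle, in milliseconds, for a given rate in Hz."""
    return 1000.0 / rate_hz


auditory = period_ms(1000)   # phase locking up to about 1000 Hz
film_slow = period_ms(24)    # movie frame rates quoted above
film_fast = period_ms(30)

print(f"auditory period:  {auditory:.1f} ms")
print(f"film frame times: {film_fast:.1f} to {film_slow:.1f} ms")
print(f"ratio:            {film_fast / auditory:.0f} to {film_slow / auditory:.0f}")
```

The resulting ratio of roughly 33 to 42 is in line with the "factor of
30-40" mentioned above.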

Now whether this can be exploited for the purpose of facilitating data
interpretation, I don't know.  It will take some imagination.

        Bob Port
        Linguistics/Comp Sci/Cog Sci
        Indiana University
From saberi@garnet.berkeley.edu Wed Jun 23 08:36:33 1993
Dear Dr. MAD,
You might want to contact David R. Perrott.  He has recently been
comparing the spatial acuity of the visual and auditory systems
under similar conditions, such as when the stimuli are presented
successively in both modalities, instead of simultaneously in vision
and successively in audition.  I'm not sure if these data are
published yet.  He may also have other insights for you.  His
address is:

David R. Perrott
Department of Psychology
California State University
Los Angeles, CA 90032
(213) 343-2266

-Kourosh Saberi
Dept. of Psych
U.C. Berkeley

                                  PART II

From tec@drew.cog.brown.edu Fri Jun 25 09:03:10 1993
From: Tecumseh Fitch <tec@drew.cog.brown.edu>
I and my collaborator Greg Kramer recently reported on some work on
just this topic; it will appear in the proceedings of the ICAD
(International Conference on Auditory Display) conference which
happened in Santa Fe in October '92. The proceedings volume is in
press at Addison-Wesley; it should be out by September.

In brief, we studied a complex task requiring simultaneous monitoring
of many continuously-changing variables, and found an auditory display
to be superior to a standard visual display.  College undergrads were
briefly trained as anesthesiologists: they learned to monitor 8
physiological variables in a computer-simulated human body, and to
respond appropriately to medical emergencies like overdose, blood
loss, etc.  Information from the "digital patient" was presented
through either a standard visual display (a strip chart) or an
auditory display which we created.

The auditory display used two "base streams" which sounded like a
heart beating and a person breathing, respectively. These conveyed
information (not surprisingly) on heart rate and breathing rate.
Other, more abstract, variables were "piggy-backed" onto these base
streams: for example, blood pressure controlled the pitch of the heart
sound, and body temperature controlled the center frequency of the
band-pass filter used to make the breathing sound.
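
To make the "piggy-backing" idea concrete, here is a rough Python sketch of
that kind of parameter mapping. It is not the code used in the study: the
input ranges, the output ranges, the linear scaling, and all function names
are assumptions picked for illustration.

```python
def scale(value, lo, hi, out_lo, out_hi):
    """Map value from [lo, hi] linearly onto [out_lo, out_hi], clamped."""
    t = (value - lo) / (hi - lo)
    t = min(1.0, max(0.0, t))
    return out_lo + t * (out_hi - out_lo)


def heart_stream(heart_rate_bpm, systolic_mmhg):
    """Base stream: beat timing. Piggy-backed: blood pressure -> pitch."""
    return {
        "beat_interval_s": 60.0 / heart_rate_bpm,
        # Assumed ranges: 60..180 mmHg mapped onto 80..320 Hz.
        "pitch_hz": scale(systolic_mmhg, 60.0, 180.0, 80.0, 320.0),
    }


def breath_stream(breaths_per_min, temp_c):
    """Base stream: breath timing. Piggy-backed: temperature -> filter."""
    return {
        "breath_interval_s": 60.0 / breaths_per_min,
        # Assumed ranges: 34..41 deg C mapped onto 400..1800 Hz.
        "bandpass_center_hz": scale(temp_c, 34.0, 41.0, 400.0, 1800.0),
    }


print(heart_stream(heart_rate_bpm=60, systolic_mmhg=120))
print(breath_stream(breaths_per_min=12, temp_c=41))
```

Each function returns synthesis parameters only; turning them into sound is
left to whatever audio back end is available.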

We found that subjects responded faster AND more accurately when using
the auditory display than with the visual display. The results
suggested that subjects formed a gestalt representation of each
medical problem when using the auditory display, and were less able to
do this with the visual display.

These results may not surprise people working in audition: it seems
intuitively obvious that we are able to process multiple information
sources simultaneously in audition (why else would people enjoy
chamber music and symphonies?). In contrast, the primate visual system
is adapted for foveating individual objects serially. As intuitive as
these results may be, they are apparently surprising to some (see the
writeup of our study in this month's American Scientist, 81(3), p. 229).

Other examples (in simpler tasks) are Tzelgov et al. (1987: Human
Factors 29(1): 87-95), who found auditory superiority in a Geiger
counter task, and Lewandowski and Kobus (1989: Human Perf. 2(1):
73-84) who got faster (but less accurate) performance with audition in
a simple sonar target ID task.

> There is a recent human study [Perrott et al.,1993, JASA 93:2134-2138]
> that shows that the auditory modality is as good as, if not better than,
> the visual in determining the relative spatial directions of stimuli
> presented sequentially.

Perrott et al. found no significant difference between minimum audible
angle (MAA) and minimum visible angle (MVA) under their experimental
conditions, but it is important to realize that their MVAs were orders
of magnitude higher than the generally-accepted best values for vision
(less than 10 sec of arc, vs. their 27 minutes of arc!). Thus the
result, while theoretically quite interesting, may have limited
practical importance.
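
For a sense of scale, the two angles can be put in the same unit with a
couple of lines of Python. The 10 arcsec figure is the upper bound quoted
above for the best visual values, so the ratio computed here is a lower
bound on the gap; the helper name is made up for the example.

```python
import math


def arcmin_to_arcsec(arcmin):
    """Convert minutes of arc to seconds of arc."""
    return arcmin * 60.0


reported_mva_arcsec = arcmin_to_arcsec(27)   # 27 minutes of arc
best_visual_arcsec = 10.0                    # "less than 10 sec of arc"

ratio = reported_mva_arcsec / best_visual_arcsec
print(f"ratio: at least {ratio:.0f}x, "
      f"about {math.log10(ratio):.1f} orders of magnitude")
```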

Tecumseh Fitch (tec@cog.brown.edu)
Dept. of Cognitive and Linguistic Sciences
Brown University, Box 1978
Providence, RI 02912
From 70312.265@compuserve.com Fri Jun 25 09:04:16 1993
From: Gregory Kramer <70312.265@compuserve.com>

You should seriously consider two aspects of the auditory system that could
guide your choice of an auditory display project:

1.  Temporal resolution of audition is far superior to vision.  So consider
monitoring and analysis tasks that need good temporal resolution, such as
synchronization, regularity of repeats, etc.

2.  Parallel capability of audition is superior to vision.  So consider
high-dimensional systems, multi-task monitoring, and the like.

Please keep me posted.  By the way, try to get a hold of the proceedings.
They will give you some good thoughts.  (But not until October.)

Be well.

From VOS@nici.kun.nl Fri Jun 25 09:04:27 1993
From: "Piet G. Vos" <VOS@nici.kun.nl>
Bob Port's reply to M-A Decoste's question is a substantial one. Indeed, in
tracking tasks too, e.g. tapping in synchrony with a metronome,
response variability is systematically larger when the metronome is a
visual one (light flashes) instead of the auditory standard. Also
consistent with the explanation that different types of physiological
processing underlie the higher speed and accuracy of auditory
reactions is the well-established fact that a LED-bound
temporal interval is subjectively longer than an objectively equally
long click-bound interval. More generally, auditory perception
is systematically superior in the domain of time perception and
production, and this conclusion is supported by a large number of
convergent comparative studies on this issue (cf. e.g. Walker &
Scott, JEP-HPP '81).

Piet G. Vos
From jfolsen@helix.nih.gov Fri Jun 25 09:04:49 1993
From: "John F. Olsen" <jfolsen@helix.nih.gov>
Fitch and Kramer's multichannel auditory monitor is an intriguing device which
takes advantage of an auditory capability that is familiar to many
neurophysiologists.  In my experiments, I find auditory monitoring of EKG to
be invaluable. Auditory monitoring instantly conveys changes in heart rate,
rhythm, and beat amplitude without the constant attention that a chart recorder
or an oscilloscope requires.  I also find it easy to correlate stimulus and
response through auditory monitoring of the stimulus and action potentials.
Best of all, both tasks can be done apparently simultaneously and in conjunction
with visually based tasks, such as typing instructions on a keyboard.  The
auditory mode seems to be capable of dividing and prioritizing attention to a
greater extent than the visual mode.  In addition, I have found that spatially
separating the audiomonitor that broadcasts the neural response from the one
for the EKG makes both signals easier to follow. Perhaps Fitch and Kramer could
incorporate the auditory spatial dimension in a stereo physiological monitor
and thereby add another channel of information, e.g., CO2 level coded by
elevation, O2 by azimuth.
John Olsen
Lab. Neurophysiology
From malcolm@apple.com Fri Jun 25 09:05:43 1993
From: Malcolm Slaney <malcolm@apple.com>
Our group at Apple has released a couple of cochlear models that other auditory
researchers might find useful.  These services are free... we find these
tools useful for our own work, but we can't support them as you would expect
of normal products.   If you find them useful as they are, great!

We use two different models in our daily work.
        1)      The Patterson-Holdsworth Auditory Filter Bank - Based on a
                critical band auditory model.  This is a linear model with
                fairly modest computational requirements.

                A Mathematica technical report is now available that describes
                a new more efficient implementation.  This technical report
                includes C, Fortran, and Matlab code to design the filters.
                Matlab code to implement the filters is also shown (it
                is only three lines long.)

                To get this report use anonymous FTP to ftp.apple.com.  Look
                for the Gammatone.* files in the /pub/malcolm directory.

                Or send me a note with your postal address and we can send
                you the notebook (includes a paper copy and a Macintosh disk
                with the electronic copy ready for you to modify.)

        2)      The Lyon Passive Shortwave model - A nonlinear model of
                cochlear behaviour based on a passive nonlinearity and a
                shortwave model of cochlear hydrodynamics.  We're working on
                a better model, but this is all we have to offer for now.

                Both a technical report and running code were released a couple
                of years ago.  The code still works and runs on just about
                any Unix machine or a Macintosh.  Other people have ported
                it to DOS.
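
For readers who have not met the gammatone filter behind the first model,
here is a rough Python sketch of its impulse response. This is not the code
from the technical report. It assumes the commonly cited form
t^(n-1) exp(-2 pi b t) cos(2 pi fc t), a widely used approximation for the
equivalent rectangular bandwidth (ERB), and the usual textbook choices of
order 4 and b = 1.019 ERB; the function names are made up.

```python
import math


def erb_hz(fc_hz):
    """Approximate equivalent rectangular bandwidth at centre frequency fc."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)


def gammatone_ir(fc_hz, fs_hz, duration_s, order=4):
    """Sampled, unnormalised gammatone impulse response as a list of floats."""
    b = 1.019 * erb_hz(fc_hz)            # assumed bandwidth scaling
    n_samples = round(fs_hz * duration_s)
    response = []
    for i in range(n_samples):
        t = i / fs_hz
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * b * t)
        response.append(envelope * math.cos(2.0 * math.pi * fc_hz * t))
    return response


ir = gammatone_ir(fc_hz=1000.0, fs_hz=16000.0, duration_s=0.025)
print(len(ir), "samples; ERB at 1 kHz is about",
      round(erb_hz(1000.0), 1), "Hz")
```

A filter bank in this style would repeat the same computation for a set of
centre frequencies spaced along the ERB scale and convolve the input with
each response.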

Finally, we're also offering cochlear modeling by mail.  We've established
a service where you can send an audio file to a special email address, we
compute the cochleagram or correlogram on our Cray, and send you back the
picture or the movie by return email.  The interfaces are a bit shaky yet,
but we're working to make it more stable.  (I used the service to settle an
argument while I was at ARO in Florida a few months ago.)  Both the Lyon
and the Patterson cochlear models are available.  Let me know if you would
find this useful or would like to give it a try.

Send me email if you want the Gammatone Tech Report, have any questions, or
want more information.

                                                Malcolm Slaney
                                                Apple ATG Perception Group
From IN09@musicb.mcgill.ca Mon Jun 28 08:27:39 1993
Return-Path: <@vm1.mcgill.ca:IN09@MUSICB.MCGILL.CA>
Date:        Fri, 25 Jun 93 16:58:43 EDT
From: "Albert Bregman, Tel:  514-398-6103" <IN09@musicb.mcgill.ca>
To: <decostem@IRO.UMontreal.CA>
Cc: ""Auditory" Distribution List" <auditory@vm1.mcgill.ca>
Subject: Audio characteristics better than visual.
In-Reply-To: In reply to your message of TUE 22 JUN 1993 12:32:15 EDT

> Hi there,
>    We're looking for any kind of work on human perception to find
> characteristics of the auditory system that are better than the visual
> system.
We typically encode messages sequentially in sound (e.g., a sentence),
and simultaneously in visual displays (e.g., a printed page.)  For
this reason, sound maintains its message as it bounces around corners.
Light doesn't.  So sound is good for warning systems.
 - Al Bregman

Marc-Andre Decoste                 e-mail: decostem@IRO.umontreal.ca
D.I.R.O., Universite de Montreal        Tel: (514) 343-6111 ext 3513
C.P. 6128, Succ. "A", Montreal, (Qc), CA, H3C 3J7, FX (514) 343-5834
When your eyes are blinding your soul... listen to your heart. (MAD)