
Re: Recordings in enclosed spaces.

You said: "This difference between these two tests is
partially attributable to the overall intelligibility level, but nonetheless
leads me to the belief that speaker problems may interact with the
reverberation to cause an intelligibility loss."

It does seem to me that one problem may arise simply because a loudspeaker
is a high-pressure/low-volume device, whereas many sounding objects
(such as human speakers) are the opposite: low-pressure/high-volume. So a
loudspeaker will excite local nodes in quite a different way.

"Someone brought up the question if the subjects could tell they were
to recordings.  Several subjects after completing the experiments in the
anechoic chamber commented how the recordings and speaker presentations
the same.  The recordings made in reverberation did not draw the same
and when I listen I can tell the difference between reverberation and
reverberation." -

-It would be interesting to see how people fared when both types of source
are operating simultaneously, especially so where a human speaker and a
recording of the same are used. My guess is that some confusion may arise,
especially where a particular topic is common to both.

"Possible causes
1. Speakers
    As seems to be a consensus of the list, the loudspeaker system is the
first to blame for reduced intelligibility or quality.  To compensate for loudspeaker
limitations, I used two speakers without an enclosure mounted face to face
wired in phase.  Thus creating a volume source similar to a monopole,
anechoic tests show that this condition is met at lower frequencies but not
higher frequencies.  Of course a human voice is not a monopole, but I
this type of source because I was also testing simulations that used
sources.  The tests showed insignificant differences between the simulations
and listening to the speaker in the room, thus leading to the conclusion
the monopole approximation was acceptable."

-Again, a monopole is a funny sort of object (like a point-source), and not
so many 'real' objects are like monopoles. The nearest I can think of is a
mosquito, or a cricket (we don't have cicadas here). I find crickets quite
hard to find quickly, though one near a wall is easier.
I've been working on a notion that 'facingness' (i.e. audio output asymmetry) is
quite a useful attribute of 'things' in 'places', especially where the
'place' in question is non-anechoic and asymmetrical. But it seems to me
that many of the usable characteristics of facingness disappear in anechoic
circumstances. John Neuhoff (also on this list) has been doing some
experiments on 'facingness'.

"2. Receivers
The other aspect which has received considerable attention is the importance
interaural clues.  These clues are undoubtedly important for multiple
of speech and noise.  However I have not yet found any research showing that
localization of multiple echoes is a viable means of removing reverberation
complex signals. (To my knowledge the precedence effect has not been
to include complex sounds and multiple echoes.)" -

- I may have misunderstood; is it really necessary to say that multiple
echoes must actually be *localised* to facilitate suppression or
'subtraction'? One could understand how they would only need to be
'negatively localised' (i.e. "not from there") to be suppressed.

"Closing comments,
Even with this extensive explanation I am still left with the question about
why in reverberant settings do recordings of loudspeakers reduce
intelligibility?  My current focus is on the inexpensive equipment that my
budget has permitted me to buy.  I am aware of some clipping of the louder
speech phonemes.  These had little effect on the anechoic tests, thus my
hypothesis that clipping interacts with reverberation.  Noise is known to
interact with reverberation and cause a greater decrease in intelligibility
than the sum of the individual effects alone.   I estimate the S/N levels in
all of my tests to be >= 11 dB (considering bands 32-16000 Hz) and >= 22 dB
(considering bands 125-4000 Hz).  I don't want to admit my "hi-fi" isn't
enough and it certainly isn't as interesting as binaural theories, but until
the next set of tests proves otherwise it must remain in question." -

-Certainly, speaker crossovers, any restriction on system headroom (PSU etc.),
any signal processing and AD/DA conversions, as well as the above comments
about high-SPL/low-volume devices, all serve to 'flatten' transients and
homogenise temporal 'edges'. Sort of 'artificial dyslexia'!
It would be interesting to do your intelligibility experiments in a variety
of spatial-sound codecs (stereo, 5.1, ambisonics, wavefield synthesis,
etc.), as 'separability' and directional localisation are generally design
criteria, but I'm not sure how often intelligibility is used as a test. In
any event, those approaches that attempt wave-front reconstruction also
achieve 'perceptual de-localisation' of loudspeakers (to a greater or lesser
extent). Some, such as wavefield synthesis, do achieve a near-monopole image,
and one which is quite localisable to the ambulant percipient.
Probably still too many variables, though!
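
As a rough illustration of the 'flattening' of transients mentioned above,
here is a minimal Python sketch. It is not taken from the experiments under
discussion: the test signal (a decaying tone burst with an abrupt onset), the
clip level of 0.3 and all function names are assumptions chosen purely for
illustration. It hard-clips the burst and compares the crest factor
(peak-to-RMS ratio) before and after, as one crude measure of how much the
temporal 'edge' has been squashed.

```python
import math

def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB; lower values mean flatter transients."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(peak / rms)

def hard_clip(samples, limit):
    """Limit every sample to the range [-limit, +limit]."""
    return [max(-limit, min(limit, s)) for s in samples]

def tone_burst(n=1000, onset=100, decay=0.99, cycles_per_sample=0.05):
    """Hypothetical test signal: silence, then an abrupt decaying tone."""
    out = []
    for i in range(n):
        if i < onset:
            out.append(0.0)
        else:
            k = i - onset
            # Cosine phase, so the burst starts at full amplitude.
            out.append((decay ** k) * math.cos(2.0 * math.pi * cycles_per_sample * k))
    return out

signal = tone_burst()
clipped = hard_clip(signal, 0.3)  # assumed clip level, for illustration only

print("crest factor, original: %.1f dB" % crest_factor_db(signal))
print("crest factor, clipped:  %.1f dB" % crest_factor_db(clipped))
```

The clipped burst should show a lower crest factor than the original, i.e.
the onset stands out less against the rest of the signal. This says nothing
by itself about intelligibility or about the interaction with reverberation;
it only makes the 'flattened edge' idea concrete.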