Re: Recordings in enclosed spaces. (Peter Lennox)

Subject: Re: Recordings in enclosed spaces.
From:    Peter Lennox  <peter(at)LENNOX01.FREESERVE.CO.UK>
Date:    Tue, 4 Sep 2001 14:24:24 +0100

You said:

"The difference between these two tests is partially attributable to the overall intelligibility level, but nonetheless leads me to the belief that speaker problems may interact with the reverberation to cause an intelligibility loss."

- It does seem to me that one problem may arise simply because a loudspeaker is a high-pressure/low-volume device, whereas many sounding objects (such as human speakers) are the opposite: low-pressure/high-volume. So a loudspeaker will excite local nodes in quite a different way.

"Someone brought up the question of whether the subjects could tell they were listening to recordings. Several subjects, after completing the experiments in the anechoic chamber, commented on how the recordings and speaker presentations sounded the same. The recordings made in reverberation did not draw the same comments, and when I listen I can tell the difference between reverberation and recorded reverberation."

- It would be interesting to see how people fared when both types of source are operating simultaneously, especially where a human speaker and a recording of the same speaker are used. My guess is that some confusion may arise, especially where a particular topic is common to both.

"Possible causes

1. Speakers

As seems to be a consensus of the list, the loudspeaker system is the first to blame for reduced intelligibility or quality. To compensate for loudspeaker limitations, I used two speakers without an enclosure, mounted face to face and wired in phase, thus creating a volume source similar to a monopole. Informal anechoic tests show that this condition is met at lower frequencies but not at higher frequencies. Of course a human voice is not a monopole, but I selected this type of source because I was also testing simulations that used monopole sources. The tests showed insignificant differences between the simulations and listening to the speaker in the room, thus leading to the conclusion that the monopole approximation was acceptable."

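
To illustrate the quoted remark that the monopole condition holds at lower but not at higher frequencies, here is a minimal sketch. It assumes, purely for illustration, that the face-to-face pair can be idealised as two equal in-phase point sources a distance d apart. The spacing of 0.1 m, the speed of sound of 343 m/s, and the function names are all hypothetical and are not taken from the experiment described. Under that idealisation the normalised far-field magnitude at angle theta from the axis joining the sources is |cos((k d / 2) cos(theta))|, with k = 2 pi f / c.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for room-temperature air


def directivity(freq_hz, spacing_m, theta_rad):
    """Normalised far-field magnitude of two equal in-phase point sources.

    theta_rad is measured from the axis joining the two sources.
    A value of 1.0 means the same level as in the broadside direction.
    """
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND  # wavenumber in rad/m
    return abs(math.cos(0.5 * k * spacing_m * math.cos(theta_rad)))


def max_deviation_db(freq_hz, spacing_m, steps=181, floor=1e-6):
    """Largest drop in dB below the broadside level, sampled over 0..180 degrees.

    The floor only guards against taking the log of an exact null.
    """
    worst = 1.0
    for i in range(steps):
        theta = math.pi * i / (steps - 1)
        worst = min(worst, directivity(freq_hz, spacing_m, theta))
    return -20.0 * math.log10(max(worst, floor))


if __name__ == "__main__":
    spacing = 0.1  # metres; hypothetical driver spacing
    for freq in (100, 500, 1000, 4000):
        dev = max_deviation_db(freq, spacing)
        print(f"{freq:5d} Hz: max deviation from omnidirectional = {dev:.2f} dB")
```

With the assumed 0.1 m spacing, the pattern stays within a small fraction of a dB of omnidirectional at 100 Hz, whereas by 4 kHz deep nulls appear. A real pair of unbaffled drivers will of course behave differently in detail; the sketch only shows why a source of finite size stops looking like a monopole once the wavelength becomes comparable to its dimensions.
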
- Again, a monopole is a funny sort of object (like a point source), and not many 'real' objects are like monopoles. The nearest I can think of is a mosquito, or a cricket (we don't have cicadas here). I find crickets quite hard to find quickly, though one near a wall is easier. I've been working on a notion that 'facingness' (audio output asymmetry) is quite a useful attribute of 'things' in 'places', especially where the 'place' in question is non-anechoic and asymmetrical. But it seems to me that many of the usable characteristics of facingness disappear in anechoic circumstances. John Neuhoff (also on this list) has been doing some experiments on 'facingness'.

"2. Receivers

The other aspect which has received considerable attention is the importance of interaural cues. These cues are undoubtedly important for multiple sources of speech and noise. However, I have not yet found any research showing that localization of multiple echoes is a viable means of removing reverberation for complex signals. (To my knowledge the precedence effect has not been extended to include complex sounds and multiple echoes.)"

- I may have misunderstood; is it really necessary to say that multiple echoes must actually be *localised* to facilitate suppression or 'subtraction'? One could understand how they would only need to be 'negatively localised' (i.e. "not from there") to be suppressed.

"Closing comments

Even with this extensive explanation I am still left with the question of why, in reverberant settings, recordings of loudspeakers reduce intelligibility. My current focus is on the inexpensive equipment that my budget has permitted me to buy. I am aware of some clipping of the louder speech phonemes. These had little effect on the anechoic tests, thus my hypothesis that clipping interacts with reverberation. Noise is known to interact with reverberation and cause a greater decrease in intelligibility than the sum of the individual effects alone. I estimate the S/N levels in all of my tests to be >= 11 dB (considering bands 32-16000 Hz) and >= 22 dB (considering bands 125-4000 Hz). I don't want to admit my 'hi-fi' isn't good enough, and it certainly isn't as interesting as binaural theories, but until my next set of tests proves otherwise it must remain in question."

- Certainly, speaker crossovers, any restriction on system headroom (PSU etc.), any signal processing and AD/DA conversions, as well as the above comments about high-SPL/low-volume devices, all serve to 'flatten' transients and homogenise temporal 'edges'. Sort of 'artificial dyslexia'!

It would be interesting to do your intelligibility experiments in a variety of spatial-sound codecs (stereo, 5.1, ambisonics, wavefield synthesis, etc.), as 'separability' and directional localisation are generally design criteria, but I'm not sure how often intelligibility is used as a test. In any event, those approaches that attempt wave-front reconstruction also achieve 'perceptual de-localisation' of loudspeakers (to a greater or lesser extent). Some, such as wavefield synthesis, do achieve a near-monopole image, and one which is quite localisable to the ambulant percipient. Probably still too many variables, though!

Sincerely,
ppl

This message came from the mail archive
maintained by:
DAn Ellis
Electrical Engineering Dept., Columbia University