Dissertation on Identification of Environmental Sounds available (Brian Gygi)

Subject: Dissertation on Identification of Environmental Sounds available
From:    Brian Gygi  <bgygi(at)INDIANA.EDU>
Date:    Thu, 12 Jul 2001 14:05:44 -0500

My dissertation, "Factors in the Identification of Environmental Sounds," is now available online in PDF format at http://www.indiana.edu/~k300bg/

The abstract is included below.

Sincerely,
Brian Gygi, Ph.D.
Indiana University

ABSTRACT

Environmental sounds have been little studied in comparison to the other main classes of naturally occurring sounds, speech and music. This dissertation describes a systematic investigation into the acoustic factors involved in the identification of a representative set of 70 environmental sounds.

The importance of various spectral regions for identification was assessed by testing trained listeners on the identification of octave-width bandpass-filtered environmental sounds. The six filter center frequencies ranged from 212 to 6788 Hz. Identifiability was poorest in the lowest filter band, at 31% correct, whereas in the four highest filters performance was consistently between 70% and 80% correct (chance was 1.4%).

The contribution of temporal information to the identifiability of these sounds was estimated by using 1-Channel Event-Modulated Noises (EMN), which have the amplitude envelopes of the environmental sounds used but nearly uniform spectra. Six-Channel EMN, which contained some coarse-grained spectral information, were also used. The identification of both sets of EMN was tested on both Experienced and Naive listeners. With the 1-Channel EMN, Naive listeners performed poorly, achieving only 22% correct, whereas Experienced listeners fared much better, at 46% correct. Naive listeners recognized the 6-Channel EMN much more easily than the 1-Channel EMN, reaching 54% correct. The sounds that were well recognized across all conditions generally had a distinct temporal envelope and few or no salient spectral features.

Some acoustic properties seemed to predict the EMN data fairly well. Combinations of several temporally and spectrally based variables accounted for 60% of the variance for the Experienced listeners with 1-Channel EMN, 39% of the variance for the Naive listeners with 1-Channel EMN, and 50% of the variance for the Naive listeners with 6-Channel EMN. The variables that were common across the three solutions represented acoustic features such as bursts in the amplitude envelope, movement of the spectral centroid, and periodicity.

These findings indicate that environmental sounds are similar to speech in spectral-temporal complexity, in robustness to signal degradation, and in the acoustic cues utilized by listeners.
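
As a rough check on the numbers in the abstract, the sketch below shows how six octave-width bands could span 212 to 6788 Hz and where the 1.4% chance level comes from. The abstract gives only the two end-point center frequencies, so the equal logarithmic spacing of the centers, the placement of each center at the geometric mean of its band edges, and the assumption that chance means guessing uniformly among the 70 sounds are all assumptions made here for illustration, not details taken from the dissertation. The function name `octave_bands` is hypothetical.

```python
import math

N_SOUNDS = 70          # size of the sound set, from the abstract
LOWEST_HZ = 212.0      # lowest filter center frequency, from the abstract
HIGHEST_HZ = 6788.0    # highest filter center frequency, from the abstract
N_BANDS = 6            # number of filter bands, from the abstract


def octave_bands(lowest_hz, highest_hz, n_bands):
    """Return (ratio, bands), where bands is a list of (low, center, high) in Hz.

    Assumption: centers are equally spaced on a log-frequency axis.
    Assumption: each center is the geometric mean of its band edges,
    so an octave-wide band runs from center/sqrt(2) to center*sqrt(2).
    """
    ratio = (highest_hz / lowest_hz) ** (1.0 / (n_bands - 1))
    bands = []
    for k in range(n_bands):
        center = lowest_hz * ratio ** k
        bands.append((center / math.sqrt(2), center, center * math.sqrt(2)))
    return ratio, bands


ratio, bands = octave_bands(LOWEST_HZ, HIGHEST_HZ, N_BANDS)

# Assumption: chance is a uniform guess among all sounds in the set.
chance_percent = 100.0 / N_SOUNDS

if __name__ == "__main__":
    print(f"step between adjacent centers: x{ratio:.4f}")
    for low, center, high in bands:
        print(f"{low:8.1f} Hz  to {high:8.1f} Hz  (center {center:7.1f} Hz)")
    print(f"chance level: {chance_percent:.1f}%")
```

Under these assumptions the step between adjacent centers comes out very close to 2, that is, one octave, which is consistent with six octave-width bands placed end to end.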

This message came from the mail archive
maintained by:
Dan Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University