With regard to temporal analysis, an
alternative to the equivalent sound pressure level (i.e. the long-term rms) is the
‘active speech level’, as defined by ITU-T P.56. This disregards the ‘silences’
in the speech, and can also quantify the ratio of speech to ‘silence’.
This seems to be quite a sensible approach to quantifying speech level, but I
am not sure whether it is used much in research beyond the telecommunications field.
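To make the difference between the two measures concrete, here is a rough sketch. It is not the ITU-T P.56 algorithm: it is a simplified frame-threshold stand-in, and the function names, the 20 ms frame length and the 15 dB margin are my own assumptions chosen for illustration only. It only shows how excluding ‘silent’ frames raises the measured level relative to the long-term rms, and how an activity ratio falls out of the same calculation.

```python
import math

def rms_db(x):
    """Long-term rms level in dB re full scale (amplitude 1.0)."""
    power = sum(v * v for v in x) / len(x)
    return 10 * math.log10(power)

def active_level_db(x, fs, win_s=0.02, margin_db=15.0):
    """Simplified 'active' level (NOT ITU-T P.56).

    Frames whose power lies within margin_db of the loudest frame are
    treated as speech; all other frames are treated as 'silence'.
    Returns (level of the active frames in dB, fraction of active frames).
    win_s and margin_db are illustrative assumptions.
    """
    n = max(1, int(win_s * fs))
    # Non-overlapping frames; a trailing partial frame is dropped.
    powers = [sum(v * v for v in x[i:i + n]) / n
              for i in range(0, len(x) - n + 1, n)]
    threshold = max(powers) * 10 ** (-margin_db / 10)
    active = [p for p in powers if p >= threshold]
    activity = len(active) / len(powers)
    return 10 * math.log10(sum(active) / len(active)), activity
```

For a token that is half speech and half silence, the long-term rms comes out about 3 dB below the active level, which is exactly the kind of discrepancy that matters when two speech materials differ in how much pause they contain.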
From: AUDITORY - Research in Auditory Perception on behalf of Leonid Litvak
Sent: Thu 8/13/2009 5:57 AM
Subject: [AUDITORY] Question on defining S/N ratio in speech-in-noise testing
I have a question regarding definition of signal-to-noise ratio as it
applies to speech-in-noise testing, with speech material being sentences. On
a simple level, SNR is just the level of the signal divided by the level of the noise.
The signal is typically speech, so its level fluctuates over time. Do people
typically use the average signal level computed over the whole sentence,
average signal level computed in 100 ms windows, median signal level,
maximum signal level, etc.?
The same question could go for the noise token as well.
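The candidate definitions listed above can give quite different numbers for the same token. The sketch below is only an illustration of that point, not a recommendation of any one definition; the function names, the dictionary keys and the small power floor are assumptions of mine. It computes each level in dB and forms the SNR as the difference of the signal and noise levels under the same definition.

```python
import math
import statistics

def level_db(x):
    """rms level in dB re full scale; a small floor avoids log(0)."""
    power = sum(v * v for v in x) / len(x)
    return 10 * math.log10(max(power, 1e-12))

def windowed_levels_db(x, fs, win_s=0.1):
    """Levels of consecutive non-overlapping windows (100 ms by default)."""
    n = int(win_s * fs)
    return [level_db(x[i:i + n]) for i in range(0, len(x) - n + 1, n)]

def level_summaries(x, fs):
    """Several ways of reducing a fluctuating signal to a single level."""
    w = windowed_levels_db(x, fs)
    return {
        "overall_rms": level_db(x),              # whole-token rms
        "mean_of_windows": statistics.mean(w),   # mean of window dB values
        "median_window": statistics.median(w),
        "max_window": max(w),
    }

def snr_db(signal, noise, fs, key="overall_rms"):
    """SNR in dB using the same level definition for signal and noise."""
    return level_summaries(signal, fs)[key] - level_summaries(noise, fs)[key]
```

Note that averaging window levels in dB is not the same as the whole-token rms: the dB average is pulled down by quiet windows, so the two can differ by several dB for strongly modulated speech.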
I would very much appreciate references to papers that discuss these issues.
Finally, we are interested in applying these tests to cochlear implant
recipients who have a well-characterized pre-emphasis curve as part of
their processor. Should the pre-emphasis curve be taken into account when
computing S/N ratios? This is not an issue for spectrally-matched noises,
but may be an issue for non-matched noises.
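A small sketch of why the pre-emphasis matters only for non-matched noises. The first-order filter and its coefficient below are a hypothetical stand-in, not the pre-emphasis curve of any actual processor, and the function names are my own. The same filter is applied to speech and noise before the levels are computed.

```python
import math

def pre_emphasis(x, a=0.95):
    """First-order high-pass y[n] = x[n] - a*x[n-1].

    Hypothetical stand-in for a processor's pre-emphasis curve;
    the coefficient a is an assumption for illustration.
    """
    return [x[0]] + [x[i] - a * x[i - 1] for i in range(1, len(x))]

def rms_db(x):
    """rms level in dB re full scale."""
    return 10 * math.log10(sum(v * v for v in x) / len(x))

def snr_db(speech, noise, emphasis=None):
    """SNR in dB, optionally after passing both inputs through emphasis."""
    if emphasis is not None:
        speech, noise = emphasis(speech), emphasis(noise)
    return rms_db(speech) - rms_db(noise)
```

If speech and noise have the same spectrum, the filter changes both levels by the same amount and the SNR is unchanged. If the noise has relatively more high-frequency energy than the speech, the SNR after pre-emphasis is lower than the nominal acoustic SNR, so the two figures need to be reported separately.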
Thank you very much!