Re: Question on defining S/N ratio in speech-in-noise testing (Daniel TAFT)


Subject: Re: Question on defining S/N ratio in speech-in-noise testing
From:    Daniel TAFT  <DTAFT@xxxxxxxx>
Date:    Thu, 13 Aug 2009 08:31:49 +1000
List-Archive:<http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<META NAME="Generator" CONTENT="MS Exchange Server version 6.5.7654.12">
<TITLE>RE: [AUDITORY] Question on defining S/N ratio in speech-in-noise testing</TITLE>
</HEAD>
<BODY>
<!-- Converted from text/plain format -->
<BR>
<P><FONT SIZE=2>Hi Leo,<BR>
<BR>
Your query is more important for the signal level, since noise fluctuates much less. Even 4-talker babble is quite steady compared to speech; you can see this for yourself with box-and-whisker plots of single talker and noise envelopes.<BR>
<BR>
Often the SNR is (or was?) set in the free-field and so averaged over long tracts of speech. I'm sorry I don't have any references on hand.<BR>
<BR>
This would also imply that SNR is calculated before pre-emphasis. I would argue that it is the long run SNR in the environment that counts after all. How you process sound to improve the signal presentation (hopefully) is up to you. I'm not sure of the need to average in small windows; it might be overkill since SNR fluctuation is natural and will even out in the long run anyway.<BR>
<BR>
In my own research with cochlear implants, I generally determined the rms level over each full sentence. Just be aware that silences in each track (if they exist) lead to underestimating the signal level and hence biasing the SNR. So you may wish to use a very small threshold to exclude silences from the calculation.<BR>
<BR>
Regards,<BR>
Daniel Taft.<BR>
<BR>
<BR>
<BR>
-----Original Message-----<BR>
From: AUDITORY - Research in Auditory Perception on behalf of Leonid Litvak<BR>
Sent: Thu 8/13/2009 5:57 AM<BR>
To: AUDITORY@xxxxxxxx<BR>
Subject: [AUDITORY] Question on defining S/N ratio in speech-in-noise testing<BR>
<BR>
Hi All,<BR>
<BR>
I have a question regarding definition of signal-to-noise ratio as it<BR>
applies to speech-in-noise testing, with speech material being sentences. On<BR>
a simple level, SNR is just level of the signal divided by the level of the<BR>
noise.<BR>
<BR>
The signal is typically speech, so its level fluctuates over time. Do people<BR>
typically use the average signal level computed over the whole sentence,<BR>
average signal level computed in 100 ms windows, medium signal level,<BR>
maximum signal level, etc.?<BR>
<BR>
The same question could go for the noise token as well.<BR>
<BR>
I would very much appreciate references to papers that discuss these issues.<BR>
<BR>
Finally, we are interested to apply these tests to cochlear implant<BR>
recipients that have a well-characterized pre-emphasis curve as part of<BR>
their processor. Should the pre-emphasis curve be taken into account when<BR>
computing S/N ratios? This is not an issue for spectrally-matched noises,<BR>
but may be an issue for non-matched noises.<BR>
<BR>
Thank you very much!<BR>
<BR>
Leo<BR>
<BR>
</FONT> </P>
</BODY>
</HTML>
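
For anyone wanting to try the sentence-level approach described in the reply, here is a minimal sketch of one way it could look. It is not the code used in the research mentioned above. The function names (rms, snr_db), the sample-by-sample gate, and the threshold value are all assumptions made for illustration; a frame-based gate would be an equally reasonable choice.

```python
import math

def rms(samples, silence_threshold=0.0):
    """RMS over the samples whose magnitude is at least silence_threshold.

    With the default of 0.0 every sample is kept (plain RMS).
    The sample-wise gate is an illustrative assumption, not a standard.
    """
    active = [s for s in samples if abs(s) >= silence_threshold]
    if not active:
        raise ValueError("no samples at or above the silence threshold")
    return math.sqrt(sum(s * s for s in active) / len(active))

def snr_db(speech, noise, silence_threshold=0.0):
    """SNR in dB from whole-sentence RMS levels.

    The gate is applied to the speech only; the noise is assumed to be
    steady enough that its plain long-run RMS is representative.
    """
    return 20.0 * math.log10(rms(speech, silence_threshold) / rms(noise))

# Toy signals: a "sentence" that is half silence, plus a steady noise.
speech = [0.5, -0.5] * 100 + [0.0] * 200
noise = [0.25, -0.25] * 200

ungated = snr_db(speech, noise)                          # silences included
gated = snr_db(speech, noise, silence_threshold=1e-4)    # silences excluded
```

In this toy case the silent half of the sentence pulls the ungated speech level down, so the ungated SNR comes out about 3 dB lower than the gated one. That is the underestimate the reply warns about.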


This message came from the mail archive
http://www.auditory.org/postings/2009/
maintained by:
Dan Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University