
Re: question about speech level



Ali,

A technical detail to consider is that the AC40 typically describes signals in dB HL rather than dB SPL. For a broadband speech signal, the conversion is probably trivial for you, but be sure to adjust both the target and the masker on the same scale if you’re comparing levels for SNR.
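
To illustrate keeping both signals on one scale, here is a minimal sketch in Python (it should port almost line for line to Matlab). The function names rms_db and scale_speech_for_snr are made up for this example. Levels are in dB relative to digital full scale, not dB SPL or dB HL, so mapping them to an absolute level still depends on how the AC40 output is calibrated. The sketch holds the noise fixed and rescales the speech, as in the 65 dB / -10 dB SNR example from the original question.

```python
import math

def rms_db(x):
    """RMS level of a sample sequence in dB re full scale (amplitude 1.0).

    Assumes x is non-empty and not all zeros.
    """
    mean_square = sum(s * s for s in x) / len(x)
    return 10.0 * math.log10(mean_square)

def scale_speech_for_snr(speech, noise, snr_db):
    """Return speech rescaled so that rms_db(speech) - rms_db(noise) == snr_db.

    The noise is left untouched; only the speech gain changes.
    """
    gain_db = rms_db(noise) + snr_db - rms_db(speech)
    gain = 10.0 ** (gain_db / 20.0)
    return [gain * s for s in speech]

def mix(speech, noise):
    """Sample-by-sample sum of two equally long signals."""
    return [s + n for s, n in zip(speech, noise)]
```

Note that rms_db here runs over the whole file, so any silence in the speech recording still lowers its measured level.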


Tony’s suggestion looks like a great way of getting an appropriate SNR, but I would consider more than just leading & trailing pauses. The presence of silences within the sentence is another problem. A recording of a sentence spoken by a talker who pauses a lot (say, “less-fluent”) contains less energy than an equally long recording of a more fluent talker. Thus, less noise is required to mask the less-fluent talker to reach the same SNR. This creates an unfair advantage for the less-fluent talker's target signal (and on the word level, an unfair advantage for words ending in voiceless stops). I say unfair because the perceived loudness of the actual signal content could be the same, but the signals are masked unevenly. Perhaps the ITU Objective Measurement of Active Speech Level (ITU-T Recommendation P.56) will address that problem as well.
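
To make the pause problem concrete, here is a minimal Python sketch of a level measurement that ignores silent stretches. This is a crude frame-energy gate, not the ITU method. The name active_rms_db and the values of frame_len and gate_db are assumptions chosen only for illustration.

```python
import math

def active_rms_db(x, frame_len=160, gate_db=30.0):
    """RMS level in dB re full scale, computed over 'active' frames only.

    A frame counts as active when its mean-square power is within gate_db
    of the most powerful frame. frame_len and gate_db are illustrative
    values, not taken from any standard. Assumes x holds at least one
    full frame and is not all zeros.
    """
    # Split into non-overlapping frames, dropping any incomplete tail.
    frames = [x[i:i + frame_len]
              for i in range(0, len(x) - frame_len + 1, frame_len)]
    powers = [sum(s * s for s in f) / frame_len for f in frames]
    # Gate relative to the loudest frame, converted from dB to a power ratio.
    threshold = max(powers) * 10.0 ** (-gate_db / 10.0)
    active = [p for p in powers if p >= threshold]
    # Average power over active frames only, expressed in dB.
    return 10.0 * math.log10(sum(active) / len(active))
```

With a measure like this, a sentence padded with silence and the same sentence trimmed come out at the same level, so pauses no longer reduce the amount of noise needed for a given SNR.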

 

Matt

 

 



On Mon, Jul 23, 2012 at 4:45 AM, ali fallah <ali.fallah@xxxxxxxxx> wrote:
Dear List
 
To present speech at a desired level, I have an AC40 audiometer that produces a speech signal at a selected dB SPL from a wav file. The recordings have silent periods before and after the speech signal. If I delete these silent periods, the sound intensity that the audiometer produces does not change. Now I want to prepare mixtures of speech and noise at arbitrary SNR values in Matlab, but the RMS value of the speech signal depends on the silent gaps.
 
I have two questions about this. First, since a speech signal is nonstationary, how is its level in dB SPL defined and set? Second, how can I mix speech and noise in Matlab with the noise at a desired level (for example, noise level = 65 dB SPL and SNR = -10 dB, so that the speech is at 55 dB SPL)?
I would be grateful if anyone could guide me on this and point me to related documents.
 
best regards