noise-masking experiments and labiovelars
Hi listers, I have two related questions for you.
I'm interested in the effects of following vowels on the
perception of labiovelars. I was wondering 1) whether anybody
has done the experiment I'm proposing and 2) whether anybody has
tried using stimuli generated in the manner I'm proposing.
1) There's a well-known diachronic pattern whereby labiovelars
like [k^w] (English <qu>) followed by rounded vowels like [u]
dissimilate, with outcomes like [ku], [pu], [u], etc. I wanted
to investigate whether all these patterns could be motivated
strictly by misperception. Subjects will hear stimuli consisting
of one of the stops [p t k k^w] followed immediately by one of
the vowels [i e a o u] (as pronounced by a human speaker), masked
by varying degrees of noise. Each stimulus is followed
immediately by a forced choice among [p t k k^w].
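To make the design concrete, here is a minimal sketch of the trial list in Python. The noise levels are hypothetical placeholders (I haven't fixed the actual degrees of masking yet), and `build_trials`, `SNR_LEVELS_DB`, and the ASCII labels for the stops are names made up for this illustration only.

```python
import itertools
import random

STOPS = ["p", "t", "k", "kw"]          # "kw" stands in for [k^w]
VOWELS = ["i", "e", "a", "o", "u"]
# Hypothetical signal-to-noise ratios in dB; the real levels are undecided.
SNR_LEVELS_DB = [0, -5, -10]


def build_trials(repetitions=1, seed=0):
    """Cross every stop with every vowel and noise level, then shuffle."""
    trials = [
        {"stop": stop, "vowel": vowel, "snr_db": snr, "choices": list(STOPS)}
        for stop, vowel, snr in itertools.product(STOPS, VOWELS, SNR_LEVELS_DB)
        for _ in range(repetitions)
    ]
    # A seeded shuffle keeps the presentation order reproducible.
    random.Random(seed).shuffle(trials)
    return trials
```

Every trial carries the same four response options, so confusions can be tabulated per following vowel and per noise level.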
Surprisingly, I haven't been able to find any prior work doing
this. The closest I've gotten is the following, which crucially lacks labiovelars:
H. Winitz, M. Scheib, and J. Reeds. Identification of Stops and
Vowels for the Burst Portion of /p, t, k/ Isolated from
Conversational Speech. Journal of the Acoustical Society of
America, 51:1309-1317, 1972.
Does anyone know of any work with the relevant stimuli?
2) The masking studies I'm aware of seem to be essentially
adding random samples to the original stimuli to generate their
noise conditions. I was wondering if anyone had tried to make
this more natural by matching the intensity contour of the noise
to the intensity contour of the original stimuli.
The procedure is as follows. I calculate the RMS amplitude
contour of the original signal by convolving the squared signal
with a Kaiser window (beta = 20) and taking the square root of
the result. The window length is 3.2 times the number of samples
in a single period of the lowest expected pitch, which I set at
100 Hz (young female speaker). I then crop the ends to get an
intensity signal of the same length as the original signal. I
believe this is the procedure described in the Praat manual. I
then multiply a scaled copy of this intensity signal by values
sampled from the uniform distribution on [-1, 1], add the
resulting noise to the original stimulus, and renormalize.
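A minimal NumPy sketch of the steps above, in case it helps to see them spelled out. Several details are assumptions made for illustration rather than part of the description: the window is scaled to unit sum, cropping is done with `mode="same"`, "renormalize" is taken to mean peak normalization, and the function names and the `noise_gain` parameter are made up here.

```python
import numpy as np


def intensity_envelope(signal, sample_rate, min_pitch=100.0, beta=20.0):
    """RMS contour: square, smooth with a Kaiser window, take the square root."""
    # Window length: 3.2 periods of the lowest expected pitch, in samples.
    n_points = int(round(3.2 * sample_rate / min_pitch))
    if len(signal) < n_points:
        raise ValueError("signal is shorter than the analysis window")
    window = np.kaiser(n_points, beta)
    window /= window.sum()  # assumption: unit-sum window (weighted mean square)
    # mode="same" crops the ends so the contour matches the signal length.
    mean_square = np.convolve(signal ** 2, window, mode="same")
    return np.sqrt(np.maximum(mean_square, 0.0))


def add_envelope_noise(signal, sample_rate, noise_gain=1.0, rng=None):
    """Add uniform noise whose amplitude follows the signal's RMS contour."""
    rng = np.random.default_rng() if rng is None else rng
    envelope = intensity_envelope(signal, sample_rate)
    noise = noise_gain * envelope * rng.uniform(-1.0, 1.0, size=signal.shape)
    mixed = signal + noise
    # Assumption: "renormalize" means rescaling to a peak amplitude of 1.
    peak = np.max(np.abs(mixed))
    return mixed / peak if peak > 0 else mixed


def make_test_burst(sample_rate=16000):
    """Synthetic stand-in for a stimulus: silence, a 200 Hz tone, silence."""
    t = np.arange(int(0.1 * sample_rate)) / sample_rate
    tone = 0.5 * np.sin(2 * np.pi * 200.0 * t)
    silence = np.zeros(int(0.2 * sample_rate))
    return np.concatenate([silence, tone, silence])
```

For example, `add_envelope_noise(make_test_burst(), 16000, noise_gain=2.0)` returns a noisy copy in which the silent stretches stay silent except near the tone's edges, where the window smears the contour slightly.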
This sounds, to my ear, more natural than the stimuli simply in
a bed of noise, though because of the width of the smoothing
window, I perceive a tiny bit of "anticipation" of the noise, and
an equal amount of lag as well, though it is very subtle. Has
anyone tried this? Has it been compared to the "bed of noise"
condition in the past? The "bed of noise" condition seems
to be more like "being in the same room as a constant loud
noise", whereas the "envelope of noise" I describe seems more
likely to turn up errors caused by human production/perception.
Kyle Gorman ~ kgorman@xxxxxxxxxxxxxx ~ 513 405 2543