[no subject]

Dear Al,

Thanks for your quick response.  My demonstration may not be an
appropriate one for your purpose, but I thought it showed
something related to your stuff.  Your explanation (derived
from Whalen & Liberman's idea) seems basically plausible.  But
I would like to put the following questions back to you, for all
of us to think over:

1. Why did my listeners hear several human voices instead of
several pure tones or non-speech sounds?  It is strange that
the components left over for non-speech percepts produce the
perceptual impression of human voices uttering the same vowel.

2. Why doesn't the same phenomenon take place when we listen
to natural speech?  If we need just a small amount of energy for
phoneme perception and the rest is used to perceive other voices
or sounds, we would expect to hear more than one voice when
listening to someone's speech.

My tentative explanation of my demonstration presumes that the
perception of the vowel and the perception of the inharmonic
components are performed in parallel.  1) The vowel /a/, /i/
or /u/ is recognized from the fixed spectral envelope.  2) We
perceive several voices (or tones) because we cannot fuse the
inharmonic components into a single voice.  3) The vowel
recognized in the first stage is allocated to the voices
perceived in the second stage.  Note that there is no reason
to assume that the first stage, which is schema-based, should
be finished before the more primitive second stage.  Taking
into account the questions I put above, I think my present
explanation is the most plausible one for the time being.  But
I'm open to any criticism or "debugging".
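
For anyone who wants to try a stimulus of this general kind, here is a
minimal sketch in Python.  It is not the procedure used for the
demonstration: the formant values, the stretched-partial rule
(f0 * n**stretch), and all function names are assumptions chosen only
to illustrate "inharmonic components under a fixed spectral envelope".

```python
import math

# Assumed (centre Hz, bandwidth Hz) pairs for a rough /a/-like envelope.
# Illustrative values only; they are not taken from the demonstration.
FORMANTS_A = [(700.0, 130.0), (1200.0, 150.0), (2600.0, 200.0)]


def envelope_gain(freq_hz, formants=FORMANTS_A):
    """Fixed spectral envelope: a sum of resonance-shaped peaks (hypothetical)."""
    return sum(1.0 / (1.0 + ((freq_hz - fc) / (bw / 2.0)) ** 2)
               for fc, bw in formants)


def inharmonic_partials(f0=110.0, stretch=1.08, max_hz=4000.0):
    """Partials at f0 * n**stretch; stretch=1.0 would give a harmonic series."""
    freqs = []
    n = 1
    while f0 * n ** stretch <= max_hz:
        freqs.append(f0 * n ** stretch)
        n += 1
    return freqs


def synthesize(freqs, duration_s=0.5, sample_rate=16000):
    """Sum sinusoids at the given frequencies, weighted by the fixed envelope."""
    gains = [envelope_gain(f) for f in freqs]
    n_samples = int(duration_s * sample_rate)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate
        samples.append(sum(g * math.sin(2.0 * math.pi * f * t)
                           for f, g in zip(freqs, gains)))
    peak = max(abs(s) for s in samples)
    return [s / peak for s in samples]  # normalise to the range [-1, 1]


partials = inharmonic_partials()
signal = synthesize(partials)
```

Setting stretch=1.0 gives a harmonic comparison tone with the same
envelope, and the sample list can be written out with the standard
`wave` module for listening.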

                                  Yoshitaka Nakajima