Re: Location of phonemic boundaries

You are right. The characterization that you give paraphrases a classic descriptive paper by Fant. Liberman showed the perceptual effects, and Stevens considered the consequences of the asynchronous correspondence between phoneme segments and their acoustic correlates in a feature-based account of speech perception and lexical access.

Here are the citations:

Fant, C. G. M. (1962). Descriptive analysis of the acoustic aspects of speech. Logos, 5, 3-17.

Liberman, A. M. (1970). The grammars of speech and language. Cognitive Psychology, 1, 301-323.

Stevens, K. N. (2005). Features in speech perception and lexical access. In D. B. Pisoni and R. E. Remez (Eds.), The Handbook of Speech Perception (pp. 125-155). Oxford: Blackwell.

Each of these papers is readily available, I believe.

On Aug 12, 2008, at 6:03 PM, Athanassios Protopapas wrote:

But the phonemes are still overlapping, aren't they?
That is to say, the perceptual beginning of each phoneme typically
precedes the perceptual end of at least one previous phoneme.
Therefore, this procedure may help identify reliably some points in
time meeting specific criteria, but does not help segment time into
individual phonemes. And the extent of overlap varies quite
substantially, depending on phonemes, speaker, conditions, tempo, etc.

On Tue, Aug 12, 2008 at 7:42 PM, Richard Warren <rmwarren@xxxxxxx> wrote:
Dear List,

    Jim Miller pointed out on 8 August that "It is well established that the
acoustic information used by a listener to identify a consonant or a vowel
is overlapping and distributed acoustically across a considerable span of
time."  He indicated that although some have attempted to identify the
acoustic locations of consonants and vowels in running speech, they have for
the most part failed since coarticulation extends well into adjacent
phonemes.  But if the question is changed from "acoustic" boundaries to
"perceptual" boundaries, the task becomes rather easy.

     When a sentence is abruptly terminated, the last speech sound is easily
perceived.  By using an arbitrary starting point before the beginning of a
recorded sentence, and moving the time of the cutoff through the sentence,
it is easy to map the perceptual beginning and end of each phoneme within a
few milliseconds.  We have been using this procedure for several decades.

