1aSC2. Audiovisual integration of speech based on minimal visual information.

Session: Monday Morning, December 2


Author: D. H. Whalen
Location: Haskins Labs., 270 Crown St., New Haven, CT 06511
Author: Julia Irwin
Location: Haskins Labs., New Haven, CT 06511
Author: Carol A. Fowler
Location: Univ. of Connecticut


Two competing theories have been proposed to explain the fact that vision can dominate audition in syllables that have been spliced so that the two modalities specify different phonemes [McGurk and MacDonald, Nature 264, 746--748 (1976)]. The first theory states that acoustic and visual information are combined in varying proportions depending on how strong the information is in each signal. The second proposes that the visual signal has linguistic value because speech gestures can be conveyed visually, and that these gestures are the primitives of speech perception in every modality. The present experiment contrasts dynamic and static visual information by reducing the visual signal to two or three video frames, synchronized with the speech at the appropriate point. Dynamic stimuli had at least two frames showing movement of the mouth, while static stimuli had a single frame, taken from the consonant closure, repeated to make a three-frame visual image. Even these brief images were enough to elicit speech percepts that matched the visual image. The dynamic and static images were equally effective, suggesting that both theories need revision. [Work supported by NIH Grant No. HD-01994.]

ASA 132nd meeting - Hawaii, December 1996