Coarticulatory acoustic variation is presumed to be caused by temporally overlapping, linguistically significant gestures of the vocal tract. The complex acoustic consequences of such gestures can specify them without recourse to context-sensitive representations of phonetic segments. When the consequences of separate gestures converge on a common acoustic dimension, as during coarticulation, resolving the distinct gestural events requires perceptual parsing of the overlapping spoken gestures rather than associations of acoustic features. Direct tests of this theory revealed mutual influences of (1) fundamental frequency during a vowel on perception of the preceding consonant and (2) consonant identity on perception of the stress and pitch of the following vowel. The results of these converging tests lead to the conclusion that speech perception involves a process in which coarticulated segments are parsed from the acoustic stream of speech along gestural lines.