The ventriloquist effect is the phenomenon in which sound localization is biased toward the direction of a simultaneously presented visual object. We used this effect to investigate the perceptual ability to match a speech sound to the relevant speaker among several speakers. A speech sound of a 6-mora word was presented to a subject from one of five loudspeakers spaced at intervals of 7.5 deg. At the same time, images of two speakers were presented on two CRT displays placed 30 deg apart in front of the subject. Four picture conditions, defined by the relationship between the speech sound and the visual gestural information, were used: still (no motion)--still, still--unrelated (uttering a word different from the presented speech sound), still--related (uttering the same word as the presented speech sound), and unrelated--related. Subjects reported the location of the sound source while viewing the displayed speakers' faces. The ventriloquist effect was observed only when the ``related'' picture was presented. A second experiment examined the effect of delaying the target sound relative to the visual gestural information. The ventriloquist effect was still observed at delays of up to 300 ms, although it gradually decreased as the delay increased.
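As a rough illustration of the stimulus layout in the first experiment, the sketch below enumerates the loudspeaker positions and picture conditions described above. It assumes that both the loudspeaker array and the pair of displays were centred on the subject's midline (0 deg), which the text does not state; the function name `centred_positions` and all variable names are hypothetical, and the sketch does not model which display showed which picture.

```python
from itertools import product

# Values taken from the text.
LOUDSPEAKER_SPACING_DEG = 7.5
N_LOUDSPEAKERS = 5
DISPLAY_SEPARATION_DEG = 30.0

def centred_positions(n, spacing):
    """Return n azimuths (deg) with the given spacing.

    Assumption: the array is symmetric about 0 deg (straight ahead).
    """
    offset = (n - 1) / 2
    return [(i - offset) * spacing for i in range(n)]

loudspeaker_azimuths = centred_positions(N_LOUDSPEAKERS, LOUDSPEAKER_SPACING_DEG)
display_azimuths = centred_positions(2, DISPLAY_SEPARATION_DEG)

# The four picture conditions named in the text, one label per display.
PICTURE_CONDITIONS = [
    ("still", "still"),
    ("still", "unrelated"),
    ("still", "related"),
    ("unrelated", "related"),
]

# Every combination of sound-source position and picture condition.
trials = list(product(loudspeaker_azimuths, PICTURE_CONDITIONS))

print(loudspeaker_azimuths)  # [-15.0, -7.5, 0.0, 7.5, 15.0]
print(display_azimuths)      # [-15.0, 15.0]
print(len(trials))           # 20
```

Under the centring assumption, the five loudspeakers span the same 30 deg as the two displays, so the outermost loudspeakers coincide in azimuth with the displayed faces.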