This study investigated four factors in vowel perception. Two are linguistic: the F1 value and the identity of the visually presented word. The other two are speaker-related: the F0 value and the visual gender of the speaker. The auditory stimuli were tokens from continua ranging from ``hood'' to ``hud,'' one for a male voice and one for a female voice, produced by LPC resynthesis that retained the original male or female voice source and altered the formant values. These continua were matched with movies of male and female speakers saying either ``hood'' or ``hud,'' to create four series of stimuli. From identification judgments made by 20 listeners, we calculated 50 crossover points on the F1 ``hood''--``hud'' continuum. These boundaries were analyzed with a three-factor ANOVA (visual gender, visual word, and original voice). There were three main effects and no interactions. The visual word effect shows that visual and auditory cues are integrated in determining vowel quality. The original voice effect is essentially an F0 normalization effect. The visual gender effect shows auditory-visual integration in the perception of speaker identity, which, we hypothesize, influences perceived vowel quality.
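
As an illustration of how a crossover point on an identification function might be located, the following minimal sketch linearly interpolates the F1 value at which the proportion of ``hud'' responses reaches 50%. The abstract does not state which estimation method was used, so the interpolation approach, the function name `crossover_point`, and the F1 steps and response proportions are all hypothetical assumptions, not the study's procedure or data.

```python
def crossover_point(f1_values, prop_hud, criterion=0.5):
    """Estimate the F1 value where the response proportion crosses `criterion`.

    Linear interpolation between the two adjacent continuum steps that
    bracket the criterion (an assumed method, not taken from the study).
    Returns None if the identification function never crosses the criterion.
    """
    pairs = list(zip(f1_values, prop_hud))
    for (x0, y0), (x1, y1) in zip(pairs, pairs[1:]):
        brackets = (y0 - criterion) * (y1 - criterion) <= 0
        if brackets and y0 != y1:
            return x0 + (criterion - y0) * (x1 - x0) / (y1 - y0)
    return None


# Hypothetical identification data for one listener and one stimulus series:
# F1 of each continuum step (Hz) and the proportion of "hud" responses.
f1_steps = [450, 475, 500, 525, 550, 575, 600]
prop_hud = [0.0, 0.05, 0.2, 0.4, 0.8, 0.95, 1.0]

boundary = crossover_point(f1_steps, prop_hud)
print(f"Estimated crossover: {boundary:.2f} Hz")
```

With these made-up values, the boundary falls between the 525 Hz and 550 Hz steps. One such estimate per listener and stimulus series would give the kind of boundary values that an ANOVA could then compare across conditions.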