
audio-visual correlation

List members:

I have been reading with interest the series of recent postings in
response to Al Bregman's posting of a student query regarding the
interaction of audio and visual components in film and animation.  I
posted a brief message myself - directly to Dr. Bregman - referencing my
previous research.  Several individuals have requested further
information about the research projects to which I referred in that
message, so in response I will post the abstracts & publication info for
those two investigations.  If anyone would like further information,
feel free to contact me at the email address indicated below ... or we
can continue this interesting forum as a group.


Lipscomb, S.D. & Kendall, R.A. (1994).  Perceptual judgment of the
relationship between musical and visual components in film.
Psychomusicology, 13, 60-98.

        In this study, the authors investigate the relationship between
the musical soundtrack and visual images in the motion picture
experience.  Five scenes were selected from a commercial motion picture
along with their composer-intended musical scores.  Each soundtrack was
paired with every visual excerpt, resulting in a total of twenty-five
audio/visual composites.  In Experiment I, subjects selected the
composite in which the pairing was considered the "best fit".  Results
indicated that the composer-intended musical score was identified as the
best fit by the majority of subjects for all conditions.  In Experiment
II, subjects rated all twenty-five composites on semantic differential
scales.  A highly significant interaction between audio/visual
combination and the various semantic differential scales was found.
Analysis of this interaction revealed that the composer-intended
combination yielded higher mean scores in response to the four adjective
pairs of the Evaluative dimension.  Clustering the subject responses
into two factor scores (Evaluative vs. a hybrid of Activity and
Potency) confirmed these high Evaluative mean scores.  In addition, the
response contours of the Activity/Potency dimension remained relatively
consistent, suggesting that music exercises a strong and consistent
influence over the subject responses to an audio/visual composite,
regardless of visual stimulus.  The results corroborate previous
research, indicating that a musical soundtrack can change the "meaning"
of a film presentation.  Comparison of the various soundtracks in music
theoretical terms assisted in identifying musical elements which
appeared to be relevant to specific subject ratings.  These comparisons
were utilized in the formulation of a model for music communication in
the context of the motion picture experience.


Lipscomb, S.D. (1995).  Cognition of musical and visual accent structure
alignment in film and animation.  Unpublished (yet) dissertation,
University of California, Los Angeles.

        This investigation examined the relationship between musical
sound and visual images in the motion picture experience.  Most research
in this area has dealt with associational aspects of the music and its
effect on perception of still pictures or "characters" within film
sequences.  In contrast, the present study focused specifically on the
relationship of points perceived as accented musically and visually.
The following research questions were answered:  1) What are the
determinants of "accent" (i.e. salient moments) in the visual and
auditory fields?; and 2) Is the precise alignment of auditory and visual
strata necessary to ensure that an observer finds the combination
        Three experiments were conducted using two convergent methods:
a verbal attribute magnitude estimation (VAME) task and a similarity
judgment task.  Audio-visual (AV) stimuli increased in complexity with
each experiment.  Three alignment conditions were possible between the
musical sound and visual images:  consonant (accents in the music occur
at the same temporal rate and are perfectly aligned with accents in the
visual image), out-of-phase (accents occur at the same rate, but are
perceptibly misaligned), or dissonant (accents occur at different
rates).
        Results confirmed that VAME ratings differed significantly
across the three alignment conditions.  Consonant combinations were rated
highest, followed by out-of-phase combinations, and dissonant
combinations received the lowest ratings.  However, as AV stimuli became
more complex (Experiment Three), consonant composites were rated less
synchronized and dissonant combinations were rated more synchronized
than the simple AV composites in Experiment One.  Effectiveness ratings
failed to distinguish between consonant and out-of-phase conditions when
considering actual movie excerpts.  An analysis of variance over the
VAME data from all three experiments revealed that this difference
between subject responses to simple animations and responses to complex
film excerpts was statistically significant.  A similar result was
observed in the similarity scaling task.  Responses to simple stimuli
divided clearly on three dimensions:  visual component, audio component,
and alignment condition.  However, as the AV composite became more
complex, the third dimension appeared to represent AV congruence (i.e.,
appropriateness).  Modifications to the proposed model of film music
were suggested.

Dr. Scott D. Lipscomb
Assistant Professor, Assistant Director, Undergraduate Advisor of Record
UTSA Division of Music, AR 3.01.58A
6900 N. Loop 1604 West
San Antonio, TX 78249
(210) 458-4354 phone
(210) 458-4381 FAX