
Defense: A hybrid model for timbre perception

Dear friends, colleagues, and mentors,

As a part of my University Oral Exam (i.e. PhD dissertation defense),
I will be presenting my work on Friday, 30 May 2008, at noon, at CCRMA
stage (Stanford University). The talk should finish by 1:00 pm. You
are warmly invited to attend. The abstract is attached below.

This is also a part of a day-long CCRMA open house (a.k.a. the annual
big party). Please join us for a showcase of the work being done by CCRMA

Best regards,
 Hiroko Terasawa

"A Hybrid Model for Timbre Perception"
Ph.D. Candidate: Hiroko Terasawa
Advisor: Prof. Jonathan Berger
Date: May 30 (Fri), 2008
Time: 12:00 pm
Location: The Knoll (CCRMA), Stage. (http://tinyurl.com/5o8mbz)

Timbre, or the perceived quality of sound, is a fundamental attribute
of sound. It is important for differentiating musical sounds and
speech utterances, and for characterizing everyday sounds in our
environment as well as novel synthetic sounds.

This dissertation presents a perceptually based hybrid model of timbre
perception which integrates the concepts of color and texture. The
color of sound is described in terms of an instantaneous (or ideally
timeless) spectral envelope while the texture of a sound describes the
temporal structure of the sound.

The dissertation presents the framework for the model, a discussion of
prior research, a computational implementation of the model, and a
series of experiments that provide perceptual validation. The
computational model represents a sound's color as the spectral
envelope of a specific window (although the ideal concept of color is
one in which time is non-existent). Texture is represented as the
sequential changes of color over an arbitrary range of time scales.

In support of the proposed theory, a series of psychoacoustic
experiments was performed. The quantitative relationship between the
spectral envelope and the subjective perception of complex tones was
examined using Mel-frequency cepstral coefficients (MFCC) as a
representation. A
perceptually tested quantitative representation of texture was
established using normalized echo density (NED).
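
For readers unfamiliar with NED, here is a minimal sketch of how such a
measure is commonly computed: within a sliding window, count the
fraction of samples whose magnitude exceeds the window's standard
deviation, then divide by the fraction expected for Gaussian noise,
erfc(1/sqrt(2)), about 0.317. The function name, the rectangular
(unweighted) window, the zero-mean assumption, and the hop parameter are
illustrative choices made here, not details taken from the
dissertation.

```python
import math

# Fraction of samples lying beyond one standard deviation for Gaussian noise.
GAUSSIAN_FRACTION = math.erfc(1.0 / math.sqrt(2.0))


def normalized_echo_density(signal, window_size, hop=1):
    """Return one NED value per window position (hypothetical helper).

    Assumes a zero-mean signal and a rectangular window, so the window's
    standard deviation is taken to be its RMS value.
    """
    if window_size < 1 or window_size > len(signal):
        raise ValueError("window_size must be between 1 and len(signal)")
    if hop < 1:
        raise ValueError("hop must be at least 1")

    profile = []
    for start in range(0, len(signal) - window_size + 1, hop):
        frame = signal[start:start + window_size]
        sigma = math.sqrt(sum(x * x for x in frame) / window_size)
        if sigma == 0.0:
            # A silent window has no echoes to count.
            profile.append(0.0)
            continue
        # Fraction of samples in this window that exceed one sigma.
        outliers = sum(1 for x in frame if abs(x) > sigma)
        profile.append(outliers / window_size / GAUSSIAN_FRACTION)
    return profile
```

Under these assumptions, a window of dense Gaussian-like noise yields
values near 1, while a window holding only a few isolated impulses
yields values much closer to 0.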

The elusive nature of describing timbre has been a barrier to music
analysis, speech research and psychoacoustics. It is hoped that the
framework presented in this dissertation will form the basis of a
consistent metric for describing timbre.


Hiroko Terasawa, Ph.D. Candidate
CCRMA, Department of Music
Stanford University