## 4pSC4. Vowel identification from harmonic contours.

### Session: Thursday Afternoon, December 5
**Author: John S. Antrobus**

**Location: Dept. of Psych., City College of New York, New York, NY 10031**

**Author: Octavio Betancourt**

**Location: City College of New York, New York, NY 10031**

**Abstract:**

A set of automatic algorithms expresses the acoustic vowel signal as the ratio series log(f_j/F_0^(2/3)), where the f_j are the first 32 integer multiples of F_0, plus an additional 32 log(f_j/F_0^(2/3)) delta terms that represent vowel trajectories. F_0 is measured on a window-by-window basis by an algorithm that eliminates all smearing due to conventional windowing algorithms. To reduce the dimensionality of this expression, the 64 terms are summarized by 11 + 11 terms from the cosine series. Using ten monosyllabic words spoken by 137 men, women, and children, vowel classification is within one percent of human accuracy. Because the model uses none of the circular definitions of formant measurement and makes only one assumption unique to speech, namely, the -log F_0^(1/3) offset, the log(f_j/F_0^(2/3)) contour is superior to a formant representation of voiced speech. Because a Euclidean classifier is as accurate as a quadratic discriminant function that uses more than ten times as many degrees of freedom, it is argued that the log(f_j/F_0^(2/3)) transform may be accomplished by a genetically acquired neural mapping of the acoustic signal that facilitates infants' learning of vowel categories.
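The pipeline described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the F_0 estimation and delta terms are omitted, and the cosine summary is written here as a DCT-II-style projection, which the abstract does not specify in detail.

```python
import numpy as np

def harmonic_ratio_features(f0, n_harmonics=32):
    """Ratio series log(f_j / F_0^(2/3)) for the first 32 harmonics f_j = j * F_0.

    f0 is assumed to come from a separate window-by-window pitch estimator.
    """
    j = np.arange(1, n_harmonics + 1)
    fj = j * f0                             # integer multiples of F_0
    return np.log(fj / f0 ** (2.0 / 3.0))   # log harmonic contour

def cosine_summary(terms, n_keep=11):
    """Summarize the 32-term contour by its first 11 cosine-series coefficients.

    A DCT-II-like basis is assumed here; the abstract only says 'cosine series'.
    """
    n = len(terms)
    k = np.arange(n_keep)[:, None]          # coefficient indices 0..10
    m = np.arange(n)[None, :]               # sample indices 0..31
    basis = np.cos(np.pi * k * (m + 0.5) / n)
    return basis @ terms

def euclidean_classify(x, centroids):
    """Nearest-centroid (Euclidean) classification over vowel categories."""
    dists = {vowel: np.linalg.norm(x - c) for vowel, c in centroids.items()}
    return min(dists, key=dists.get)
```

In use, each voiced window would yield an 11-term summary (plus 11 delta-contour terms in the full model), and a vowel label would be assigned by distance to per-vowel mean vectors, consistent with the abstract's point that a simple Euclidean rule suffices.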

ASA 132nd meeting - Hawaii, December 1996