4aSC31. High-quality speech signal manipulation by Fourier synthesis.

Session: Thursday Morning, December 5


Author: Henning Reetz
Location: Allgemeine Sprachwissenschaften, Univ. of Konstanz, Postfach 5560, 78464 Konstanz, Germany


Many speech research applications require manipulation of certain acoustic parameters (e.g., F0, formants, duration) followed by high-quality resynthesis. Unfortunately, most existing resynthesis methods either yield rather poor quality (e.g., LPC), require careful parameter adjustment (e.g., formant synthesis), or allow manipulations only within a limited range (e.g., PSOLA). Although a signal resynthesized from its Fourier analysis is theoretically identical to the original, manipulating values in the frequency domain leads to phase discontinuities that distort the resynthesis. The human ear is, however, insensitive to phase relations in a signal. High-quality resynthesis is therefore possible by estimating appropriate phase values that prevent phase jumps. The presented method is based on this principle and works as follows. The Fourier transform of the signal is computed, and then either: (1) the harmonics are shifted while keeping the envelope constant (to perform F0 manipulations); (2) the envelope is manipulated while keeping the distances between the harmonics constant (to perform spectral shape manipulations); or (3) frames are repeated (to perform duration manipulations). For the resynthesis, the phase values of the manipulated signal are computed for each Fourier frequency such that no sudden phase changes occur.
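The phase-continuity principle can be illustrated with a minimal Python sketch. It is not the method of the abstract, which computes phases for each Fourier frequency. Instead it assumes a simpler additive (sum-of-harmonics) resynthesis in which each harmonic carries a running phase from one frame to the next, so that changing F0 (case 1) or repeating frames (case 3) causes no phase jump at frame boundaries. The function name resynthesize, its parameters, and the example envelope are hypothetical and chosen only for illustration.

```python
import numpy as np

def resynthesize(f0_frames, envelope, frame_len, sr, n_harm=10):
    """Additive resynthesis with one running phase per harmonic.

    f0_frames : F0 value in Hz for each frame (assumed already manipulated)
    envelope  : callable mapping an array of frequencies (Hz) to amplitudes
    frame_len : samples per frame
    sr        : sampling rate in Hz
    """
    k = np.arange(1, n_harm + 1)        # harmonic numbers 1..n_harm
    n = np.arange(1, frame_len + 1)     # sample offsets within one frame
    phase = np.zeros(n_harm)            # running phase of each harmonic
    out = []
    for f0 in f0_frames:
        freqs = k * f0                  # shifted harmonic frequencies
        # Sample the fixed envelope at the new harmonic positions and
        # silence harmonics at or above the Nyquist frequency.
        amps = np.where(freqs < sr / 2, envelope(freqs), 0.0)
        step = 2 * np.pi * freqs / sr   # phase advance per sample
        # Continue each harmonic from the phase reached in the last frame.
        ph = phase[:, None] + step[:, None] * n[None, :]
        out.append((amps[:, None] * np.sin(ph)).sum(axis=0))
        phase = ph[:, -1] % (2 * np.pi) # carry the phase into the next frame
    return np.concatenate(out)

# Hypothetical smooth spectral envelope, used only for this example.
def example_envelope(freqs):
    return 1.0 / (1.0 + (freqs / 1000.0) ** 2)

# F0 glide from 100 Hz to 150 Hz, with the last frame repeated once
# to lengthen the signal.
f0_track = [100.0, 125.0, 150.0, 150.0]
signal = resynthesize(f0_track, example_envelope, frame_len=80, sr=8000)
```

Because the phase is accumulated rather than reset at each frame, a constant F0 track reproduces an uninterrupted sinusoid per harmonic, and an F0 change alters only the rate at which the phase advances.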

ASA 132nd meeting - Hawaii, December 1996