
Ph.D. Dissertation Announcement: Pitch Tracking and Speech Enhancement in Noisy and Reverberant Environments

Dear colleagues,

Please allow me to announce my Ph.D. dissertation
titled "Pitch tracking and speech enhancement in noisy
and reverberant environments", which was completed
about a year ago. The abstract follows:

Two causes of speech degradation exist in practically
all listening situations: noise interference and room
reverberation. This dissertation investigates three
particular aspects of speech processing in noisy and
reverberant environments: multipitch tracking for
noisy speech, measurement of reverberation time based
on pitch strength, and reverberant speech enhancement
using one microphone (or monaurally).

An effective multipitch tracking algorithm for noisy
speech is critical for speech analysis and processing.
However, the performance of existing algorithms is not
satisfactory. We present a robust algorithm for
multipitch tracking of noisy speech. Our approach
integrates an improved channel and peak selection
method, a new method for extracting periodicity
information across different channels, and a hidden
Markov model (HMM) for forming continuous pitch
tracks. The resulting algorithm can reliably track
single and double pitch tracks in a noisy environment.
We suggest a pitch error measure for the multipitch
situation. The proposed algorithm is evaluated on a
database of speech utterances mixed with various types
of interference. Quantitative comparisons show that
our algorithm significantly outperforms existing ones.
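
As a rough illustration of how an HMM can turn frame-by-frame
pitch candidates into a continuous track, here is a minimal
Viterbi sketch. It is not the dissertation's model: it handles a
single pitch only, and the function name, the score matrix
layout, and the jump_penalty parameter are assumptions made for
this example.

```python
import numpy as np

def viterbi_pitch_track(obs_logscore, jump_penalty=0.5):
    """Pick one candidate lag per frame so the track is smooth.

    obs_logscore: (T, K) array, log-score of K candidate lags in T frames
    (hypothetical input; a real system would derive it from correlograms).
    """
    T, K = obs_logscore.shape
    lags = np.arange(K)
    # Assumed transition model: log-score falls linearly with lag change.
    trans = -jump_penalty * np.abs(lags[:, None] - lags[None, :])
    score = obs_logscore[0].copy()
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + trans          # cand[i, j]: come from i, go to j
        back[t] = np.argmax(cand, axis=0)      # best predecessor for each j
        score = cand[back[t], lags] + obs_logscore[t]
    # Trace the best path backwards from the best final state.
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(score))
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path
```

The transition penalty is what suppresses isolated outliers: a
single frame whose strongest candidate is far from its neighbors
is overruled when the cost of jumping there and back exceeds the
local gain.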

Reverberation corrupts harmonic structure in voiced
speech. We observe that the pitch strength of voiced
speech segments is indicative of the degree of
reverberation. Consequently, we present a pitch-based
measure for reverberation time (T60) utilizing our new
pitch determination algorithm. The pitch strength is
measured by deriving the statistics of relative time
lags, defined as the distances from the detected pitch
periods to the closest peaks in correlograms. The
monotonic relationship between the measured pitch
strength and reverberation time is learned from a
corpus of reverberant speech with known reverberation times.
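
The relative time lag described above can be sketched for a
single autocorrelation function as follows. This is an
illustration under stated assumptions, not the dissertation's
code: the function name and the simple local-maximum peak picker
are hypothetical, and a full system would collect such lags over
many channels and frames before deriving statistics.

```python
import numpy as np

def relative_time_lag(acf, pitch_lag):
    """Signed distance in samples from pitch_lag to the closest peak of acf.

    acf: 1-D autocorrelation values indexed by lag (assumed input).
    Returns None when acf has no interior local maximum.
    """
    acf = np.asarray(acf, dtype=float)
    interior = np.arange(1, len(acf) - 1)
    # Local maximum: above the left neighbor and not below the right one.
    is_peak = (acf[1:-1] > acf[:-2]) & (acf[1:-1] >= acf[2:])
    peaks = interior[is_peak]
    if peaks.size == 0:
        return None
    closest = peaks[np.argmin(np.abs(peaks - pitch_lag))]
    return int(closest - pitch_lag)
```

For a clean periodic signal the peaks sit at the detected pitch
period and the lags cluster near zero; a wider spread of lags
would correspond to weaker pitch strength.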

Under noise-free conditions, the quality of
reverberant speech is dependent on two distinct
perceptual components: coloration and long-term
reverberation. They correspond to two physical
variables: signal-to-reverberant energy ratio (SRR)
and reverberation time, respectively. We propose a
two-stage reverberant speech enhancement algorithm
using one microphone. In the first stage, an inverse
filter is estimated to reduce coloration effects so
that SRR is increased. The second stage utilizes
spectral subtraction to minimize the influence of
long-term reverberation. The proposed algorithm
significantly improves the quality of reverberant
speech. Our algorithm is quantitatively compared with
a recent one-microphone reverberant speech enhancement
algorithm on a corpus of speech utterances in a number
of reverberant conditions. The results show that our
algorithm performs substantially better.
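
For readers unfamiliar with spectral subtraction, a generic
power-domain version looks like the sketch below. It assumes an
estimate of the interfering power (here, long-term
reverberation) is already available; how that estimate is
obtained is not shown, and the function name and the floor
parameter are assumptions made for this example.

```python
import numpy as np

def spectral_subtract(power_spec, interference_power, floor=0.01):
    """Subtract an interference power estimate from a power spectrogram.

    power_spec, interference_power: arrays of the same shape
    (frames x frequency bins). The result is floored at a small
    fraction of the input so that no bin becomes negative.
    """
    power_spec = np.asarray(power_spec, dtype=float)
    interference_power = np.asarray(interference_power, dtype=float)
    cleaned = power_spec - interference_power
    return np.maximum(cleaned, floor * power_spec)
```

The floor is a common safeguard: without it, overestimated
interference yields negative power values, which have no
physical meaning and cannot be converted back to a magnitude.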

The full dissertation can also be downloaded at:



Mingyang Wu
