Can anyone help with this one?


What is probably happening is that in the "transposed" speech the
amplitudes of the components are unchanged, but their frequencies have
all been multiplied by a fixed factor. A four-semitone upward shift
requires a frequency multiplier of 2^(4/12), or about 1.26. This shifts
the natural formant resonances up by the same factor, changing the
quality of the voice considerably, so that an adult male voice can
sound more like an adult female or child voice.
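
To make the arithmetic concrete, here is a small Python sketch of what a
fixed multiplier does to formants. The three formant values are only
illustrative round numbers for an adult male vowel (my assumption, not
measured from any recording), and the function name is made up for the
example.

```python
def semitone_ratio(semitones: float) -> float:
    """Frequency multiplier for a shift of the given number of semitones."""
    return 2.0 ** (semitones / 12.0)

ratio = semitone_ratio(4)

# Illustrative (assumed) formant frequencies in Hz for an adult male vowel.
formants_hz = [700.0, 1200.0, 2600.0]

# A plain transposition scales every component, formants included.
shifted_hz = [f * ratio for f in formants_hz]

print(f"multiplier: {ratio:.3f}")
for before, after in zip(formants_hz, shifted_hz):
    print(f"{before:7.0f} Hz -> {after:7.0f} Hz")
```

Every formant moves up by about 26 percent, which is roughly what a
shorter vocal tract would produce, hence the change in perceived speaker.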

This problem could be fixed using software that keeps the spectral
envelope of the sound close to that of the original. If the program
models speech in terms of a glottal excitation, whose frequency is
changed, and a vocal tract filter, which is kept the same as in the
original, that may be what you want. The method of Homomorphic Speech
Modelling does something like this. A talk on it is given at:


Pretty mathematical though.
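
For a rough feel of the idea without the maths, the sketch below
estimates a spectral envelope by cepstral smoothing, which is one common
form of homomorphic analysis: take the log magnitude spectrum, transform
it again, and keep only the slowly varying (low-quefrency) part. This is
my own minimal illustration, not the method from the talk. The function
names, the n_keep cutoff and the synthetic frame are all assumptions, and
the slow hand-written DFT is used only so that the example needs nothing
beyond the standard library.

```python
import cmath
import math

def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform; fine for short frames."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = sum(v * cmath.exp(sign * 2j * math.pi * k * t / n)
                  for t, v in enumerate(values))
        out.append(acc / n if inverse else acc)
    return out

def cepstral_envelope(frame, n_keep=8):
    """Log-magnitude envelope at each DFT bin, from the low-quefrency cepstrum."""
    n = len(frame)
    # Hann window to reduce edge effects.
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    # Log magnitude spectrum; the small constant avoids log(0).
    log_mag = [math.log(abs(c) + 1e-12) for c in dft(windowed)]
    # Real cepstrum: inverse transform of the log magnitude spectrum.
    cepstrum = [c.real for c in dft(log_mag, inverse=True)]
    # Keep the low quefrencies and their mirror image, zero the rest.
    liftered = [c if (i < n_keep or i > n - n_keep) else 0.0
                for i, c in enumerate(cepstrum)]
    # Back to the frequency domain: a smoothed log spectrum.
    return [c.real for c in dft(liftered)]

# Synthetic frame (assumed): a decaying pulse repeating every 8 samples.
frame = [0.6 ** (i % 8) for i in range(64)]
envelope = cepstral_envelope(frame)
```

A pitch shifter built on this kind of model would change the excitation
and then re-impose an envelope like the one estimated here, so that the
formants stay where they were.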

The question is: is there software out there that will do this?
I actually think that the free program Praat might do it, because
it inherently models signals in terms of excitation and filter.
Maybe someone on this list knows of alternative software.


