Zafiris G. Politis
Dept. of Elec. and Comput. Eng., Aristotle Univ. of Thessaloniki, Univ. Campus, 54006, Thessaloniki, Greece
Aspects of esophageal speech are investigated in this paper. Esophageal speech is produced by laryngectomized people, who speak by expelling air that has been constricted below the entrance of the esophagus, forcing the cricopharyngeal muscle to oscillate in a manner equivalent to the vocal cords of normal speakers. Nine male esophageal speakers were analyzed. The spoken material consisted of Greek vowels and syllables (CV, CCV, VC), each repeated three times in succession by each speaker. F0 values and plots were obtained for all speakers using center-clipping autocorrelation, cepstrum analysis, and a modified Hilbert transform envelope method, which seemed to give more consistent results than the others. The most frequently observed F0 values varied from speaker to speaker, with an average of 73 Hz. F1 vs F2 plots for the Greek phonemes α, ε, ι, ο, and ου were obtained by LPC. Deviations from normal speakers were very small. Significant similarity to the equivalent English phonemes was also observed. The slope of speech power versus time for vowel-type utterances was investigated as a measure of the rate of power reduction, showing an average of -86 dB/s. Finally, inferences about the source volume velocity are drawn using LPC inverse filtering. Cepstrum analysis revealed a voice source spectral tilt of -6 dB/oct, instead of the -12 dB/oct of normal speakers.
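As an illustration of the first F0 method named above, the following is a minimal sketch of center-clipping autocorrelation pitch estimation on a single frame. It is not the author's implementation: the clipping ratio of 0.3, the 40 to 200 Hz search range, the function names, and the synthetic test signal are all assumptions chosen for the example, since the abstract gives no parameter values.

```python
import numpy as np


def center_clip(frame, ratio=0.3):
    """Zero samples within +/- level and shift the remainder toward zero.

    `ratio` (fraction of the frame's peak amplitude) is an assumed value.
    """
    level = ratio * np.max(np.abs(frame))
    clipped = np.zeros_like(frame)
    above = frame > level
    below = frame < -level
    clipped[above] = frame[above] - level
    clipped[below] = frame[below] + level
    return clipped


def estimate_f0(frame, fs, f_min=40.0, f_max=200.0, ratio=0.3):
    """Estimate F0 in Hz from the autocorrelation peak of a clipped frame.

    The search range f_min..f_max is an assumption, set low enough to
    cover the roughly 73 Hz average reported in the abstract.
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - np.mean(frame)
    clipped = center_clip(frame, ratio)
    # Full autocorrelation; keep only the non-negative lags.
    acf = np.correlate(clipped, clipped, mode="full")[len(clipped) - 1:]
    lag_min = int(fs / f_max)
    lag_max = min(int(fs / f_min), len(acf) - 1)
    # The lag of the largest peak in the search range is the period estimate.
    best_lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return fs / best_lag


if __name__ == "__main__":
    # Hypothetical test signal: three harmonics of a 73 Hz fundamental.
    fs = 8000
    t = np.arange(int(0.1 * fs)) / fs
    x = (np.sin(2 * np.pi * 73 * t)
         + 0.5 * np.sin(2 * np.pi * 146 * t)
         + 0.25 * np.sin(2 * np.pi * 219 * t))
    print(estimate_f0(x, fs))
```

Because the period is resolved only to a whole number of samples, the estimate is quantized; at low F0 and an 8 kHz sampling rate the step is well under 1 Hz. Real esophageal speech is far less periodic than this synthetic signal, so a practical analysis would need frame-by-frame tracking and a voicing decision, which this sketch omits.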