histograms of F0 in speech contours (Christian Kaernbach)

Subject: histograms of F0 in speech contours
From:    Christian Kaernbach  <chris(at)PSYCHOLOGIE.UNI-LEIPZIG.DE>
Date:    Sun, 13 May 2001 09:53:28 +0200

Alain asked for more background scepticism. Alain, I have not read the paper, so it would not be fair of me to offer any. But maybe I can put some questions to Martin that would help to clarify bias issues...

Martin, when hand-marking targets on visually presented speech contours, did you (or your collaborators) specify a point on that contour, or a range in which some automatic algorithm then determined the minimum or maximum?

How was the visual display presented? Was it a) always the same frequency range for F0, or was the range b) specifically adjusted for each speech segment in question, according to what the pitch contour extraction algorithm thought appropriate for presentation?

If a): What was this range? Were its boundaries at full semitones, at quarter semitones, or at some non-semitone value?

If b): What range boundaries could the algorithm choose? Were these defined in full semitones, in quarter semitones, or at an even finer resolution?

For both a) and b): Were there any horizontal grid lines across the display, and were those on semitones, or what was their position? Was the pitch contour presented as a continuous line, or was it quantized in quarter semitones?

Answers to these questions would help me to know whether I should be sceptical or not.

And then there was one important point in a message by Martin: their data were from sentence material that was specifically chosen so as to show clear targets. That might make all the difference, i.e. it could well be that AP histograms are real for Martin's data and non-existent for Alain's data. If we wanted to settle the subject completely, Martin could analyze Alain's data in his way, and Alain could analyze Martin's data. The latter seems by far the simpler, as it involves no hand marking. Alain, would you be ready to let Martin's data pass through your algorithm? This would add evidence as to whether this is a methodological problem and/or a question of material...

Regards,
Christian
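
The questions about display range, grid lines, and quantization matter because a display that snaps the contour to quarter semitones could, by itself, put peaks into a histogram of marked targets. The following is only a minimal sketch of that possibility, not anyone's actual analysis. Everything in it is an assumption made for illustration: the function names, the 440 Hz reference, the 100 to 300 Hz range, and the log-uniform distribution of simulated targets (chosen so that, by construction, no pitch is preferred).

```python
import math
import random
from collections import Counter

REF_HZ = 440.0  # assumed reference frequency; any fixed reference would do


def hz_to_semitones(f0_hz, ref_hz=REF_HZ):
    """Convert a frequency in Hz to semitones relative to ref_hz."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def snap(value_st, step_st):
    """Round a semitone value to the nearest multiple of step_st,
    mimicking a display or marking tool with that resolution."""
    return round(value_st / step_st) * step_st


def fractional_histogram(values_st, n_bins=20):
    """Count how often values fall at each position within the semitone
    (value modulo 1 semitone), using n_bins equal-width bins."""
    counts = Counter()
    for v in values_st:
        frac = v % 1.0  # position within the semitone, in [0, 1]
        counts[min(int(frac * n_bins), n_bins - 1)] += 1
    return [counts[i] for i in range(n_bins)]


def simulate(n=10000, step_st=None, seed=0):
    """Simulate n hypothetical F0 targets with no preferred pitch and
    return the within-semitone histogram, optionally after snapping
    the targets to a grid of step_st semitones."""
    rng = random.Random(seed)
    lo, hi = math.log(100.0), math.log(300.0)  # assumed F0 range in Hz
    targets = [hz_to_semitones(math.exp(rng.uniform(lo, hi))) for _ in range(n)]
    if step_st is not None:
        targets = [snap(t, step_st) for t in targets]
    return fractional_histogram(targets)


if __name__ == "__main__":
    print("unquantized:      ", simulate())
    print("quarter semitones:", simulate(step_st=0.25))
```

With these assumptions, the unquantized histogram comes out roughly flat, while the quantized one has counts only in the bins at 0, 1/4, 1/2 and 3/4 of a semitone. Peaks of that kind would say something about the display, not about the speakers, which is why the resolution of the display and of any marking grid is worth knowing.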

This message came from the mail archive
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University