Today, synthetic speech is often produced by concatenating units of natural speech, such as polyphones. Methods are needed for assessing the overall speech quality of this type of synthesis. This investigation presents a method based on shadowing. Subjects listen to spoken utterances produced by the system under evaluation and, at the same time, repeat them as fast as possible. The time difference between the incoming and the repeated speech is measured at the vowel onsets, and the time difference averaged over the utterances is used as a measure of speech quality. The shadowing technique reflects the ability of the subjects to predict, at every point in time, how the incoming utterance will continue. It can therefore be used to judge the accuracy of incoming cues, such as segment duration and the fundamental frequency contour. Results of this method will be presented for different types of speech synthesis, and the sensitivity of the method to prosodic cues will be shown.
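
The latency measure described above can be illustrated with a minimal sketch. It assumes that vowel onsets have already been located in both the stimulus and the subject's repetition and paired one to one, and that the differences are averaged first within each utterance and then across utterances; the abstract does not specify the exact averaging scheme. The function names and the numbers in the usage note are hypothetical.

```python
from statistics import mean


def utterance_latency(stimulus_onsets, shadow_onsets):
    """Mean delay (seconds) between paired vowel onsets in one utterance.

    Assumption: the two lists hold onset times of the same vowels,
    in the same order, for the stimulus and the shadowed repetition.
    """
    if not stimulus_onsets or len(stimulus_onsets) != len(shadow_onsets):
        raise ValueError("onset lists must be non-empty and of equal length")
    return mean(r - s for s, r in zip(stimulus_onsets, shadow_onsets))


def mean_shadowing_latency(utterances):
    """Average the per-utterance latencies over all utterances.

    `utterances` is an iterable of (stimulus_onsets, shadow_onsets) pairs.
    Averaging per utterance first is an assumption of this sketch.
    """
    return mean(utterance_latency(s, r) for s, r in utterances)
```

For example, calling `mean_shadowing_latency` on two utterances with stimulus onsets `[0.10, 0.40, 0.90]` and `[0.20, 0.60]` and shadowed onsets `[0.35, 0.70, 1.15]` and `[0.50, 0.80]` gives a mean latency of roughly 0.26 s. Comparing this value across synthesis systems, or across versions of one system with modified durations or fundamental frequency contours, is the kind of comparison the method is meant to support.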