I'm not an expert in either audiology or music perception, so my apologies if what I say is too obvious or insubstantial.
Eliot Handelman wrote:
I think part of the solution is to recognize that Diana actually IS singing, which could explain why the effect is robust.
I think Diana is singing, too. The "sometimes behave so strangely" bit sounded like music to me from the very first time, i.e. before I got to hear the repetitions. (I had no idea what the whole thing was about, so it can't be that I heard it that way because I was expecting to.) Then I read the list postings.
To my mind, this is more music than speech. To begin with, the intonation contours in each word are rather flat. The F0 contour of "sometimes behave so strangely" looks pretty much like the so-called stylised "call contour" (Ladd, 1996). From a perceptual point of view, the pitch variation within each accented syllable is almost negligible: "times", "have", "strange" and "ly" sound nearly monotone. There isn't much variation across successive syllables, either (e.g. "sometimes" seems to be said practically on a level tone). Most speech (the kind that doesn't sound so "musical") tends to show appreciable intra- and intersyllabic, and even intrasegmental, pitch variation.
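For anyone who wants to put a number on "nearly monotone", here is a minimal sketch of one way to do it: take the F0 samples that fall inside a syllable and express their spread in semitones. The function name and the F0 values are my own assumptions, invented for illustration; they are not measurements from Diana's recording, and in practice the samples would come from a pitch listing exported from a tool such as Praat.

```python
import math


def semitone_range(f0_hz):
    """Span between the lowest and highest voiced F0 sample, in semitones.

    Samples of 0 (or below) are treated as unvoiced frames and ignored;
    that convention is an assumption of this sketch.
    """
    voiced = [f for f in f0_hz if f > 0]
    if len(voiced) < 2:
        return 0.0
    # 12 semitones per octave, one octave per doubling of frequency.
    return 12.0 * math.log2(max(voiced) / min(voiced))


# Hypothetical F0 tracks (Hz), made up purely to illustrate the contrast.
level_syllable = [220.0, 221.0, 219.5, 220.5]    # sung-like, nearly level
gliding_syllable = [180.0, 200.0, 230.0, 260.0]  # speech-like glide

print(round(semitone_range(level_syllable), 2))
print(round(semitone_range(gliding_syllable), 2))
```

A syllable whose range stays well under a semitone would count as level in this crude sense, while a glide of several semitones would not. Where exactly to draw the line is a perceptual question this sketch doesn't answer.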
Besides, there is the duration factor. I think that the relative duration of the syllables also contributes to its sounding like music: the TÁta-taTÁ-ta-TÁta pattern (if you know what I mean...).
In my humble opinion, the repetition makes the listener more aware of the music-like features. But the repetition per se may not be responsible for the sing-song effect. I played around with the sound file in Praat, varying pitch within and across syllables and altering the relative durations until it stopped sounding like music. No matter how many times I replay this new version, it definitely doesn't sound like singing.