I like the term "critical flicker fusion rate (CFF)" from the Wiki entry. I too suspect that this varies for auditory stimuli, partly because of streaming, and partly because the same stimulus rate can produce different results.
Consider a 60 Hz pulse wave (one pulse every 16.7 ms) played over a 12-channel sound system, where each loudspeaker receives a pulse every 200 ms, delayed by 16.7 ms relative to the previous loudspeaker. From any one loudspeaker there is a 5 Hz pulse stream; from any two adjacent speakers there is still a 5 Hz pulse stream, but the "pulse" is now a "two-pulse complex".
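As a rough sketch of that arithmetic (Python; the function names are purely illustrative, and it assumes the pulses are dealt out round-robin, one per speaker in turn):

```python
from fractions import Fraction

PULSE_RATE_HZ = 60   # rate of the overall pulse wave
NUM_SPEAKERS = 12    # channels in the sound system (assumed round-robin)

def speaker_onsets_ms(speaker, duration_ms=400):
    """Onset times (ms) of the pulses sent to one speaker (0-based index)."""
    period = Fraction(1000, PULSE_RATE_HZ)  # exact 16.67 ms between pulses
    onsets = []
    k = speaker  # index of the first pulse routed to this speaker
    while k * period < duration_ms:
        onsets.append(k * period)
        k += NUM_SPEAKERS  # every 12th pulse returns to the same speaker
    return onsets

def complex_onsets_ms(speakers, duration_ms=400):
    """Merged, sorted onset times heard from a group of speakers."""
    return sorted(t for s in speakers for t in speaker_onsets_ms(s, duration_ms))

def gaps_ms(onsets):
    """Intervals between successive onsets, rounded to 0.1 ms."""
    return [round(float(b - a), 1) for a, b in zip(onsets, onsets[1:])]

print(gaps_ms(speaker_onsets_ms(0, 600)))   # [200.0, 200.0]
print(gaps_ms(complex_onsets_ms([0, 1])))   # [16.7, 183.3, 16.7]
```

One speaker alone gives evenly spaced pulses 200 ms apart (5 Hz). Two adjacent speakers give a pair of pulses 16.7 ms apart, followed by a 183.3 ms gap before the next pair, so the complex still repeats at 5 Hz.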
If these two speakers are facing my left ear, they will most likely fuse, but if I put my head between the speakers so that each ear (simply) receives the pulse from its own speaker, what will my perception be? And extend this to 3, 4, 5 speakers ... at what point does the "n-pulse complex" cease to be a 5 Hz "tone" and fuse into a 60 Hz tone? There is an area of sound art loosely called 'micro-montage' (or micro-editing) which plays in the area below 60 ms.
On 2008, Dec 3, at 1:39 PM, Ross Deas wrote: