Stop consonant identification based on initial spectra?

I'm sorry to interrupt the current frenzy of pet anecdotes (in which no one
has yet mentioned fish)...

I'm looking for a reference that reports whether or not humans can identify
stop consonants based on their initial spectra--before the formant
transitions to the following vowel. Secondarily (though I suppose more
fundamentally), are the initial spectra (first 10 msec or however long
*before* formant transitions) invariant with respect to following vowels?
And do they differ between voiced and unvoiced stops?

Background: I had been well indoctrinated in the motor theory of speech
perception, teaching my students the wonders of categorical perception of
stop consonants despite widely varying formant transition profiles across
different vowels (e.g., /di/ looks rather different from /du/, but we
identify /d/ in both). A recent conference poster looking at
neurophysiological spectral representation in non-human primates suggested
that the response to the spectra of stop consonants (without the following formant
transitions) was sufficient to distinguish and identify them. Alas, I did
not get the relevant human reference and have been unable to find one in an
informal search of my reference books and MEDLINE.

Thanks in advance,

