To investigate the functional differences between vowel (V) onsets and V offsets in the perception of temporal structure in speech, listeners were asked to estimate the perceived difference in speaking rate for various four-mora Japanese words. In the V-onset condition, the inter-onset intervals of vowels were uniformly lengthened or shortened while their inter-offset intervals were preserved; in the V-offset condition, the inter-offset intervals were changed while the inter-onset intervals were preserved. Neither manipulation changed the duration of the entire word. Each modified word was paired with its unmodified counterpart and presented to the listeners. The results showed that changing the V-onset intervals correlated with a change in the perceived speaking rate, even though the duration of the entire word was unchanged, whereas modifying the V-offset intervals had no significant effect on the perceived speaking rate. Although a previous study found no significant functional difference between V onsets and V offsets in the detection of local changes in temporal structure [Kato et al., Proc. ICSLP-94, 1979--1982 (1994)], the current results show that the two do differ and suggest that V onsets are the dominant cue in determining the speaking rate, i.e., the tempo of events. The results will be discussed in relation to recent models predicting perceived beat locations.