New methods are proposed for estimating the sentence structure of utterances from prosody in real time. The conventional Hough transform algorithm is modified so that it can extract straight-line approximations of pitch patterns in real time. The sensitivity of the approximation can be set freely, so parallel transform processes with different sensitivities yield multilevel components of the utterance structure and build a tree in real time. The tree is not strictly a grammatical one; rather, it reflects a psychological structure or the speaker's intentions. Using this structure, a sentence tree matching algorithm is developed to recognize sentences. The trees are converted to regular trees by adding dummy nodes so that they can be expressed as lists, and the edit distance between trees is computed recursively by a list-matching technique. Each line extracted at fine sensitivity is classified into a code by vector quantization. The speaker's intention states, or dialogue control information in conversational speech, can be related to certain sequences of these codes. One hundred and twenty-eight spontaneous conversations (about 21 h in total) were collected from 64 subjects. The performance of the proposed methods was evaluated on this corpus.
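
To make the line-extraction idea concrete, the following is a minimal sketch of a plain Hough accumulator that is updated one pitch frame at a time. It is not the modified algorithm proposed here, whose details the abstract does not give. The class name `StreamingHough`, its methods, and the use of the accumulator cell width `rho_step` as a stand-in for "sensitivity" are all assumptions made for illustration, and the pitch contour is synthetic.

```python
import math
from collections import Counter


class StreamingHough:
    """Plain Hough accumulator updated one pitch frame at a time.

    Illustrative only: the class name and parameters are hypothetical,
    and this is not the modified algorithm described in the abstract.
    """

    def __init__(self, n_theta=180, rho_step=1.0):
        self.n_theta = n_theta    # number of quantized line angles
        self.rho_step = rho_step  # cell width; assumed stand-in for "sensitivity"
        self.acc = Counter()      # (theta index, rho index) -> vote count

    def add_frame(self, t, f0):
        """Cast one vote per quantized angle for lines through (t, f0)."""
        for k in range(self.n_theta):
            theta = math.pi * k / self.n_theta
            rho = t * math.cos(theta) + f0 * math.sin(theta)
            self.acc[(k, round(rho / self.rho_step))] += 1

    def best_line(self):
        """Return (theta, rho, votes) of the strongest cell so far, or None."""
        if not self.acc:
            return None
        (k, r), votes = self.acc.most_common(1)[0]
        return math.pi * k / self.n_theta, r * self.rho_step, votes


# Two accumulators with different cell widths run side by side on the
# same frames, loosely mirroring parallel processes of different sensitivity.
fine = StreamingHough(rho_step=0.5)
coarse = StreamingHough(rho_step=4.0)
contour = [(t, 2.0) for t in range(10)]  # synthetic flat pitch contour
for t, f0 in contour:
    fine.add_frame(t, f0)
    coarse.add_frame(t, f0)
print(fine.best_line(), coarse.best_line())
```

Because each frame only adds votes, the strongest line can be queried after any frame, which is what makes this kind of accumulator usable incrementally. A wider cell merges nearby lines into one coarse segment, while a narrower cell keeps them apart.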