Re: speech/music characteristics

  > I'm working in speech recognition, and am trying to be able to distinguish
  > between speech and non-speech (especially music) sounds in an audio track.

For the music vs. "other" distinction, Mike Hawley's 1993 MIT PhD
thesis, "Structure out of Sound", used measurements of the length of
constant-frequency peaks in the spectrum for discrimination.
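
To make the idea concrete, here is a rough sketch of one way to measure
how long a spectral peak stays at a constant frequency. It is not
Hawley's actual procedure: tracking only the single strongest bin per
frame, and the names peak_run_lengths, mean_peak_duration, frame_len
and hop, are assumptions made for illustration.

```python
import numpy as np

def peak_run_lengths(signal, frame_len=1024, hop=512):
    """Run lengths (in frames) over which the strongest spectral bin stays fixed.

    Assumption: only the single largest-magnitude bin per frame is tracked,
    a simplification chosen for this sketch. Requires len(signal) >= frame_len.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    peak_bins = np.empty(n_frames, dtype=int)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        # Index of the largest-magnitude bin in this frame's spectrum.
        peak_bins[i] = np.argmax(np.abs(np.fft.rfft(frame)))
    # Split the sequence of peak bins wherever the bin index changes.
    change_points = np.flatnonzero(np.diff(peak_bins) != 0) + 1
    boundaries = np.concatenate(([0], change_points, [n_frames]))
    return np.diff(boundaries)

def mean_peak_duration(signal, frame_len=1024, hop=512):
    """Average number of consecutive frames the dominant peak holds its bin."""
    return float(np.mean(peak_run_lengths(signal, frame_len, hop)))
```

A steady tone should give a long mean duration and broadband noise a
short one. Real speech and music fall somewhere in between, so any
decision threshold would have to be tuned on labelled data.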

Steve Smoliar and I extended this work in

Lonce Wyse and Steven W. Smoliar, ``Toward Content-Based Audio
Indexing and Retrieval and a New Speaker Discrimination Technique''.
(to appear in) D.F. Rosenthal & H.G. Okuno (eds.)  Readings In
Computational Auditory Scene Analysis. (Lawrence Erlbaum, Mahwah NJ)
1998. (linked from my homepage)

Of course, using only information in the signal, there can be no
perfect discrimination between music and other sounds, since the domain
of music in general (e.g. electro-acoustic music) is all sound.

                                                - lonce