Acoustical similarity

Dear list,

I am looking for "general" metrics of the acoustical (not perceived)
similarity between mono signals, independent of any feature-extraction
stage (e.g., peak level, harmonicity, etc.).

Ideally, such a metric would operate on a low-level representation of
the signal (preferably the waveform itself).

Any ideas/comments?

