Could you provide some insight on general audio similarity measures?

Dear list,

Apart from speech and music, there are many other sounds we hear every day, and I would like to try some recognition methods on them. A critical component of this is the similarity measure. For speech, speaker identification techniques provide a similarity measure; for music, instrument recognition and existing melody/tempo measurements give some idea of one. But I cannot figure out the basic properties we use to identify general sounds. Can you give some hints from psychological, acoustic, or higher-level computational models?

Thanks in advance.