How to compare classification results

Has anyone already dealt with comparing classification results from experiments that use different numbers of classes? In other words: how can a recognition rate X from an experiment with N classes be compared to a recognition rate Y from an experiment with M classes?

I guess one possibility is to compute, for each experiment, the ratio of the obtained recognition rate to the random (chance) recognition rate, which depends on the number of classes:
- a recognition rate of 50% for 2 classes would give 1 (50%/50%);
- a recognition rate of 50% for 4 classes would give 2 (50%/25%).
So this would lead to the conclusion that the second system performs better.
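
To make the arithmetic concrete, here is a minimal Python sketch of that ratio. The function name ratio_to_chance is made up for illustration, and I am assuming the random recognition rate is simply 1/N (uniform guessing over N equally likely classes), which may not hold for unbalanced data.

```python
def ratio_to_chance(rate, n_classes):
    """Ratio of a recognition rate (in 0..1) to the chance rate.

    Assumption: chance level is uniform guessing, i.e. 1 / n_classes.
    """
    if n_classes < 2:
        raise ValueError("need at least 2 classes")
    chance = 1.0 / n_classes
    return rate / chance


# The two cases from the list above: 50% with 2 classes, 50% with 4 classes
print(ratio_to_chance(0.5, 2))
print(ratio_to_chance(0.5, 4))
```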

However, this measure has the drawback that it favors experiments with a large number of classes: the ratio can never exceed the number of classes, so a 2-class problem will never exceed a ratio of 2 (100%/50%)!
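
The ceiling can be sketched the same way. Again assuming a chance rate of 1/N, a perfect recognition rate of 100% gives a ratio of exactly N, so the best reachable score grows with the number of classes. The helper name max_ratio_to_chance is hypothetical.

```python
def max_ratio_to_chance(n_classes):
    """Largest possible ratio to chance for an n_classes problem.

    Assumption: chance level is 1 / n_classes, so a perfect rate of 1.0
    divided by chance equals n_classes.
    """
    if n_classes < 2:
        raise ValueError("need at least 2 classes")
    return float(n_classes)


# The ceiling differs per experiment, so the ratios are not on a common scale
for n in (2, 4, 10):
    print(n, "classes -> maximum ratio", max_ratio_to_chance(n))
```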

Geoffroy Peeters