Re: A piano is a piano is...

 > > If one is going to challenge, from an ethnomusicological point of
 > > view, the idea that an acoustic piano, as opposed to an electronic
 > > piano, is the "real" piano
 > ... I suppose the main
 > issue is, how significant is the difference, to the human mind, of
 > various sounds of the piano-type.

Another factor applicable when hearing a known recorded
sound through a "poor medium", such as a small radio, is
"known transformation compensation": we tend to "hear
through" familiar transformations (such as "weak bass") and
reconstruct the sound internally to some extent from memory.
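
To make the "weak bass" transformation concrete, here is a
minimal sketch that models a small radio as nothing more than
a first-order high-pass filter. Everything specific in it is
an assumption chosen for illustration, not a measurement of
any real device: the 300 Hz cutoff, the 8 kHz sampling rate,
the two test tones (55 Hz and 1760 Hz), and the helper names
one_pole_highpass, sine, and rms.

```python
import math

def one_pole_highpass(x, cutoff_hz, sample_rate_hz):
    """First-order (RC-style) high-pass filter.

    A crude, assumed stand-in for the weak bass of a small
    speaker; a real radio's response is far more complicated.
    """
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    a = rc / (rc + dt)
    y = []
    prev_x = 0.0
    prev_y = 0.0
    for sample in x:
        # y[n] = a * (y[n-1] + x[n] - x[n-1])
        out = a * (prev_y + sample - prev_x)
        y.append(out)
        prev_x, prev_y = sample, out
    return y

def sine(freq_hz, sample_rate_hz, n):
    """n samples of a unit-amplitude sine wave."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate_hz)
            for i in range(n)]

def rms(x):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

fs = 8000                      # assumed sampling rate in Hz
n = 8000                       # one second of signal
low = sine(55.0, fs, n)        # a low fundamental
high = sine(1760.0, fs, n)     # a much higher tone

# Ratio of output level to input level for each tone.
low_gain = rms(one_pole_highpass(low, 300.0, fs)) / rms(low)
high_gain = rms(one_pole_highpass(high, 300.0, fs)) / rms(high)
print("low-tone gain:", low_gain)
print("high-tone gain:", high_gain)
```

The low tone comes out strongly attenuated while the high
tone passes nearly intact. The sketch models only the
transformation itself; the listener's compensation for it is
the part that no such filter captures.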

This effect is also active when hearing sounds in
reverberant environments. We not only build a neural
recognition apparatus for familiar sounds, but we also learn
to partially compensate for systematic distortions that we
are able to hear applied to many sounds.  As everyone knows,
various interesting auditory and optical illusions rely on
this sort of "reconstruction in the mind" from partial
stimuli.

Thus, the question "What sounds real?" is tied
not only to the listener's entire life history of hearing
examples of a particular sound source, but also to a lifetime
of hearing how sounds can be modified in predictable ways
by reverberant environments, narrowband media, and the like.
Each individual will have a somewhat different "threshold
of recognition" for various combinations of stimulus and
distortion.

As a result of these considerations, it would seem the
safest things to measure are simple just-noticeable
differences (JNDs).  However, it would
be great if we could also define "quality equivalence
classes".  At CCRMA, we often like to say that our synthesis
models are "musically equivalent" to the original recording,
and leave it at that.  (As perhaps in Jim Beauchamp's
example, often the most salient difference is that the noise
is removed.)  We generally try only to verify that the
distortions caused by our synthesis techniques do not
detract from the musicality or recognizability of the
instrument tone, in the opinion of most listeners.  The
difference is often analogous to the difference between
computer graphics and film recordings: those dinosaurs
look very real, but suspiciously perfect and a little
too "glossy".

As audio coding converges more and more toward synthesis
technology, "perceptual distortion measures" will become
more and more a matter of opinion.  Ultimately, we may have
to rely more and more on "critics" such as we now have for
movies and restaurants.

Julius Smith
CCRMA, Stanford

