Re: [AUDITORY] Question: same/different judgments across domains.



Hi Max,

I looked at this a bit in grad school, particularly with very brief sounds, though mostly focusing on onsets because I was interested in getting at “when” listeners can recognize what they hear and subsequently engage any potentially different listening strategies (i.e., in the real world you more often hear and recognize a sound from what is basically its onset than by dropping into the middle of an acoustic event).

Anyway, I think the thread raises some very good points - I’d just add that it sort of depends on what question you (they) are asking. I kept it fairly high level. At around 25 ms listeners can only barely tell different sound classes apart. But I think by 250 ms you do have different listening strategies, and the same acoustic dimension can carry different kinds of information for different classes, so it depends on what you’re interested in (e.g., pitch is more variable within a given vowel and can cue different speakers or emotions, often doesn’t vary as much within an instrument note and is not as useful for identifying instruments, and is basically absent for many noisy environmental sounds). So IMO the trickier thing in limited time windows is controlling things so the comparisons are meaningful for your question, because in my experience there’s always a bit of compromise here due to how different those sound classes are. Note that speech, I think, is interesting and tricky here because it’s particularly slippery: it’s acoustically rich and variable from moment to moment.

Anyhow, since you asked for some recs, here are links to a few papers of mine that dig into this and could be helpful - all looking at slightly different questions with multiple sound classes on limited time scales. Perhaps there’s a better way to treat some of these issues, but this general approach seemed like a fairly straightforward starting place to me:

https://asa.scitation.org/doi/abs/10.1121/1.5014057

https://direct.mit.edu/jocn/article/32/1/111/95406/The-Rapid-Emergence-of-Auditory-Object

(Follow up to the two previous should be on some arxiv soonish? Whenever I can get around to it! heh)

https://www.frontiersin.org/articles/10.3389/fpsyg.2019.01594/full

https://www.sciencedirect.com/science/article/abs/pii/S1053811919300813?via%3Dihub





On Sun, May 9, 2021 at 12:30 AM Jan Schnupp <000000e042a1ec30-dmarc-request@xxxxxxxxxxxxxxx> wrote:
Same/different judgments are always a bad idea. Unless stimuli are actually identical, they are not the same, so the observer has to make some sort of "close enough" judgment which always involves a bit of a fudge in their minds. Much better to play 3 sounds and ask which was the odd one out, or two pairs and ask which pair was more different. In those cases you have a much more unambiguous way of declaring a response objectively correct or incorrect. There is no internal "close enough" criterion that may vary from subject to subject or from domain to domain. Playing with duration is tricky. Certain categories of sounds have characteristic temporal envelopes and if you make them "much shorter than they should be" then they are no longer good representatives of their domain or category.
Good luck with your experiment. 
Jan 


On Sat, May 8, 2021, 12:34 PM Max Henry <max.henry@xxxxxxxxxxxxxx> wrote:
Hi folks. Long time listener, first time caller...

Some friends of mine are setting up an experiment with same/different judgements between pairs of sounds. They want to test sounds from a variety of domains: speech, music, natural sounds, etc.

One of the researchers suggested that listeners will have different listening strategies depending on the domain, and this might pose a problem for the experiment: our sensitivity to differences in pitch, for example, might be very acute for musical sounds but much less so for speech sounds.

I have a hunch that if the stimuli were short enough, this might sidestep the problem. I.e., if I played you 250 milliseconds of speech, or 250 milliseconds of music, you would not necessarily use any particular domain-specific listening strategy to tell the difference. It would simply be “sound.”

I suspect this is because a sound that’s sufficiently short can stay entirely in echoic memory. For longer sounds, you have to consolidate the information somehow, and the way that you consolidate it has to do with the kind of domain it falls into. For speech sounds, we can throw away the acute pitch information.

But that’s just a hunch. I’m wondering if this rings true for any of you, that is to say, if it reminds you of any particular research. I’d love to read about it.

It's been a pleasure to follow these e-mails. I'm glad to finally have an excuse to write. Wishing you all well.

Max Henry (he/his)
Graduate Researcher and Teaching Assistant
Music Technology Area
McGill University.