Re: [AUDITORY] Question: same/different judgments across domains. (Mattson Ogg)


Subject: Re: [AUDITORY] Question: same/different judgments across domains.
From:    Mattson Ogg <mattson.ogg@xxxxxxxx>
Date:    Sun, 9 May 2021 11:14:30 -0400

Hi Max,

I looked at this a bit in grad school, particularly with very brief sounds, though mostly focusing on onsets because I was interested in getting at "when" listeners can recognize what they hear and subsequently engage any potentially different listening strategies (i.e., in the real world you more often hear/recognize something during what is basically a sound onset than by dropping in on the middle of an acoustic event).

Anyway, I think the thread raises some very good points - I'd just add that it sort of depends on what question you (they) are asking. I kept it fairly high level. At around 25 ms listeners can only barely tell different sound classes apart. But I think by 250 ms you do have different listening strategies, and the same acoustic dimension can carry different kinds of information for different classes, so it depends on what you're interested in (e.g., pitch is more variable within a given vowel and can cue different speakers or emotions, often doesn't vary as much within an instrument note and is not as useful for identifying instruments, and is basically absent for many noisy environmental sounds). So IMO the trickier thing in limited time windows is controlling things so the comparisons are meaningful for your question, because in my experience there's always a bit of compromise here due to how different those sound classes are. Note that speech is interesting and tricky here because it's particularly slippery: it's acoustically rich and variable from moment to moment.

Anyhow, since you asked for some recs, here are links to a few papers of mine that dig into this and could be helpful - all looking at slightly different questions with multiple sound classes on limited time scales. Perhaps there's a better way to treat some of these issues, but this general approach seemed like a fairly straightforward starting place to me:

https://asa.scitation.org/doi/abs/10.1121/1.5014057

https://direct.mit.edu/jocn/article/32/1/111/95406/The-Rapid-Emergence-of-Auditory-Object

(A follow-up to the two previous should be on some arxiv soonish? Whenever I can get around to it! heh)

https://www.frontiersin.org/articles/10.3389/fpsyg.2019.01594/full

https://www.sciencedirect.com/science/article/abs/pii/S1053811919300813?via%3Dihub

On Sun, May 9, 2021 at 12:30 AM Jan Schnupp <000000e042a1ec30-dmarc-request@xxxxxxxx> wrote:

> Same/different judgments are always a bad idea. Unless stimuli are
> actually identical, they are not the same, so the observer has to make some
> sort of "close enough" judgment, which always involves a bit of a fudge in
> their minds. Much better to play 3 sounds and ask which was the odd one
> out, or two pairs and ask which pair was more different. In those cases you
> have a much less ambiguous way of declaring a response objectively
> correct or incorrect. There is no internal "close enough" criterion that
> may vary from subject to subject or from domain to domain. Playing with
> duration is tricky. Certain categories of sounds have characteristic
> temporal envelopes, and if you make them "much shorter than they should be"
> then they are no longer good representatives of their domain or category.
> Good luck with your experiment.
> Jan
>
>
> On Sat, May 8, 2021, 12:34 PM Max Henry <max.henry@xxxxxxxx> wrote:
>
>> Hi folks. Long time listener, first time caller...
>>
>> Some friends of mine are setting up an experiment with same/different
>> judgements between pairs of sounds. They want to test sounds from a variety
>> of domains: speech, music, natural sounds, etc.
>>
>> One of the researchers suggested that listeners will have different
>> listening strategies depending on the domain, and this might pose a problem
>> for the experiment: our sensitivity to differences in pitch, for example,
>> might be very acute for musical sounds but much less so for speech sounds.
>>
>> I have a hunch that if the stimuli were short enough, this might sidestep
>> the problem. I.e., if I played you 250 milliseconds of speech, or 250
>> milliseconds of music, you would not necessarily use any particular
>> domain-specific listening strategy to tell the difference. It would simply
>> be "sound."
>>
>> I suspect this is because a sound that's sufficiently short can stay
>> entirely in echoic memory. For longer sounds, you have to consolidate the
>> information somehow, and the way that you consolidate it has to do with the
>> kind of domain it falls into. For speech sounds, we can throw away the
>> acute pitch information.
>>
>> But that's just a hunch. I'm wondering if this rings true for any of you,
>> that is to say, if it reminds you of any particular research. I'd love to
>> read about it.
>>
>> It's been a pleasure to follow these e-mails. I'm glad to finally have an
>> excuse to write. Wishing you all well.
>>
>> *Max Henry* (he/his)
>> Graduate Researcher and Teaching Assistant
>> Music Technology Area
>> *McGill University*.
>> www.linkedin.com/in/maxshenry
>> github.com/maxsolomonhenry
>> www.maxhenrymusic.com/
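Jan's odd-one-out suggestion is easy to pilot before committing to a full design. Below is a minimal sketch of the trial bookkeeping in plain Python; the stimulus labels and the playback step are placeholders for illustration only, not part of any of the studies linked above.

    import random

    def make_oddity_trial(standard, deviant):
        """Return (stimulus_order, correct_position) for one three-interval trial."""
        odd_position = random.randrange(3)      # 0, 1, or 2
        order = [standard] * 3
        order[odd_position] = deviant
        return order, odd_position

    def score_block(trials, responses):
        """Proportion of trials on which the listener picked the odd interval."""
        hits = sum(resp == correct for (_, correct), resp in zip(trials, responses))
        return hits / len(trials)

    if __name__ == "__main__":
        # Hypothetical stimulus labels; in a real experiment these would be
        # file names or arrays handed to an audio playback routine.
        trials = [make_oddity_trial("speech_A.wav", "speech_B.wav") for _ in range(20)]
        guesses = [random.randrange(3) for _ in trials]   # chance performance is about 1/3
        print("proportion correct:", score_block(trials, guesses))

Because the odd interval's position is randomized and recorded per trial, every response is objectively correct or incorrect, which is exactly the property Jan points to.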
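On the duration point (Jan's warning about temporal envelopes and Max's 250 ms idea), one common-sense precaution when truncating stimuli is to add short onset/offset ramps so the cut itself does not become a cue. A generic numpy sketch, assumed parameters and not taken from any of the papers above:

    import numpy as np

    def gate(signal, fs, duration_s=0.25, ramp_s=0.01):
        """Truncate `signal` to duration_s seconds and apply raised-cosine ramps."""
        n = int(round(duration_s * fs))
        out = np.asarray(signal[:n], dtype=float).copy()
        r = int(round(ramp_s * fs))
        ramp = 0.5 * (1.0 - np.cos(np.linspace(0.0, np.pi, r)))   # rises from 0 to 1
        out[:r] *= ramp            # fade in
        out[-r:] *= ramp[::-1]     # fade out
        return out

    # Example: gate one second of noise down to a 250 ms snippet at 44.1 kHz.
    fs = 44100
    snippet = gate(np.random.randn(fs), fs)
    print(len(snippet))   # 11025 samples = 250 ms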


This message came from the mail archive
src/postings/2021/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University