[AUDITORY] Making Sense of Sounds Data Challenge (Bones Oliver)


Subject: [AUDITORY] Making Sense of Sounds Data Challenge
From:    Bones Oliver  <O.C.Bones@xxxxxxxx>
Date:    Thu, 9 Aug 2018 14:07:49 +0000
List-Archive:<http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

Dear List,

We hereby announce the "Making Sense of Sounds" (MSoS) Challenge:

http://cvssp.org/projects/making_sense_of_sounds/site/challenge/

The task in the MSoS Challenge is to classify audio files as belonging to one of five broad categories derived from human classification experiments: Nature, Human, Music, Effects, or Urban.

The MSoS Challenge has a development dataset of 1500 five-second audio files. Performance will be judged using an evaluation dataset of 500 audio files.

The results of the MSoS Challenge will be announced at the DCASE 2018 Workshop:

http://dcase.community/workshop2018/

For more information about the challenge and how to take part, see:

http://cvssp.org/projects/making_sense_of_sounds/site/challenge/

Important dates:

Challenge announcement and development data set release: 8 Aug 2018
Evaluation data set release: 1 Oct 2018
Submission open: 1 Oct 2018
Submission deadline: 30 Oct 2018
Results announced: 19/20 Nov 2018 (at DCASE 2018 Workshop)

Contact: MSoS.challenge@xxxxxxxx

We look forward to your submission!

Oliver Bones
On behalf of the MSoS Challenge organizers

Additional information:

Humans (with no hearing impairment) use sound in everyday life constantly to interpret their surrounding environment, refocus their attention, detect anomalies and communicate through language and vocal emotional expressions. They are able to identify a large number of sounds, e.g., the call of a bird, the noise of an engine, the cry of a baby, the sound of a string instrument. They are also capable of generalising from past experience to new sounds, e.g. recognising a dulcimer or a kora as a musical instrument despite having never heard this instrument before in their life. The MSoS data challenge calls for machine systems to attempt to replicate this human ability.

The task is to classify audio data as belonging to one of five broad categories, which were derived from human classification. In a psychological experiment at the University of Salford, participants were asked to categorise 60 sound types, chosen so as to represent the most commonly used search terms on Freesound.org. Five principal categories were identified by correspondence analysis and hierarchical cluster analysis of the human data:

Nature
Human
Music
Effects
Urban

Within each class the data for the task consists of varying sound types, e.g., different animals in the 'Nature' category or different instruments in the 'Music' category such as 'guitar' and 'mandolin'. Most of the sound types are represented by several instances themselves, coming from different recordings, e.g. different guitars. The machine classifier is therefore forced to reproduce a human capability to be successful: Humans are able to identify a hitherto unheard animal sound as belonging to an animal based upon previously established schemas, and a hitherto unheard musical instrument as a musical instrument, etc.

Full details can be found on the MSoS website:

http://cvssp.org/projects/making_sense_of_sounds/site/challenge/


This message came from the mail archive
src/postings/2018/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University