Source Codes for Chroma/Chord-based Large-Scale Audio Indexing/Hashing (Yi Yu)


Subject: Source Codes for Chroma/Chord-based Large-Scale Audio Indexing/Hashing
From:    Yi Yu  <yi.yu.yy@xxxxxxxx>
Date:    Tue, 7 Oct 2014 19:22:51 +0800
List-Archive:<http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

Dear colleagues,

Source code for Chroma/Chord-based Large-Scale Audio Indexing/Hashing has now been released, together with some example music datasets. More datasets can be provided on request. You can run the programs directly by following the instructions in "readme.txt", and you are free to modify them for scientific experiments. Brief descriptions follow:

1. Indexing Based on Low-Level Music Features

A multi-probe histogram (MPH) is computed from each audio track, based on a statistical representation of short-time sequential correlations in the acoustic signal. It is meant to provide a better balance between scalability, robustness, and discrimination ability. The idea of locality-sensitive hashing (LSH) is applied to compute the MPH from a sequence of chroma features. With high probability, the major MPH bins of an audio track are among the top-n major MPH bins of its variants. Based on an analysis of the order statistics (OS) of MPH bins, an adapted LSH approach is suggested to map MPHs to hash values.
(Download at http://www.comp.nus.edu.sg/~yuy/MPH.zip)

2. Indexing Based on Mid-Level Music Attributes

Chord progressions (CPs) are exploited to achieve accurate and meaningful summarization of music content and efficient organization of the database. The SVMhmm model is adopted: SVM for per-feature chord recognition, and HMM for CP recognition. Through a modified Viterbi algorithm, N-best CPs are locally probed to generate a simple and descriptive chord progression histogram (CPH). Organizing songs in a layered tree structure further helps alleviate potential imbalance among buckets.
(Download at http://www.comp.nus.edu.sg/~yuy/CPH.zip)

Best regards,
Yi Yu
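
To make the top-n property in item 1 concrete, here is a minimal sketch. It is not the released MPH code. It assumes, purely for illustration, that each 12-dimensional chroma frame is reduced to its dominant pitch class and that a histogram bin is a pair of consecutive pitch classes. The function names and the toy data are hypothetical.

```python
from collections import Counter


def dominant_pitch_class(frame):
    """Index of the strongest of the 12 chroma values (assumed quantization)."""
    return max(range(len(frame)), key=lambda i: frame[i])


def pair_histogram(chroma_frames):
    """Count consecutive pitch-class pairs as a stand-in for MPH bins."""
    classes = [dominant_pitch_class(f) for f in chroma_frames]
    return Counter(zip(classes, classes[1:]))


def major_bins(hist, n):
    """Return the n bins with the largest counts (ties broken by bin value)."""
    ranked = sorted(hist.items(), key=lambda kv: (-kv[1], kv[0]))
    return {bin_id for bin_id, _ in ranked[:n]}


def one_hot(pitch_class):
    """Toy chroma frame with all energy in one pitch class."""
    frame = [0.0] * 12
    frame[pitch_class] = 1.0
    return frame


# A toy track and a slightly different variant of it (hypothetical data).
track = [one_hot(pc) for pc in [0, 4, 7, 0, 4, 7, 0, 4, 5]]
variant = [one_hot(pc) for pc in [0, 4, 7, 0, 4, 9, 0, 4, 7]]

track_major = major_bins(pair_histogram(track), 2)
variant_top = major_bins(pair_histogram(variant), 4)

# The track's major bins fall inside the variant's top-n bins here.
print(track_major <= variant_top)
```

In this toy case the two major bins of the track are contained in the top four bins of the variant, which is the kind of overlap an order-statistics-based hash could exploit.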
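
The chord progression histogram in item 2 can be illustrated in a similar way. The sketch below is not the released CPH code and does not implement SVMhmm or the modified Viterbi algorithm. It assumes that some recognizer has already produced a few candidate chord sequences (the N-best list), and it treats a progression as two consecutive distinct chords. All names and chord labels are hypothetical.

```python
from collections import Counter
from itertools import groupby


def collapse_repeats(chords):
    """Merge runs of the same chord label into a single chord."""
    return [chord for chord, _ in groupby(chords)]


def progression_histogram(candidate_sequences, length=2):
    """Count chord progressions of the given length over all candidates."""
    hist = Counter()
    for chords in candidate_sequences:
        seq = collapse_repeats(chords)
        for i in range(len(seq) - length + 1):
            hist[tuple(seq[i:i + length])] += 1
    return hist


# Two hypothetical N-best candidates for the same audio excerpt.
n_best = [
    ["C", "C", "F", "G", "G", "C"],
    ["C", "C", "F", "G", "Am", "C"],
]

cph = progression_histogram(n_best)
for progression, count in cph.most_common():
    print(progression, count)
```

Progressions shared by several candidates receive higher counts, so the histogram is less sensitive to an error in any single recognized sequence.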


This message came from the mail archive
http://www.auditory.org/postings/2014/
maintained by:
Dan Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University