Source Codes for Chroma/Chord-based Large-Scale Audio Indexing/Hashing (Yi Yu)


Subject: Source Codes for Chroma/Chord-based Large-Scale Audio Indexing/Hashing
From:    Yi Yu  <yi.yu.yy@xxxxxxxx>
Date:    Tue, 7 Oct 2014 18:47:48 +0800
List-Archive:<http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

Dear colleagues,

Source code for Chroma/Chord-based Large-Scale Audio Indexing/Hashing has now been released, together with some example music datasets. More datasets can be provided on request. You can run the programs directly by following the instructions in "readme.txt", and you are free to modify them for any scientific experiments. Brief descriptions are given below:

1. Indexing Based on Low-Level Music Features

Based on a statistical representation of acoustic short-time sequential correlations, a multi-probe histogram (MPH) is computed from each audio track, to provide a better balance between scalability, robustness and discrimination ability. The idea of locality sensitive hashing (LSH) is applied to compute the MPH from a sequence of chroma features. With high probability, the major MPH bins of an audio track are among the top-n major MPH bins of its variants. Based on an analysis of the order statistics (OS) of MPH bins, an adapted LSH approach is suggested to map MPHs to hash values.
(Download at http://www.comp.nus.edu.sg/~yuy/MPH.zip)

2. Indexing Based on Mid-Level Music Attributes

Chord progressions (CPs) are exploited to realize accurate and meaningful summarization of music content and efficient organization of the database. The SVMhmm model was adopted, with SVM for per-feature chord recognition and HMM for CP recognition. Through a modified Viterbi algorithm, N-best CPs are locally probed to generate a simple and descriptive chord progression histogram (CPH). Organizing songs in a layered tree structure further helps alleviate the potential imbalance among buckets.
(Download at http://www.comp.nus.edu.sg/~yuy/CPH.zip)

Best regards,

Yi Yu
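
For readers who want a rough picture of item 1 before downloading MPH.zip, here is a minimal sketch of the general idea: hash each chroma frame with LSH, count the hash codes in a histogram, and use the top-n bins as a lookup key. It is not the released code. The function names (make_hyperplanes, frame_code, histogram, hash_key), the use of sign-random-projection LSH, and the parameter values are all assumptions made for illustration, and the sketch leaves out both the sequential correlation between frames and the multi-probe step described above.

```python
import random


def make_hyperplanes(num_bits, dim=12, seed=0):
    """Draw num_bits random hyperplanes in chroma space (12 pitch classes).

    Hypothetical helper: the fixed seed only makes the sketch reproducible.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def frame_code(frame, hyperplanes):
    """Sign-random-projection LSH: one bit per hyperplane, packed into an int."""
    code = 0
    for plane in hyperplanes:
        dot = sum(p * x for p, x in zip(plane, frame))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def histogram(chroma_frames, hyperplanes):
    """Count how many frames fall into each of the 2**num_bits LSH bins."""
    hist = [0] * (1 << len(hyperplanes))
    for frame in chroma_frames:
        hist[frame_code(frame, hyperplanes)] += 1
    return hist


def hash_key(hist, n):
    """Use the indices of the n largest bins (ties: lower index first) as a key."""
    order = sorted(range(len(hist)), key=lambda i: (-hist[i], i))
    return tuple(sorted(order[:n]))


if __name__ == "__main__":
    rng = random.Random(1)
    planes = make_hyperplanes(num_bits=4)
    # Synthetic stand-in for a chroma sequence: 200 frames of 12 values each.
    track = [[rng.random() for _ in range(12)] for _ in range(200)]
    # A "variant" of the same track: every frame played at half the energy.
    variant = [[0.5 * x for x in frame] for frame in track]
    print(hash_key(histogram(track, planes), n=3))
    print(hash_key(histogram(variant, planes), n=3))
```

The two printed keys are identical because the sign of a projection does not change when a frame is scaled by a positive factor. Real variants (covers, different recordings) differ in more than loudness, which is presumably why the description above speaks of the major bins matching only with high probability.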


This message came from the mail archive
http://www.auditory.org/postings/2014/
maintained by:
Dan Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University