
Re: MSD: announcing the *Last.fm dataset*



Hi Thierry
 
Attached are two papers on improving scalable, content-based search over a large audio database.
PS: I am at NUS in Singapore. If you have any questions, please do not hesitate to contact me.
Congratulations on your job. 
 
Thanks
 
Yi Yu
On Fri, Oct 21, 2011 at 12:12 AM, Thierry Bertin-Mahieux <tb2332@xxxxxxxxxxxx> wrote:
The Million Song Dataset (MSD) team is proud to partner with Last.fm to announce a new complementary dataset: the Last.fm dataset. It contains song-level tags and song-to-song similarity. And it's big (i.e. BIG)!
http://labrosa.ee.columbia.edu/millionsong/lastfm

A few numbers:

   * 943,347 matched tracks MSD <-> Last.fm
   * 505,216 tracks with at least one tag
   * 584,897 tracks with at least one similar track
   * 522,366 unique tags
   * 8,598,630 (track - tag) pairs
   * 56,506,688 (track - similar track) pairs

We thank Last.fm (http://www.last.fm/) for making this data available; it is the largest addition to the MSD so far. We are convinced that its impact on music information retrieval will be considerable.

As always, we appreciate any feedback! For instance, my favorite tag so far is "Acid Smurfs". A few additional notes on the MSD:
- we are working on some additional data regarding collaborative filtering, more on this at ISMIR
- we turned the CAL500 and CAL10K datasets into MSD format (http://bit.ly/oyBCwQ)
- please consider attending our tutorial at ISMIR (http://bit.ly/pSwlEA)


Happy swimming in data!
Thierry Bertin-Mahieux
Million Song Dataset team
http://labrosa.ee.columbia.edu/millionsong/

Attachment: MM09.pdf
Description: Adobe PDF document

Attachment: MM10.pdf
Description: Adobe PDF document