demonstration

LFM-2b: A Dataset of Enriched Music Listening Events for Recommender Systems Research and Fairness Analysis

Authors:
Markus Schedl
Institute of Computational Perception / Multimedia Mining and Search Group, Johannes Kepler University Linz, Austria and AI Lab / Human-centered AI Group, Linz Institute of Technology, Austria

Stefan Brandl
Institute of Computational Perception / Multimedia Mining and Search Group, Johannes Kepler University Linz, Austria and AI Lab / Human-centered AI Group, Linz Institute of Technology, Austria

Oleg Lesota
Institute of Computational Perception / Multimedia Mining and Search Group, Johannes Kepler University Linz, Austria and AI Lab / Human-centered AI Group, Linz Institute of Technology, Austria

Emilia Parada-Cabaleiro
Institute of Computational Perception / Multimedia Mining and Search Group, Johannes Kepler University Linz, Austria and AI Lab / Human-centered AI Group, Linz Institute of Technology, Austria

David Penz
Institute of Computational Perception / Multimedia Mining and Search Group, Johannes Kepler University Linz, Austria and Machine Learning Unit, TU Wien, Austria

Navid Rekabsaz
Institute of Computational Perception / Multimedia Mining and Search Group, Johannes Kepler University Linz, Austria and AI Lab / Human-centered AI Group, Linz Institute of Technology, Austria

CHIIR '22: Proceedings of the 2022 Conference on Human Information Interaction and Retrieval, March 2022, Pages 337–341. https://doi.org/10.1145/3498366.3505791

Published: 14 March 2022

ABSTRACT

We present the LFM-2b dataset, which contains the listening records of over 120,000 users of the music platform Last.fm. These users contributed more than two billion individual listening events spanning over 15 years, from February 2005 to March 2020. The listening events refer to a total of 50 million distinct tracks by 5 million distinct artists. Besides the common metadata (i.e., artist and track name), LFM-2b contains additional information about both users and items: the users' demographic information (country, gender, and age), and the fine-grained genre and style of items together with vector embeddings of their lyrics.

LFM-2b is a rich dataset that enables research on a variety of recommender system algorithms, including those based on collaborative filtering (e.g., leveraging user–item interactions in the form of listening events), content-based approaches (e.g., exploiting genres and lyrics), and hybrid combinations thereof. Users' demographic information furthermore enables experimentation on identifying and mitigating various data and algorithmic biases of recommender systems, and on investigating fairness aspects of such systems, e.g., with respect to gender.
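As a rough illustration of the kind of analysis described above, the following minimal sketch aggregates raw listening events into user–track play counts and then compares, per user group, how popular the consumed tracks are. The records, the user identifiers, the group labels, and all function names are hypothetical toy values chosen for this example; they do not reflect the actual file layout or contents of LFM-2b, and the popularity measure is only one simple choice among many.

```python
from collections import Counter, defaultdict

# Toy listening events as (user_id, track_id) pairs.
# Hypothetical records for illustration only, not LFM-2b data.
events = [
    ("u1", "t1"), ("u1", "t1"), ("u1", "t2"),
    ("u2", "t1"), ("u2", "t3"),
    ("u3", "t1"), ("u3", "t1"), ("u3", "t1"),
]

# Hypothetical user attribute table; the labels are placeholders.
gender = {"u1": "f", "u2": "m", "u3": "m"}


def play_counts(events):
    """Aggregate raw listening events into (user, track) -> play count."""
    return Counter(events)


def track_popularity(counts):
    """Count the number of distinct listeners per track."""
    popularity = Counter()
    for (_user, track) in counts:
        popularity[track] += 1
    return popularity


def mean_popularity_by_group(counts, groups):
    """For each group, average the play-weighted track popularity of its users."""
    popularity = track_popularity(counts)
    weighted = defaultdict(float)
    plays = defaultdict(int)
    for (user, track), n in counts.items():
        weighted[user] += n * popularity[track]
        plays[user] += n
    per_group = defaultdict(list)
    for user in plays:
        per_group[groups[user]].append(weighted[user] / plays[user])
    return {g: sum(vals) / len(vals) for g, vals in per_group.items()}


counts = play_counts(events)
result = mean_popularity_by_group(counts, gender)
print(result)
```

The play counts form the implicit-feedback user–item interactions that a collaborative filtering model would consume, while the per-group averages give a crude first look at whether different user groups are exposed to differently popular items.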

Index Terms

LFM-2b: A Dataset of Enriched Music Listening Events for Recommender Systems Research and Fairness Analysis
1. Applied computing
  1. Arts and humanities
    1. Sound and music computing
2. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Information extraction
      2. Recommender systems
  2. Information systems applications

Published in
CHIIR '22: Proceedings of the 2022 Conference on Human Information Interaction and Retrieval
March 2022
399 pages
ISBN: 9781450391863
DOI: 10.1145/3498366
General Chairs:
David Elsweiler, University of Regensburg, Bavaria, Germany
Udo Kruschwitz, University of Regensburg, Bavaria, Germany
Bernd Ludwig, University of Regensburg, Bavaria, Germany
Copyright © 2022 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
auto-tagging
bias
classification
dataset
experimentation
fairness
music information retrieval
recommender systems
user modeling
Qualifiers
- demonstration
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 55 of 163 submissions, 34%