ALF-200k: Towards Extensive Multimodal Analyses of Music Tracks and Playlists


In recent years, approaches in music information retrieval have been based on multimodal analyses of music incorporating audio as well as lyrics features. Because most of those approaches are lacking reusable, high-quality datasets, in this work we propose ALF-200k, a publicly available, novel dataset including 176 audio and lyrics features of more than 200,000 tracks and their attribution to more than 11,000 user-created playlists. While the dataset is of general purpose and thus, may be used in experiments for diverse music information retrieval problems, we present a first multimodal study on playlist features and particularly analyze, which type of features are shared within specific playlists and thus, characterize it. We show that while acoustic features act as the major glue between tracks contained in a playlists, also lyrics features are a powerful means to attribute tracks to playlists.

Advances in Information Retrieval - 39th European Conference on IR Research, ECIR 2018