/lyrics-content-features

Additional material for the paper “More Than Words: Linking Music Preferences And Moral Values Through Lyrics”.

Lyrics Feature Modelling: Sentiments, Emotion Associations, Morals, and Topics

This directory contains the lyrics data and the code for lyrical feature modeling from our paper, which has been accepted to the 23rd International Society for Music Information Retrieval Conference (ISMIR 2022). Due to privacy concerns, we share only part of the implementation discussed in the paper.

Install:

To reuse this repo, install the required libraries:

pip install -r requirements.txt

Data:

In this directory, we have added the initial artist page names that were liked on Facebook by the LikeYouth participants (read more about this data here). We also provide the best LDA model file we obtained and the topic visualisation (.html) file.

To access the artists' lyrics and the lyrics' content features, download the data as a zip file from Lyrics_annotated_data, hosted in the OSF directory. After cloning or downloading the repo, add the .csv files to this directory to re-run the experiments we described. Starting from the initial artist_lyrics_initial_dt.csv and re-running the scripts provided here, you should be able to reproduce the artist_lyrics_annot_vader_nrc_moralStrength_lda_final_dt.csv file we shared.
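
Before re-running the notebooks, it can help to confirm that the downloaded .csv file is where the scripts expect it. The following is a minimal sketch and not part of the repository: the helper name `missing_data_files` is hypothetical, and it assumes the files sit directly in the directory you pass in. Only the file name itself comes from the description above.

```python
from pathlib import Path

# File name taken from the description above; the directory layout is an assumption.
INITIAL_CSV = "artist_lyrics_initial_dt.csv"


def missing_data_files(data_dir, expected=(INITIAL_CSV,)):
    """Return the expected file names that are not present in data_dir."""
    root = Path(data_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    # Check the current working directory (assumed to be the repo root).
    missing = missing_data_files(".")
    if missing:
        print("Missing data files:", ", ".join(missing))
    else:
        print("All expected data files found.")
```

Pass additional file names through `expected` if your copy of the scripts reads more than the initial file.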

Implementation:

The notebooks directory contains the Jupyter notebooks for obtaining the lyrics features: sentiment analysis with VADER, emotion associations with the NRC lexicon, moral scores with the MoralStrength lexicon, and lyrics topics using LDA topic modeling. The py_scripts folder contains the Python code for lyrics scraping using the Genius API, cleaning and preprocessing, and language detection using spaCy.
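
Lexicon-based features such as the emotion associations are typically computed by looking up each token of a lyric in a word-to-category lexicon and counting the matches. The sketch below illustrates that general idea only. It uses a tiny made-up lexicon rather than the real NRC file, and the `emotion_profile` name and the normalisation by token count are illustrative assumptions that may differ from what the notebooks actually do.

```python
import re
from collections import Counter

# Toy lexicon for illustration only; these are NOT entries from the NRC lexicon.
TOY_LEXICON = {
    "love": {"joy", "trust"},
    "cry": {"sadness"},
    "fight": {"anger", "fear"},
}


def emotion_profile(lyrics, lexicon=TOY_LEXICON):
    """Return, per emotion, the share of tokens associated with it."""
    # Lowercase and keep simple word-like tokens (assumed tokenisation).
    tokens = re.findall(r"[a-z']+", lyrics.lower())
    counts = Counter()
    for token in tokens:
        for emotion in lexicon.get(token, ()):
            counts[emotion] += 1
    total = len(tokens)
    if total == 0:
        return {}
    return {emotion: n / total for emotion, n in counts.items()}
```

Calling `emotion_profile(text)` on a lyric string returns a dictionary mapping each matched emotion to a value between 0 and 1; emotions with no matching tokens are simply absent from the result.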