Analysing the most important heavy metal albums in history.

The History of Heavy Metal

In this project we gather statistics from LastFM and Spotify to analyse and visualise the most influential Heavy Metal albums in history.

Motivation

This project aims to demonstrate good software engineering practices (versioning, testing, refactoring).

Getting started

Install

The code can be installed and run with or without a Conda environment. Both options are described below.

With a Conda environment

To use Conda, run the following commands. They create an environment named metalhistory (the name used by the conda activate command below), so make sure that no environment with that name already exists.

git clone https://github.com/ostromann/heavy_metal_history.git
cd heavy_metal_history
conda env create -f environment.yml
conda activate metalhistory

Without a Conda environment

To install without a Conda environment, clone the repository as shown above, then run the following commands:

cd heavy_metal_history
pip3 install -r requirements.txt

To create a virtual environment before installing the required packages, first run:

python3 -m venv /path/to/new/virtual/environment

Usage

The code in this repository can be divided into two groups of functions: data retrieval/pre-processing and visualization. We provide two Jupyter notebooks to demonstrate the usage of these functions.

To get familiar with the data retrieval/pre-processing, run:

jupyter notebook 0-preprocessing.ipynb

This notebook uses the functions in metalhistory/data_query_functions.py to create a CSV file of pre-processed data. We do not recommend running it for a large number of album entries, as it takes a long time. The pre-processed CSV file is included in the repository instead.

To see how to do some visualizations, run:

jupyter notebook 1-visualizations.ipynb

This notebook uses the functions in metalhistory/visualization_api.py to visualize the pre-processed data in different ways.

The notebooks can also be viewed in a browser by navigating to the notebook file in the repository, e.g. 1-visualizations.ipynb.

Testing

Run test routines with:

pytest -s

from the root directory. This command executes all the test routines contained in the tests folder. The -s option disables output capturing, so the output of any print() statement is shown on screen. To run a single test module, execute:

pytest -s metalhistory/tests/test_query_api.py

to test the data query functions, and

pytest -s metalhistory/tests/test_visualization_api.py

to test the visualization functions.

Folder structure

We use the following folder structure in this project:

  • heavy_metal_history, root of repository (includes notebooks)
    • data, raw and pre-processed data
    • images, generated visualizations and other images
    • metalhistory, source code
      • tests

Development Notes

Commit Convention

In general we follow the Conventional Commits 1.0.0 specification. We use the following commit types: feat, fix, docs, test.
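
As a rough illustration of this convention, the sketch below checks whether a commit subject line starts with one of the four types listed above. The function name and the regular expression are our own assumptions and not part of the repository. They cover only the basic `type(scope): description` shape of a Conventional Commits subject line.

```python
import re

# Commit types used in this project (taken from the list above).
ALLOWED_TYPES = ("feat", "fix", "docs", "test")

# Hypothetical pattern: type, optional "(scope)", optional "!", then ": description".
_COMMIT_RE = re.compile(
    r"^(?P<type>[a-z]+)(\((?P<scope>[^()\s]+)\))?(?P<breaking>!)?: (?P<desc>\S.*)$"
)


def is_valid_commit_subject(subject: str) -> bool:
    """Return True if the subject line has the expected shape and an allowed type."""
    match = _COMMIT_RE.match(subject)
    return bool(match) and match.group("type") in ALLOWED_TYPES
```

For example, `is_valid_commit_subject("feat(visualization): add word cloud")` is accepted, while a subject such as `"chore: bump deps"` is rejected because `chore` is not one of the types we use.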

Branch Convention

We draw some inspiration from Gitflow and use two permanent branches, namely master and stable. For each new feature we create a temporary branch off master named type/scope, where type is one of feature/fix/doc or similar, and scope is a brief name for the feature. The name is written in lowercase and words are separated by hyphens (-). An example branch name is feature/word-cloud-visualization.
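
The naming rule can be sketched as a small check. This is only an illustration: the function name is hypothetical, and the set of allowed types is an assumption, since the convention allows "similar" types beyond the three named here.

```python
import re

# Assumed set of branch types; the convention lists feature/fix/doc "or similar".
BRANCH_TYPES = ("feature", "fix", "doc")

# Lowercase words (letters or digits) separated by single hyphens.
_SCOPE_RE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def is_valid_branch_name(name: str) -> bool:
    """Check a branch name against the type/scope convention described above."""
    branch_type, sep, scope = name.partition("/")
    return sep == "/" and branch_type in BRANCH_TYPES and bool(_SCOPE_RE.match(scope))
```

With this check, `feature/word-cloud-visualization` passes, while `feature/Word-Cloud` (uppercase letters) and `word-cloud` (no type prefix) do not.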

When enough features have been implemented for a release, we merge the master branch into the stable branch and increment the release version.

Workpackages

Keep track of this project's development on this Trello board.

Datasets

Currently we have two lists of input data:

./data/artists_unfiltered.csv which contains a list of artists that have released at least one album that could be tagged as a subgenre of metal (see What counts as Heavy Metal?). Since a single such album is enough for an artist to be listed, the tags of each album should be checked before all albums of an artist are included.

./data/MA_10k_albums.csv which contains a list of the 10,000 albums, with their respective artists, that received the highest Metascores on Encyclopaedia Metallum: The Metal Archives.

Additionally, we have one preprocessed dataset that is ready for data analysis and visualizations:

./data/proc_MA_1k_albums.csv contains the first 1,000 albums of ./data/MA_10k_albums.csv with added information such as listeners, playcounts, tags, URLs, images, etc.
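
As a minimal sketch of the row selection step only, the following function copies the header and the first 1,000 data rows of a CSV file to a new file. It is not the code used in this repository: the function name is hypothetical, and the enrichment with listeners, playcounts, and tags is not shown.

```python
import csv
from itertools import islice


def copy_first_rows(src_path: str, dst_path: str, n_rows: int = 1000) -> int:
    """Copy the header and the first n_rows data rows; return the rows written."""
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader, None)
        if header is None:  # empty input file, nothing to copy
            return 0
        writer.writerow(header)
        written = 0
        # islice stops after n_rows without reading the rest of the file.
        for row in islice(reader, n_rows):
            writer.writerow(row)
            written += 1
        return written
```

A call such as `copy_first_rows("./data/MA_10k_albums.csv", "subset.csv")` would write the subset to `subset.csv`, a file name chosen here purely for illustration.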

Data Collection

The data will be collected using Spotify's Web API and LastFM's Web API.

The following features and limitations were already identified in the two APIs:

Spotify's Web API:

  • doesn't show playcounts
  • release years are often wrong (due to re-masters, special editions etc.)
  • gives only a popularity score measured against the most popular artists in general

LastFM's API:

  • shows playcounts and number of listeners
  • many different versions of the same album appear
  • release years are often wrong, too

To get the correct release years, we may need to use another source (Wikipedia?) or a different approach, such as taking the earliest release year ever listed for an album on LastFM.
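
The "earliest listed year" idea could look roughly like the sketch below. It assumes the different versions of an album have already been collected as (artist, album, year) records. The record layout, the function name, and the input in the usage example are made up for illustration and do not come from either API.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

# Each record is (artist, album_title, release_year); the year may be None if unknown.
Record = Tuple[str, str, Optional[int]]


def earliest_release_years(records: Iterable[Record]) -> Dict[Tuple[str, str], int]:
    """Map each (artist, album) to the lowest release year seen across versions."""
    years = defaultdict(list)
    for artist, album, year in records:
        if year is not None:
            # Normalise case and whitespace so trivially different spellings group together.
            years[(artist.strip().lower(), album.strip().lower())].append(year)
    return {key: min(values) for key, values in years.items()}


# Illustrative, made-up input: several versions of the same album.
versions = [
    ("Black Sabbath", "Paranoid", 2009),
    ("Black Sabbath", "Paranoid", 1970),
    ("black sabbath", "paranoid ", 1986),
    ("Black Sabbath", "Master of Reality", None),
]
print(earliest_release_years(versions))
# {('black sabbath', 'paranoid'): 1970}
```

Note that this simple grouping would not merge titles that differ by a suffix such as "(Remastered)". Handling those would need additional title clean-up.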

FAQ

Do I need an API key?

Right now you will need a personal API key for both Spotify's Web API and LastFM's Web API. Both are free but require registration (see Spotify Authentication and LastFM Authentication).
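
One common way to keep personal keys out of the code is to read them from environment variables. The sketch below shows that pattern. The variable names and the function are assumptions chosen for illustration and are not necessarily how the notebooks in this repository load their credentials.

```python
import os

# Hypothetical variable names; use whatever names your own setup defines.
REQUIRED_VARS = ("SPOTIFY_CLIENT_ID", "SPOTIFY_CLIENT_SECRET", "LASTFM_API_KEY")


def load_api_credentials(env=None):
    """Return the required credentials, raising a clear error if any are missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Failing early with the list of missing variables makes it easier to see which registration step is still outstanding.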

What counts as Heavy Metal?

To give a broad overview of the genre, all of the following subgenres of Heavy Metal are considered. The list is taken from Wikipedia, extended with entries from some of its sub-pages, and can still be extended:

  • Alternative metal

    • Funk metal
    • Nu metal
    • Rap metal

  • Avant-garde metal
  • Black metal

    • Ambient black metal
    • Blackened heavy metal
    • Blackened screamo
    • Blackgaze
    • Black'n'Roll
    • Depressive suicidal black metal
    • NSBM
    • Post-black metal
    • Red and Anarchist black metal
    • Symphonic black metal
    • Viking metal
    • War metal

  • Christian metal

    • Unblack metal

  • Crust punk

    • Blackened crust

  • Death metal

    • Blackened death metal
    • Death 'n' roll
    • Melodic death metal
    • Technical death metal
    • Symphonic death metal

  • Doom metal

    • Death-doom
    • Drone metal
    • Funeral doom
    • Sludge metal
    • Stoner metal

  • Extreme metal
  • Folk metal

    • Celtic metal
    • Pirate metal
    • Pagan metal

  • Glam metal

    • Hair metal
    • Pop metal

  • Gothic metal
  • Grindcore

    • Deathgrind
    • Goregrind
    • Pornogrind
    • Electrogrind

  • Grunge
  • Industrial metal

    • Industrial death metal
    • Industrial black metal

  • Kawaii metal
  • Latin metal
  • Metalcore

    • Melodic metal
    • Deathcore
    • Mathcore
    • Electronicore
    • Synthcore
    • Trancecore
    • Nu metalcore
    • Nu metal revival
    • New nu metal
    • Progressive metalcore
    • Technical metalcore
    • Ambient metalcore

  • Neoclassical metal / Shred metal
  • Neue Deutsche Härte
  • Post-metal
  • Power metal
  • Progressive metal

    • Djent
    • Space metal

  • Speed metal
  • Symphonic metal
  • Thrash metal

    • Crossover thrash
    • Groove metal
    • Teutonic thrash metal

  • Traditional heavy metal

    • New wave of British heavy metal (NWOBHM)
    • New wave of American heavy metal (NWOAHM)
    • New wave of traditional heavy metal (NWOTHM)
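
A list like the one above can be used to decide whether an album's tags qualify it as metal. The following is a minimal sketch of such a check. It uses only a small subset of the subgenres, and the function names and the normalisation rule (lowercasing, treating hyphens as spaces) are assumptions rather than the repository's actual logic.

```python
# A small, illustrative subset of the subgenres listed above.
METAL_SUBGENRES = {
    "black metal", "death metal", "doom metal", "thrash metal",
    "power metal", "progressive metal", "djent", "grindcore",
}


def normalise_tag(tag: str) -> str:
    """Lowercase a tag, treat hyphens as spaces, and collapse repeated whitespace."""
    return " ".join(tag.lower().replace("-", " ").split())


def is_metal_album(tags) -> bool:
    """Return True if any tag matches one of the known metal subgenres."""
    known = {normalise_tag(name) for name in METAL_SUBGENRES}
    return any(normalise_tag(tag) in known for tag in tags)
```

For instance, `is_metal_album(["rock", "Thrash Metal"])` returns `True`, while `is_metal_album(["pop", "jazz"])` returns `False`.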