Data: MIDI data for songs in the Million Song Dataset can be obtained by downloading "LMD-matched" and "LMD-matched-metadata" from the Lakh MIDI Dataset.
Preprocessing:
- Look up the mood of each song by querying the Gracenote API with pygn.
- Discretize the MIDI data into sixteenth-note steps, convert it to piano-roll notation (a binary matrix of shape [nsamples x npitches]), and transpose it to C major or C minor.
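
The discretization and transposition step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes notes have already been read from the MIDI file as `(pitch, start_beat, end_beat)` tuples and that the tonic of each song is already known. The function names `transpose_to_c` and `to_piano_roll` and their parameters are hypothetical.

```python
def transpose_to_c(notes, tonic_pc):
    """Shift all pitches so the tonic pitch class (0 = C ... 11 = B) lands on C.

    The mode is unchanged, so a major song ends up in C major and a
    minor song in C minor. The smaller of the two possible shifts is used.
    """
    shift = -tonic_pc if tonic_pc <= 6 else 12 - tonic_pc
    return [(pitch + shift, start, end) for pitch, start, end in notes]


def to_piano_roll(notes, steps_per_beat=4, n_pitches=128):
    """Build a binary [nsamples x npitches] matrix as a list of rows.

    Assumption: times are in quarter-note beats, so steps_per_beat=4
    gives one row per sixteenth note. Onsets and offsets are rounded
    to the nearest step.
    """
    if not notes:
        return []
    last_end = max(end for _, _, end in notes)
    n_steps = max(1, round(last_end * steps_per_beat))
    roll = [[0] * n_pitches for _ in range(n_steps)]
    for pitch, start, end in notes:
        if not 0 <= pitch < n_pitches:
            continue  # skip pitches pushed out of range by transposition
        first = round(start * steps_per_beat)
        # every note occupies at least one step
        last = max(first + 1, round(end * steps_per_beat))
        for step in range(first, min(last, n_steps)):
            roll[step][pitch] = 1
    return roll


# Hypothetical example: two notes from a song in D major (tonic pitch class 2).
example_notes = [(62, 0.0, 1.0), (66, 1.0, 1.5)]  # D4 for one beat, F#4 for half a beat
roll = to_piano_roll(transpose_to_c(example_notes, tonic_pc=2))
```

Here `roll` has six rows (one and a half beats at four steps per beat) and 128 columns, with the D4 and F#4 moved down to C4 and E4. Reading notes from the MIDI files and estimating the key are left out of the sketch.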