
MIDI Music Generation

Generate music using a language-modelling approach with LSTM neural networks. MIDI instructions are converted into a sequence of 'words', and the task is to predict the next word in the sequence, given the previous n words.

Requirements

tensorflow==1.4.1
Keras==2.0.8
midi==0.2.3
pygame==1.9.1
pandas==0.22.0
numpy==1.13.1
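
These pinned versions can be installed with pip, e.g. pip install -r requirements.txt (assuming the pins above are collected in a requirements.txt; adjust to however you manage dependencies).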

Dataset

The MIDI files used for training the model were downloaded from the Classical Piano Midi Page.

Model Description

MIDI is a simple binary protocol for communicating with electronic music equipment. A MIDI file carries instructions to the device (not the actual audio), such as Note-on, Note-off, and system messages.

For this analysis, the instructions for each piece were converted into a sequence of text tokens, as shown in midi-txt/. These sequences were then fed into an LSTM RNN, which uses the previous n tokens to predict the next token in the sequence (much like the language-modelling approach in NLP).
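
As an illustration, here is a minimal sketch of such a conversion using the python-midi package pinned above. The function name midi_to_tokens and the on_/off_ word scheme are hypothetical; the actual encoding produced by create_data.py may differ (see midi-txt/ for the real format):

    import midi  # the python-midi package (midi==0.2.3)

    def midi_to_tokens(path):
        # Read the MIDI file into a Pattern of Tracks of events.
        pattern = midi.read_midifile(path)
        tokens = []
        for track in pattern:
            for event in track:
                # Encode each note event as a single 'word', e.g.
                # 'on_60_80_0' = note-on, pitch 60, velocity 80, tick 0.
                if isinstance(event, midi.NoteOnEvent):
                    tokens.append('on_%d_%d_%d' % (event.pitch, event.velocity, event.tick))
                elif isinstance(event, midi.NoteOffEvent):
                    tokens.append('off_%d_%d' % (event.pitch, event.tick))
        return tokens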

Instructions

  1. Begin by creating the text sequences from the MIDI files:
python create_data.py
  2. Specify the model parameters in config.py and train the model by running:
python train.py

Note: On the assumption that a larger dataset would produce a better model, I first built the model with all the MIDI files. However, this generated very noisy audio and was not optimal in this case. Hence I used a random sample of 10 MIDI files, set via the num_files parameter in the config file. There are other parameters you can play around with, such as how many previous tokens the language model considers (prev_n_tokens) and the hidden size of the LSTM cell (rnn_size); both appear in the sketch after this list.

  3. Once training is done, a new audio file can be generated by running:
python generate.py
  4. The generated MIDI files are stored in the generated-midi directory and can be played with:
python play_midi.py
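
For orientation, below is a minimal sketch of the kind of next-token model that train.py could build with the pinned Keras version. The layer sizes and the names vocab_size, X and y are illustrative stand-ins for whatever config.py and the preprocessed data actually define:

    from keras.models import Sequential
    from keras.layers import Embedding, LSTM, Dense

    # Illustrative stand-ins for the settings in config.py.
    vocab_size = 5000      # number of distinct tokens in the corpus
    prev_n_tokens = 50     # how many previous tokens the model sees
    rnn_size = 256         # hidden size of the LSTM cell

    model = Sequential()
    model.add(Embedding(vocab_size, 128, input_length=prev_n_tokens))
    model.add(LSTM(rnn_size))
    model.add(Dense(vocab_size, activation='softmax'))
    model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')

    # X: int array of shape (num_sequences, prev_n_tokens)
    # y: int array of shape (num_sequences,) holding the id of the next token
    # model.fit(X, y, batch_size=64, epochs=20)

Generation then amounts to feeding the model a seed window of tokens, sampling the next token from the softmax output, appending it, and sliding the window forward one step at a time.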

Sample Audio

Below are a few music samples generated by this model:

Sample 1 | Sample 2 | Sample 3

Acknowledgements

The code for preprocessing the MIDI files has been borrowed from Tatsuya Hatanaka.