/REMI-z

The REMI-z tokenizer for MIDI

Primary LanguagePythonMIT LicenseMIT

REMI-z tokenizer and MultiTrack music data structure

This is the official implementation of the REMI-z tokenizer in the paper Unlocking Potential in Pre-Trained Music Language Models for Versatile Multi-Track Music Arrangement.

[github | PyPI | author]

This tool helps to convert your music between MIDI and REMI-z representation, which is an efficient sequence representation of multitrack music, meanwhile facilitate manipulate the music at bar level.

The core of this tokenizer is the MultiTrack class as the data structure for multitrack music, which is a hierachical format. Here are the structural details:

  • The music is represented by an MultiTrack object, which is list of bars.
    • Each Bar object represents all notes being played within one bar, grouped by Track object, together with time signature and tempo info of this bar.
      • Each Track object represents one instrument, contatining notes of that instrument in this bar.
        • Each Note object represent one note, including onset, offset, pitch, velocity information.

This Multitrack object can be create from various formats (e.g., MIDI or REMI-z), and convert into various formats (e.g., MIDI, and REMI-z representation).

Install

Install from pip

pip install REMI-z

Install from source

git clone https://github.com/Sonata165/REMI-z.git
cd REMI-z
pip install -r requirements.txt
pip install .

Usage

Please refer to the demo.ipynb.