craffel/pretty-midi

PrettyMIDI.write error in align_midi.py (ValueError: attribute must be in range 0..16777215)

almostimplemented opened this issue · 4 comments

I've been using pretty_midi to align the audio and MIDI of the solo jazz piano performances from the RWC Music Database. I have run into a couple of problems, which I will split into two GitHub issues. (The other is #223.)

Note: I am aiming for fine-alignment, so my hop size is 64 (default is 512).

This issue is about an error while writing out the resulting aligned MIDI.

Command*:

python align_midi.py --midi_audio_file norm_RM-J003.wav 3.WAV norm_RM-J003.MID aligned_RM-J003.MID

*Note: I added this midi_audio_file flag to skip fluidsynth and use a pre-rendered MIDI file.

Stacktrace:

(venv) ace01@wynne:/import/c4dm-datasets/PiJAMA/scripts$ cat align_3.out
nohup: ignoring input
align_midi.py:57: FutureWarning: Pass sr=22050, hop_length=64 as keyword args. From version 0.10 passing these as positional arguments will result in an error
  times = librosa.frames_to_time(np.arange(cqt.shape[0]), fs, hop)
/import/c4dm-datasets/PiJAMA/venv/lib/python3.7/site-packages/pretty_midi/pretty_midi.py:1043: UserWarning: original_times must be strictly increasing; automatically enforcing this.
  warnings.warn('original_times must be strictly increasing; '
Loading /import/c4dm-datasets/RWC/Jazz_Music/3.WAV ...
Loading norm_RM-J003.MID ...
Aligning /import/c4dm-datasets/RWC/Jazz_Music/3.WAV to norm_RM-J003.MID ...
Writing aligned_RM-J003.MID ...
Traceback (most recent call last):
  File "align_midi.py", line 150, in <module>
    midi_object.write(parameters['output_file'])
  File "/import/c4dm-datasets/PiJAMA/venv/lib/python3.7/site-packages/pretty_midi/pretty_midi.py", line 1317, in write
    tempo=int(6e7/(60./(tick_scale*self.resolution)))))
  File "/import/c4dm-datasets/PiJAMA/venv/lib/python3.7/site-packages/mido/midifiles/meta.py", line 469, in __init__
    self._setattr(name, value)
  File "/import/c4dm-datasets/PiJAMA/venv/lib/python3.7/site-packages/mido/midifiles/meta.py", line 501, in _setattr
    spec.check(name, value)
  File "/import/c4dm-datasets/PiJAMA/venv/lib/python3.7/site-packages/mido/midifiles/meta.py", line 294, in check
    check_int(value, 0, 0xffffff)
  File "/import/c4dm-datasets/PiJAMA/venv/lib/python3.7/site-packages/mido/midifiles/meta.py", line 145, in check_int
    raise ValueError('attribute must be in range {}..{}'.format(low, high))
ValueError: attribute must be in range 0..16777215

This also happened with song number 5 from the dataset.

The related files to reproduce are here:
https://drive.google.com/drive/folders/1EdDkaJZ1zMNCi9TmZxcl0tEmnufxxmTx?usp=sharing

Hm, I don't think you should attempt to use the provided align_midi example (which, to be up front, is a usage example - not a part of the library) at such a fine resolution. I don't think the algorithm is accurate enough to provide any more fine-grained alignment. The issue you're facing is actually separate: for the MIDI file in question, int(6e7/(60./(tick_scale*self.resolution))) evaluates to a larger number than MIDI can store. I vaguely recall the RWC files as having a ridiculously fine-grained resolution, maybe that's the issue? Otherwise this would only happen if the tempo was anomalous, which I suppose is possible if the MIDI alignment code glitches. It might be worth checking what tick_scale (which is computed based on the tempo) and resolution (which is typically something like 220 for a sane MIDI file) are when this happens.
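For reference, the expression from the traceback reduces to roughly 1e6 * tick_scale * resolution, i.e. microseconds per beat, and mido rejects anything above 0xffffff. The following is a minimal sketch of that arithmetic; the helper names midi_tempo and max_tick_scale are hypothetical and not part of pretty_midi or mido.

```python
# Upper bound enforced by mido in the traceback: check_int(value, 0, 0xffffff).
MAX_MIDI_TEMPO = 0xFFFFFF  # 16777215 microseconds per beat


def midi_tempo(tick_scale, resolution):
    """Hypothetical helper mirroring the expression shown in the traceback.

    tick_scale is seconds per tick, resolution is ticks per beat, and the
    result is the tempo in microseconds per beat.
    """
    return int(6e7 / (60.0 / (tick_scale * resolution)))


def max_tick_scale(resolution):
    """Largest tick_scale (seconds per tick) whose tempo still fits the limit."""
    return MAX_MIDI_TEMPO / (1e6 * resolution)


# A 120 BPM file at 480 ticks per beat has tick_scale = 0.5 / 480.
print(midi_tempo(0.5 / 480, 480))

# At resolution 480 the limit is reached near 0.035 seconds per tick,
# which corresponds to a beat lasting almost 17 seconds.
print(max_tick_scale(480))
```

Under this reading, a resolution of a few hundred ticks per beat only overflows when a tick lasts tens of milliseconds, which points at an anomalously slow tempo rather than at the resolution itself.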

Thank you @craffel.

I appreciate that the example is merely that, and not part of the library. My motivation for using this example was actually Google Magenta's MAESTRO publication, where they mention using the same algorithm (with a more efficient DTW) to get 3-millisecond fine-alignment. [1] I should probably mention how well it worked for the cases where it didn't hit this bug 👏 Here is an example of it getting incredibly good alignment on the first RWC jazz piano song.

I vaguely recall the RWC files as having a ridiculously fine-grained resolution ... what tick_scale and resolution

I created new MIDI files from the raw RWC MIDI. And I used default values in mido.MidiFile.

>>> p = pretty_midi.PrettyMIDI("norm_RM-J003.MID")
>>> p.resolution
480
>>> p._tick_scales
[(0, 0.0010416666666666667)]

The tick scale error happens in PrettyMIDI.write, not during initialization, so it is a result of the tick scales computed as part of adjust_times.
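One way to narrow this down is to scan the tick scales just before writing and list the entries whose tempo would not fit. This is only a sketch: find_unwritable_tick_scales is a hypothetical helper, the second input below is made up for illustration, and it assumes positive tick scales stored as (tick, seconds_per_tick) tuples like the _tick_scales value printed above.

```python
MAX_MIDI_TEMPO = 0xFFFFFF  # limit from mido's check_int(value, 0, 0xffffff)


def find_unwritable_tick_scales(tick_scales, resolution):
    """Return (tick, tick_scale, tempo) for entries that exceed the MIDI limit.

    tick_scales is a list of (tick, seconds_per_tick) tuples and resolution
    is in ticks per beat. The tempo uses the expression from the traceback.
    """
    bad = []
    for tick, tick_scale in tick_scales:
        tempo = int(6e7 / (60.0 / (tick_scale * resolution)))
        if not 0 <= tempo <= MAX_MIDI_TEMPO:
            bad.append((tick, tick_scale, tempo))
    return bad


# The value from the session above (before alignment) is writable.
print(find_unwritable_tick_scales([(0, 0.0010416666666666667)], 480))

# Made-up example: a second entry of 0.05 s per tick is 24 s per beat,
# which is beyond what a MIDI tempo can represent.
print(find_unwritable_tick_scales(
    [(0, 0.0010416666666666667), (960, 0.05)], 480))
```

With a loaded object, the same check could be run as find_unwritable_tick_scales(midi_object._tick_scales, midi_object.resolution) after adjust_times and before write. Note that _tick_scales is a private attribute and may change between versions.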

[1] https://arxiv.org/pdf/1810.12247.pdf

After the initial alignment and segmentation, we applied Dynamic Time Warping (DTW) to account for any jitter in either the audio or MIDI recordings. DTW has seen wide use in audio-to-MIDI alignment; for an overview see Müller (2015). We follow the align_midi example from pretty_midi (Raffel & Ellis, 2014), except that we use a custom C++ DTW implementation for improved speed and memory efficiency to allow for aligning long sequences. First, in Python, we use librosa to load the audio and resample it to a 22,050Hz mono signal. Next, we load the MIDI and synthesize it at the same sample rate, using the same FluidSynth process as above. Then, we pad the end of the shorter of the two sample arrays so they are the same length. We use the same procedure as align_midi to extract CQTs from both sample arrays, except that we use a hop length of 64 to achieve a resolution of ∼3ms.
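The ∼3ms figure in the quoted passage follows from the hop length and sample rate. Here is a small sketch of that arithmetic; frame_period_ms is a hypothetical helper, not a librosa or pretty_midi function.

```python
SAMPLE_RATE = 22050  # Hz, the sample rate named in the quoted passage


def frame_period_ms(hop_length, sample_rate=SAMPLE_RATE):
    """Time between consecutive CQT frames, in milliseconds."""
    return 1000.0 * hop_length / sample_rate


# Hop length used in this issue: about 2.9 ms per frame.
print(round(frame_period_ms(64), 2))

# Default hop length of 512 mentioned at the top: about 23.2 ms per frame.
print(round(frame_period_ms(512), 2))
```

So a hop of 64 gives eight times as many frames as the default for the same audio, and the DTW cost matrix grows with the square of that factor.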

Yeah, I'm not sure why int(6e7/(60./(tick_scale*self.resolution))) would end up being so large - if self.resolution is 480, then it must be that tick_scale is particularly huge (e.g. because of an anomalous tempo produced by the alignment example). The limitation on the attribute range is fundamental to MIDI, so unless you find another cause (i.e. one within pretty_midi and outside of the MIDI alignment example), I think everything is working as intended.

Sounds good. I'll close this and explore a custom alignment implementation for my constant tempo MIDI files. I'll also see if reducing the resolution is an immediate workaround.

Thank you!