bytedance/piano_transcription

Logical bug in MaestroDataset

In utils/data_generator.py, line 86, the segment is shifted back by one segment length if it would run past the end of the waveform:

        # Load hdf5
        with h5py.File(hdf5_path, 'r') as hf:
            start_sample = int(start_time * self.sample_rate)
            end_sample = start_sample + self.segment_samples

            if end_sample >= hf['waveform'].shape[0]:
                start_sample -= self.segment_samples 
                end_sample -= self.segment_samples

However, start_time is not updated along with start_sample and end_sample, so the target_dict built later from start_time is offset from the audio segment by self.segment_seconds.

            # Process MIDI events to target
            (target_dict, note_events, pedal_events) = \
                self.target_processor.process(start_time, midi_events_time, 
                    midi_events, extend_pedal=True, note_shift=note_shift)

I don't think this causes a problem in practice, because your Sampler logic only constructs meta for segments that fit inside the recording, so the branch above should not normally be reached:

while (start_time + self.segment_seconds < hf.attrs['duration']):

but it is still a logical error so I thought I would report and offer a fix.
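
A possible fix, as a minimal sketch rather than a tested patch: shift start_time together with the sample indices so the MIDI targets stay aligned with the audio. The helper name clamp_segment and its arguments are hypothetical (they are not in the repository), and the sketch assumes segment_seconds equals segment_samples / sample_rate.

```python
def clamp_segment(start_time, sample_rate, segment_samples, num_samples):
    """Return (start_time, start_sample, end_sample), kept mutually consistent.

    Hypothetical helper for illustration; num_samples stands in for
    hf['waveform'].shape[0].
    """
    start_sample = int(start_time * sample_rate)
    end_sample = start_sample + segment_samples

    if end_sample >= num_samples:
        # Shift the window back by one segment, as the existing code does.
        start_sample -= segment_samples
        end_sample -= segment_samples
        # Also shift start_time so the targets are computed for the same window.
        # Assumes segment_seconds == segment_samples / sample_rate.
        start_time -= segment_samples / sample_rate

    return start_time, start_sample, end_sample
```

In __getitem__, the returned start_time would then be the value passed to self.target_processor.process(...), in place of the unshifted one.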