Inaccurate pitch_tolerance in transcription.precision_recall_f1_overlap

Question

ax-le opened this issue 6 years ago · 3 comments

In my examples, computing transcription.precision_recall_f1_overlap([...], pitch_tolerance = 30) gives lower scores than the default, transcription.precision_recall_f1_overlap([...], pitch_tolerance = 50). (For example, the F-measure is 0.379 with pitch_tolerance set to 30 and 0.396 with pitch_tolerance set to 50.)
However, my estimated pitches are MIDI-scale integers, as is my ground truth. The smallest nonzero gap between an estimated pitch and the ground truth is therefore one semitone, or 100 cents.
Hence, any tolerance smaller than 100 cents should give the same scores.
I strongly suspect that a rounding operation is distorting the pitch comparison.

Answer 1 · 2019-04-12T14:27:58.000Z

Thanks for reporting this. Can you provide a MWE or example files which reproduce the issue?

Answer 2 · 2019-04-15T09:09:11.000Z

Sure, here are two .txt files, reference.txt and estimation.txt, that reproduce the issue. In each file, the first and second columns contain the onset and offset times, respectively, and the third column contains the pitches. In my tests, offset times were ignored.

Answer 3 · 2019-04-15T15:25:30.000Z

Sorry, I missed an important detail in your first message. You wrote

my estimated pitches are midi-scale integers

That's the wrong format for transcription.precision_recall_f1_overlap. The docstring clearly says

Array of estimated pitch values in Hertz
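
To illustrate why the tolerance changes the scores, here is a minimal sketch. It assumes the pitch distance is measured in cents as 1200*log2(f_est/f_ref). The helper name cents_between is hypothetical and not part of the library. When MIDI note numbers are read as Hz, neighbouring notes end up far closer than 100 cents apart, so tolerances of 30 and 50 can accept different matches.

```python
import math


def cents_between(f_ref, f_est):
    """Absolute distance in cents between two frequencies given in Hz."""
    return abs(1200.0 * math.log2(f_est / f_ref))


# MIDI note numbers mistakenly interpreted as frequencies in Hz.
gap_high = cents_between(60.0, 61.0)  # about 28.6 cents: within both tolerances
gap_low = cents_between(40.0, 41.0)   # about 42.7 cents: within 50 but not 30

print(f"60 vs 61 read as Hz: {gap_high:.1f} cents")
print(f"40 vs 41 read as Hz: {gap_low:.1f} cents")
```

Under that assumption, a one-semitone error near note 40 counts as a match at 50 cents but not at 30, which is consistent with the lower F-measure reported for pitch_tolerance = 30.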

You can convert your MIDI pitches to Hz using pretty_midi.note_number_to_hz, librosa.midi_to_hz, or simply 440.0*(2.0**((note_number - 69)/12.0)).
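
As a minimal sketch of that conversion using only the formula above (the function name midi_to_hz and the sample pitch list are illustrative assumptions, not taken from the attached files):

```python
import math


def midi_to_hz(note_number):
    """Convert a MIDI note number to a frequency in Hz (A4 = note 69 = 440 Hz)."""
    return 440.0 * (2.0 ** ((note_number - 69) / 12.0))


# Hypothetical MIDI pitches standing in for the third column of the files.
midi_pitches = [60, 61, 69]
hz_pitches = [midi_to_hz(n) for n in midi_pitches]

# After conversion, adjacent semitones are 100 cents apart, as expected.
semitone_cents = 1200.0 * math.log2(hz_pitches[1] / hz_pitches[0])

print(hz_pitches)
print(f"{semitone_cents:.1f} cents between notes 60 and 61")
```

With pitches converted this way for both the reference and the estimate, any tolerance below 100 cents should treat integer MIDI pitches identically.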