Inaccurate pitch_tolerance in transcription.precision_recall_f1_overlap
ax-le opened this issue · 3 comments
Computing transcription.precision_recall_f1_overlap([...], pitch_tolerance = 30) gives lower scores than the default pitch_tolerance, i.e. transcription.precision_recall_f1_overlap([...], pitch_tolerance = 50), in my examples. (For example, the F-measure is 0.379 with pitch_tolerance set to 30 and 0.396 with pitch_tolerance set to 50.)
However, my estimated pitches are MIDI-scale integers, as is my ground truth. The smallest nonzero gap between an estimated pitch and the ground truth is therefore one semitone, or 100 cents.
Hence, a tolerance smaller than 100 cents shouldn't affect the statistical outputs.
I strongly suspect that a rounding operation is distorting the pitch comparison.
Thanks for reporting this. Can you provide a MWE or example files which reproduce the issue?
Sure, here are two .txt files, reference.txt and estimation.txt, which reproduce the issue. In each file, the first and second columns contain the onset and offset times respectively, and the third column contains the pitches. In my tests, offset times were ignored.
Sorry, I missed an important detail in your first message. You wrote
my estimated pitches are midi-scale integers
That's the wrong format for transcription.precision_recall_f1_overlap. The docstring clearly says
Array of estimated pitch values in Hertz
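To see why a tolerance below 100 cents can still change the scores when MIDI numbers are passed unconverted, here is a small sketch. It assumes the pitch distance is measured in cents as 1200 * |log2(f_est / f_ref)|; the helper cents_between and the note numbers are hypothetical, chosen only for illustration.

```python
import math


def cents_between(f_ref: float, f_est: float) -> float:
    """Absolute distance in cents between two frequencies given in Hz."""
    return abs(1200.0 * math.log2(f_est / f_ref))


# Hypothetical note numbers one semitone apart, passed unconverted,
# so the comparison treats them as if they were frequencies in Hz.
low_pair = cents_between(40.0, 41.0)   # roughly 42.7 cents
high_pair = cents_between(60.0, 61.0)  # roughly 28.6 cents

for tolerance in (30, 50):
    # Show which of the two pairs would count as a pitch match.
    print(tolerance, low_pair <= tolerance, high_pair <= tolerance)
```

Under that assumption, neighbouring note numbers read as Hz are much closer than 100 cents, and how many of them fall inside the tolerance depends on whether it is 30 or 50.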
You can convert from your MIDI pitches to Hz using either pretty_midi.note_number_to_hz or librosa.midi_to_hz or just 440.0*(2.0**((note_number - 69)/12.0)).
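As a minimal sketch of the last option, the formula can be applied to a whole pitch column at once with NumPy. The function name midi_to_hz and the sample note numbers are placeholders for illustration, not data from the attached files.

```python
import numpy as np


def midi_to_hz(note_numbers):
    """Convert MIDI note numbers to Hz using A4 = note 69 = 440 Hz."""
    notes = np.asarray(note_numbers, dtype=float)
    return 440.0 * (2.0 ** ((notes - 69.0) / 12.0))


# Hypothetical MIDI pitches: C4, C#4 and A4.
midi_pitches = np.array([60, 61, 69])
hz = midi_to_hz(midi_pitches)

# After conversion, adjacent semitones are 100 cents apart.
cents = 1200.0 * np.log2(hz[1] / hz[0])
print(hz, cents)
```

The converted arrays would then be passed as the reference and estimated pitch arguments in place of the raw MIDI numbers.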