YIN vs McLeod pitch detection accuracy
apaatsio opened this issue · 7 comments
Are there benchmarks of the pitch detection accuracy of the YIN and McLeod algorithms? Is there any significant difference?
It's a good question. You could probably modify https://raw.githubusercontent.com/sevagh/pitch-detection/master/test/test_instruments.cpp to compare YIN and MPM side-by-side.
Personally, since figuring out how to do YinFFT (and thus bringing the performance in line with MPM), I prefer YIN due to the low pitch cutoff for MPM:
- https://github.com/sevagh/pitch-detection/blob/master/src/pitch_detection_priv.h#L7
- https://github.com/sevagh/pitch-detection/blob/master/src/mpm.cpp#L84
Simply put, MPM doesn't work well for low pitches under 80 Hz; see, e.g., #50.
I don't remember why I picked the value 80Hz. Maybe from real testing, or maybe using the guitar low E (~82.4Hz) as a threshold.
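As a quick sanity check on the guitar low E explanation, the equal-tempered frequency of E2 can be computed from its MIDI note number. This is only a sketch: the A4 = 440 Hz tuning reference, the MIDI numbering, and the helper names are assumptions made for illustration, and the 80 Hz constant is just the value mentioned above, not read from the library source.

```python
import math

A4_HZ = 440.0          # assumed tuning reference (not stated in the thread)
A4_MIDI = 69           # MIDI note number of A4
MPM_CUTOFF_HZ = 80.0   # cutoff value discussed above


def midi_to_hz(midi_note):
    """Equal-tempered frequency of a MIDI note number."""
    return A4_HZ * 2.0 ** ((midi_note - A4_MIDI) / 12.0)


def lowest_midi_note_above(freq_hz):
    """Smallest MIDI note whose frequency is at or above freq_hz."""
    return math.ceil(A4_MIDI + 12.0 * math.log2(freq_hz / A4_HZ))


E2_MIDI = 40  # guitar low E
e2_hz = midi_to_hz(E2_MIDI)
lowest_note = lowest_midi_note_above(MPM_CUTOFF_HZ)

print(f"E2 = {e2_hz:.2f} Hz")
print(f"lowest MIDI note above {MPM_CUTOFF_HZ} Hz: {lowest_note}")
```

Under these assumptions, E2 is the lowest equal-tempered note that clears an 80 Hz cutoff (the D#2 one semitone below falls under it), which is at least consistent with the guitar-based guess.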
Here are some newer methods I've seen (maybe time for a pitch-detection-v2 project?):
https://code.soundsoftware.ac.uk/projects/pyin
https://github.com/marl/crepe
I'm considering using https://github.com/EliosMolina/audio_degrader to run some real/degradation tests against YIN and McLeod to make them prove their worth.
Here's a sneak preview of a test using my (hopefully soon-to-be-released) audio-degradation-toolbox.
First I got a file (Flute, Db7) from http://theremin.music.uiowa.edu/MIS.html
I applied some degradations:
sevagh:audio-degradation-toolbox $ cat degradations.json
[
  {
    "name": "noise",
    "color": "white",
    "snr": 6
  },
  {
    "name": "noise",
    "color": "violet",
    "snr": 13
  },
  {
    "name": "mp3",
    "bitrate": 32
  },
  {
    "name": "mp3",
    "bitrate": 32
  },
  {
    "name": "mp3",
    "bitrate": 32
  }
]
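To give a feel for what a white-noise degradation with "snr": 6 might do, here is a minimal sketch that mixes Gaussian white noise into a test tone at a target signal-to-noise ratio. It is not the toolbox's implementation: the assumption that `snr` is expressed in dB, the power-ratio definition of SNR, and the function names `mean_power` and `add_white_noise` are all hypothetical. Violet noise and the mp3 passes are not covered.

```python
import math
import random


def mean_power(samples):
    """Average power (mean of squared samples)."""
    return sum(s * s for s in samples) / len(samples)


def add_white_noise(samples, snr_db, seed=0):
    """Add Gaussian white noise whose expected power gives the requested SNR.

    Assumes snr_db is a power ratio in decibels (an assumption, not
    confirmed by the thread).
    """
    rng = random.Random(seed)
    noise_power = mean_power(samples) / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]


SAMPLE_RATE = 44100   # matches the sample rate of the clips in this thread
TONE_HZ = 2217.46     # nominal Db7, used here as a synthetic stand-in

# One second of a pure sine as a stand-in for the flute clip.
clean = [math.sin(2.0 * math.pi * TONE_HZ * n / SAMPLE_RATE)
         for n in range(SAMPLE_RATE)]
noisy = add_white_noise(clean, snr_db=6)

# Recover the noise and measure the SNR that was actually achieved.
noise = [b - a for a, b in zip(clean, noisy)]
measured_snr_db = 10.0 * math.log10(mean_power(clean) / mean_power(noise))
print(f"measured SNR: {measured_snr_db:.2f} dB")
```

The measured value lands close to the requested 6 dB rather than exactly on it, because the noise is a finite random draw.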
Running pitch detection against the pure clip (converted from aiff to wav):
sevagh:pitch-detection $ ./wav_analyzer/wav_analyzer ~/repos/audio-degradation-toolbox/Flute-Db7-pure.wav
sample rate: 44100
len samples: 133120
frame size: 2
seconds: 3.01859
channels: 1
mpm: 2267.48
yin: 2268.88
pyin: 12422.3, 0.305056
pyin: 2268.88, 0.682327
pyin: 1133.89, 0.012614
pmpm: 2267.48, 1
A Db7 should be 2217.46 Hz, but the recording could be out of tune, I guess.
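To put the gap between the estimates and the nominal Db7 in musical terms, the deviation can be expressed in cents (1200 cents per octave, 100 per semitone). This is a small sketch using only the numbers quoted above; the function name `cents_off` is made up for illustration.

```python
import math


def cents_off(measured_hz, reference_hz):
    """Signed deviation of measured_hz from reference_hz in cents."""
    return 1200.0 * math.log2(measured_hz / reference_hz)


DB7_HZ = 2217.46  # nominal Db7 quoted in the thread

# Estimates from the clean-clip run above.
clean_estimates = {"mpm": 2267.48, "yin": 2268.88}

deviation_cents = {name: cents_off(hz, DB7_HZ)
                   for name, hz in clean_estimates.items()}

for name, cents in deviation_cents.items():
    print(f"{name}: {cents:+.1f} cents from Db7")
```

Both estimates come out roughly 40 cents sharp, which is less than half a semitone, so Db7 is still the nearest note either way.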
Finally, with the degradations applied:
sevagh:pitch-detection $ ./wav_analyzer/wav_analyzer ~/repos/audio-degradation-toolbox/Flute-Db7-degraded.wav
sample rate: 44100
len samples: 135983
frame size: 2
seconds: 3.08351
channels: 1
mpm: 2268.01
yin: 2273.61
pyin: 12790.1, 0.305056
pyin: 2273.61, 0.592254
pyin: 2.20599, 0.102687
pmpm: 2268.01, 1
MPM stays closer to what it calculated with the clean clip (2267.48 Hz vs. 2268.01 Hz). YIN strays further (2268.88 Hz vs. 2273.61 Hz).
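The drift can be computed directly from the two `wav_analyzer` runs. The sketch below copies only the `mpm` and `yin` lines from the outputs above (the `pyin` and `pmpm` candidates are left out) and subtracts them; `parse_estimates` is a hypothetical helper, not part of the project.

```python
# mpm/yin lines copied from the clean and degraded runs above.
CLEAN_OUTPUT = """\
mpm: 2267.48
yin: 2268.88
"""

DEGRADED_OUTPUT = """\
mpm: 2268.01
yin: 2273.61
"""


def parse_estimates(text):
    """Parse 'name: value' lines into a {name: float} dict."""
    estimates = {}
    for line in text.splitlines():
        name, _, value = line.partition(":")
        estimates[name.strip()] = float(value)
    return estimates


clean = parse_estimates(CLEAN_OUTPUT)
degraded = parse_estimates(DEGRADED_OUTPUT)

# Positive drift means the degraded estimate is higher than the clean one.
drift_hz = {name: degraded[name] - clean[name] for name in clean}

for name, drift in drift_hz.items():
    print(f"{name}: {drift:+.2f} Hz after degradation")
```

This is a single clip with a single set of degradations, so it only illustrates the comparison; it is not a benchmark.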
I'll be formalizing all of this in a shiny new mega test suite, hopefully with some head-to-head tables to put in the README.
YIN wins - check it out: https://github.com/sevagh/pitch-detection#degraded-audio-tests
Thank you for doing all of this.
It seems there is not much difference in accuracy. However, I am interested in the fact that YIN can detect lower frequencies more easily. I switched from YIN to MPM a long time ago, partly due to the lack of an FFT-based version at the time. Maybe it's time to revisit.