Evaluation Performance
Closed this issue · 4 comments
Hi,
Thanks for sharing your work.
When I use the pre-trained model checkpoint from your repo (https://github.com/qiuqiangkong/piano_transcription_inference),
I can't reproduce the scores reported in the original paper.
I set the arguments as below.
The results I obtained were much lower.
Can I reproduce the paper's scores with this source code, or do I need to change something?
(https://github.com/bytedance/piano_transcription/blob/master/pytorch/calculate_score_for_paper.py)
Thank you.
Never mind, it turns out the performance matches after all. Thank you.
Hey, I encountered the same problem. May I know what you did to achieve the same results as the paper?
Hi,
If you look closely at the given evaluation code, you'll see it uses mir_eval.transcription.precision_recall_f1_overlap for the Note F1 score.
However, that function computes the Note w/ Offset F1 score, not the onset-only Note F1 score.
If you want the Note F1 score, you should use mir_eval.transcription.onset_precision_recall_f1 instead.
(Please refer to the following link: https://github.com/craffel/mir_eval/blob/main/mir_eval/transcription.py)
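To make the difference concrete, here is a simplified, self-contained sketch of what the two metrics measure. It is not mir_eval's actual implementation (mir_eval uses an optimal bipartite matching and configurable tolerances); the greedy matching, the 50 ms onset tolerance, and the max(50 ms, 20% of duration) offset tolerance below are illustrative assumptions chosen to mirror the common defaults.

```python
def f1_score(ref, est, use_offset):
    """Toy note-matching F1. ref/est: lists of (onset, offset, midi_pitch).

    use_offset=False approximates the onset-only Note F1;
    use_offset=True approximates the Note w/ Offset F1.
    Greedy matching is used here for brevity (mir_eval matches optimally).
    """
    matched = 0
    used = set()
    for r_on, r_off, r_pitch in ref:
        for j, (e_on, e_off, e_pitch) in enumerate(est):
            if j in used or e_pitch != r_pitch:
                continue
            if abs(e_on - r_on) > 0.05:  # 50 ms onset tolerance
                continue
            if use_offset:
                # Offset tolerance: max(50 ms, 20% of reference duration)
                tol = max(0.05, 0.2 * (r_off - r_on))
                if abs(e_off - r_off) > tol:
                    continue
            used.add(j)
            matched += 1
            break
    precision = matched / len(est) if est else 0.0
    recall = matched / len(ref) if ref else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


ref = [(0.00, 1.00, 60), (1.00, 2.00, 64)]
est = [(0.01, 1.50, 60),   # onset correct, offset far off
       (1.01, 2.02, 64)]   # onset and offset both correct

print(f1_score(ref, est, use_offset=False))  # onset-only Note F1 -> 1.0
print(f1_score(ref, est, use_offset=True))   # Note w/ Offset F1  -> 0.5
```

A note whose onset is accurate but whose offset is not counts as correct under the onset-only metric and as incorrect under the with-offset metric, which is why precision_recall_f1_overlap reports lower numbers than onset_precision_recall_f1 on the same predictions.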
Thank you very much!