bytedance/piano_transcription

Evaluation Performance

Closed this issue · 4 comments

Hi,

Thanks for sharing your work.

When I use the pre-trained model checkpoint from your repo (https://github.com/qiuqiangkong/piano_transcription_inference),
I can't reproduce the scores reported in the original paper.
[screenshot: evaluation results]

I set the arguments as shown below.

[screenshot: argument settings]

But the results I got were much lower.

Can I reproduce your paper's scores with this source code, or do I need to edit something?
(https://github.com/bytedance/piano_transcription/blob/master/pytorch/calculate_score_for_paper.py)

Thank you.

Oh, it turns out the performance does come out well. Thank you.

Hey, I encountered the same problem. May I know what you did to achieve the same performance results as in the paper?

Hi,
If you look closely at the provided evaluation code, it uses mir_eval.transcription.precision_recall_f1_overlap for the note F1-score.
However, that function computes the note-with-offset F1-score, not the plain note F1-score.
If you want the note F1-score, you should use mir_eval.transcription.onset_precision_recall_f1 instead.
(Please refer to the following link: https://github.com/craffel/mir_eval/blob/main/mir_eval/transcription.py)
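
Here is a minimal sketch of the difference between the two metrics on made-up toy notes (the intervals, pitches, and printed numbers below are purely illustrative; mir_eval expects intervals in seconds and pitches in Hz):

```python
import numpy as np
import mir_eval

# Toy data: two reference notes and two estimated notes.
# Intervals are (onset, offset) in seconds; pitches are in Hz.
ref_intervals = np.array([[0.0, 1.0], [1.0, 2.0]])
ref_pitches = np.array([440.0, 523.25])
est_intervals = np.array([[0.01, 0.60], [1.02, 2.01]])  # first offset is badly wrong
est_pitches = np.array([440.0, 523.25])

# Note w/ offset F1: a note counts as correct only if onset, pitch,
# AND offset all match within tolerance -> the first note fails here.
p, r, f, _ = mir_eval.transcription.precision_recall_f1_overlap(
    ref_intervals, ref_pitches, est_intervals, est_pitches)
print('Note w/ offset F1:', f)  # 0.5 on this toy data

# Onset-only F1: matching uses onsets alone, so both notes count.
p, r, f = mir_eval.transcription.onset_precision_recall_f1(
    ref_intervals, est_intervals)
print('Onset F1:', f)  # 1.0 on this toy data
```

Note that onset_precision_recall_f1 ignores pitch as well as offsets; if you want onset + pitch matching, precision_recall_f1_overlap also accepts offset_ratio=None, which disables the offset check.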

Thank you very much!