Evaluation Protocol
MoayedHajiAli opened this issue · 2 comments
MoayedHajiAli commented
Hello,
I noticed in your code that n_candidate_per_text is set to 3 by default. Was that setting used during the evaluation? It is not mentioned in the paper.
Additionally, which CLAP backbone did you use to calculate the CLAP score when comparing with other methods such as Make-an-Audio and audio-ldm? The scale of the numbers is very different. Thank you for your help.
Best,
soujanyaporia commented
Hi,
For evaluation, we neither generated 3 samples per text nor selected the best one. The 630k-audioset-best CLAP checkpoint was used to report the scores.
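To make the distinction concrete, here is a minimal sketch of why it matters whether candidates are reranked before scoring. It is not the repository's evaluation code: the function names (cosine_similarity, clap_score_single, clap_score_best_of_n) and the toy 2-D embeddings are hypothetical, and it assumes a CLAP score is the cosine similarity between a text embedding and an audio embedding. In practice the embeddings would come from a CLAP checkpoint.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def clap_score_single(text_emb, audio_embs):
    """Score only the first generated sample (no reranking)."""
    return cosine_similarity(text_emb, audio_embs[0])


def clap_score_best_of_n(text_emb, audio_embs):
    """Score the candidate most similar to the text (best-of-n reranking)."""
    return max(cosine_similarity(text_emb, a) for a in audio_embs)


# Toy embeddings (hypothetical): one text prompt, three generated candidates.
text_emb = [1.0, 0.0]
candidates = [[0.6, 0.8], [1.0, 0.0], [0.0, 1.0]]

single = clap_score_single(text_emb, candidates)
best = clap_score_best_of_n(text_emb, candidates)
print(f"single sample: {single:.3f}, best of 3: {best:.3f}")
```

Best-of-n can never score lower than a single sample, so numbers are only comparable across methods when they use the same selection protocol and the same CLAP checkpoint.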
MoayedHajiAli commented
Thank you for your response!