Evaluation pipeline
SnowdenLee opened this issue · 6 comments
Hi,
thanks for the interesting work. I'm wondering if you could please share the code for your evaluation, e.g., the different similarity evaluations. Thanks a lot!
+1 The work is very interesting. It would be nice if you could include the evaluation pipeline for reproducibility. Thanks!
Hi @SnowdenLee, and @korawat-tanwisuth, thanks for your interest!
The evaluation code is now under the metrics folder, which includes a script for the CLIP metrics and a script for the BLIP metrics.
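In case it helps while the scripts land, here is a minimal sketch of how a CLIP text-image similarity score can be computed, assuming the Hugging Face transformers CLIP implementation; the checkpoint, image path, and prompt below are illustrative and may not match the repo's scripts exactly:

```python
# Minimal sketch of a CLIP text-image similarity score (illustrative, not the
# exact repo script); assumes the Hugging Face `transformers` CLIP implementation.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

device = "cuda" if torch.cuda.is_available() else "cpu"
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch16").to(device)
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch16")

@torch.no_grad()
def clip_similarity(image_path: str, prompt: str) -> float:
    """Cosine similarity between the CLIP image embedding and text embedding."""
    image = Image.open(image_path).convert("RGB")
    inputs = processor(text=[prompt], images=image, return_tensors="pt", padding=True).to(device)
    image_emb = model.get_image_features(pixel_values=inputs["pixel_values"])
    text_emb = model.get_text_features(input_ids=inputs["input_ids"],
                                       attention_mask=inputs["attention_mask"])
    image_emb = image_emb / image_emb.norm(dim=-1, keepdim=True)
    text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)
    return (image_emb @ text_emb.T).item()

# Example usage (paths/prompts are placeholders):
# score = clip_similarity("outputs/a_cat_and_a_dog/0.png", "a cat and a dog")
```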
Hi @AttendAndExcite,
could you maybe provide a list of the prompts you used? I know you mentioned the templates in the paper, but it would be good to have the exact prompts with the combinations of objects/animals/colors. Thanks a lot!
Sure. You can find all the prompts in the attached file; it's a dictionary containing a list of prompts for each of the three categories.
a&e_prompts.txt
The sets of all possible animals/objects/colors are specified in the supplementary material, but these are the exact prompts we used in the paper.
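A quick sketch for loading the attached file, assuming it stores the dictionary as a Python/JSON-style literal mapping each category name to its list of prompts; the key names in the comment are placeholders and may differ from the actual file:

```python
# Sketch for loading the attached prompt file; assumes the file contains a
# Python/JSON-style dict literal, e.g. {"animals": [...], "objects": [...], ...}.
import ast

with open("a&e_prompts.txt", "r") as f:
    prompts = ast.literal_eval(f.read())  # {category: [prompt, ...]}

for category, prompt_list in prompts.items():
    print(category, len(prompt_list))
```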
Can I ask which 30 of these prompts you used for the human evaluation?
We randomly chose 10 prompts from each of the 3 subsets. Unfortunately, we did not save the exact prompts used for the evaluation study.
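For a comparable (but not identical) selection, a seeded re-draw of 10 prompts per category could look like the sketch below; the seed is arbitrary and, since the original selection was not saved, this will not reproduce the exact study prompts:

```python
# Sketch of a seeded re-draw of 10 prompts per category for a study of the same
# size; the original selection was not saved, so this will not match it exactly.
import ast
import random

with open("a&e_prompts.txt", "r") as f:
    prompts = ast.literal_eval(f.read())  # {category: [prompt, ...]}

random.seed(0)  # arbitrary seed, only makes this particular re-draw repeatable
human_eval_prompts = {
    category: random.sample(prompt_list, k=10)
    for category, prompt_list in prompts.items()
}
```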