Reproduce SD (& other models') results
zwcolin opened this issue · 2 comments
Hello,
I was wondering if there's any way to reproduce the results in your main table? I didn't find any information about seed/generator usage in the codebase or the report, so I'd really appreciate it if you could provide some insight into reproducing these numbers. Thanks!
Also, a side question: for all the SD results mentioned in the codebase and the report, are you using Stable Diffusion 1.4 or 1.5?
Hi,
we directly used the open-source versions of the models we tested in our benchmark. For SD, we used version 1.4 with the default parameters from https://github.com/CompVis/stable-diffusion/blob/main/scripts/txt2img.py
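For anyone trying to regenerate comparable images, here is a minimal sketch using the diffusers port of SD 1.4 with settings that approximate the txt2img.py defaults (50 sampling steps, guidance scale 7.5, 512x512). Note that our images were produced with the original CompVis scripts, and the prompt and seed below are illustrative only, not the ones used in the paper:

```python
# Sketch only: diffusers-based approximation of the CompVis txt2img.py defaults.
# The prompt and seed are examples, not the settings used for the benchmark runs.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

prompt = "a cat to the left of a dog"          # example SR2D-style prompt
generator = torch.Generator("cuda").manual_seed(42)  # illustrative seed only

image = pipe(
    prompt,
    num_inference_steps=50,   # matches the default ddim_steps in txt2img.py
    guidance_scale=7.5,       # matches the default scale in txt2img.py
    height=512,
    width=512,
    generator=generator,
).images[0]
image.save("sd14_sample.png")
```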
Note that we have provided the outputs of our object detector in ./objdet_results. All of our generated images are available here: https://huggingface.co/datasets/tgokhale/sr2d_visor/tree/main
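If it helps, the full image set can be pulled locally with something like the snippet below (a sketch using huggingface_hub; the local directory name is arbitrary):

```python
# Sketch: download the released sr2d_visor images from the Hugging Face Hub.
# Requires `pip install huggingface_hub`; local_dir is just an example path.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="tgokhale/sr2d_visor",
    repo_type="dataset",      # the images are hosted as a dataset repo
    local_dir="sr2d_visor",   # arbitrary destination folder
)
```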
Happy to provide more information if needed.
Thanks for the response and the additional information!
Another question we had: how accurate is the detector on ground-truth images? That is, with the threshold set to 0.1, do you provide a VISOR reference for ground-truth image-prompt (or image-caption) pairs, e.g., for all captions involving spatial relationships from the COCO dataset?