How do you compute FID?
xml94 opened this issue · 3 comments
To reproduce your results, there is one thing I am not sure about; could you please help me confirm it?
For the Cityscapes dataset, I take the data from https://github.com/NVlabs/SPADE and compute FID with [pytorch-fid](https://github.com/mseitzer/pytorch-fid). The synthesized images, of course, come from your code.
But how about the real images?
- Resize: should the real images be resized to the same size as the synthesized ones (256 × 512), using nearest-neighbor downsampling?
- Which real images: only the val images from Cityscapes (gtFine/val), or all images (gtFine/val, gtFine/test, gtFine/train)?
Thanks a lot.
Hi,
Thanks for asking. The real images should come from the "val" set. To obtain them, collect all images from gtFine/val; this should result in 500 images.
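In case it is useful, here is a minimal sketch of this collection step. The dataset root, the leftImg8bit location of the RGB frames, and the output folder name are assumptions about a standard Cityscapes download, so adjust them to your setup:

```python
from pathlib import Path
import shutil

# Assumed standard Cityscapes layout: the RGB frames matching gtFine/val
# live under leftImg8bit/val/<city>/<frame>_leftImg8bit.png.
val_dir = Path("datasets/cityscapes/leftImg8bit/val")  # placeholder root
out_dir = Path("fid_real")                             # flat folder for FID
out_dir.mkdir(parents=True, exist_ok=True)

files = sorted(val_dir.glob("*/*_leftImg8bit.png"))
for f in files:
    shutil.copy(f, out_dir / f.name)

print(f"collected {len(files)} images")  # should print 500 for the val split
```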
Note that our test script produces exactly 500 generated images, as they are conditioned on the same label maps as the test images.
Test images should indeed be resized to 256×512. In our dataloaders/CityscapesDataset.py we do the resizing via bicubic interpolation:

`image = TR.functional.resize(image, (new_width, new_height), Image.BICUBIC)`

so I would advise using this interpolation. That said, since you only need to downsample by a factor of 2, the interpolation scheme should not play a big role.
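For completeness, a small sketch of applying the same bicubic downsampling to the collected real images; the folder names are the placeholders from the snippet above, and I use PIL's resize directly, which takes (width, height):

```python
from pathlib import Path
from PIL import Image

in_dir = Path("fid_real")             # collected val images (placeholder)
out_dir = Path("fid_real_256x512")
out_dir.mkdir(parents=True, exist_ok=True)

for f in sorted(in_dir.glob("*.png")):
    img = Image.open(f).convert("RGB")
    # PIL's resize expects (width, height), so 256x512 becomes (512, 256)
    img.resize((512, 256), Image.BICUBIC).save(out_dir / f.name)
```

With both folders of 500 images in place, pytorch-fid can then be run directly on them, e.g.:

```
python -m pytorch_fid fid_real_256x512 path/to/your/generated_images
```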
@SushkoVadim
Thanks for your reply.
I think you were talking about 'val' in gtFine, not 'test'. I checked your code.
Sorry, I meant the "val" set, that's correct.
Thanks for pointing that out!
I will update my previous message.