Generating images for a given class
LijaAlex12 opened this issue · 5 comments
Could you please shed some light on generating images for a given class using the pre-trained model?
Hi,
Could you please provide an example of what you mean by a "given class"?
Note that in this project we work with semantic image synthesis, so our pre-trained models are conditioned on semantic label maps (which have a class label for each pixel). Therefore, to generate an image, one needs a "given class" for each pixel.
In principle, it is possible to have all pixels conditioned on the same class (e.g., full image of "grass" or "sky" class), but I am not sure this is the output you desire.
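To make the "all pixels conditioned on the same class" idea concrete, here is a minimal sketch of building such a label map and one-hot encoding it per pixel. The class index and image size are placeholders, not values from the OASIS code:

```python
import numpy as np

# Hypothetical setup: COCO-Stuff-style class count, square image.
num_classes = 182
h, w = 256, 256
grass_class = 123  # placeholder index for a "grass"-like class

# Every pixel gets the same semantic class.
label_map = np.full((h, w), grass_class, dtype=np.int64)

# One-hot encode to (num_classes, H, W), the per-pixel conditioning
# format semantic synthesis generators typically expect.
one_hot = np.eye(num_classes, dtype=np.float32)[label_map].transpose(2, 0, 1)
```

Feeding such a uniform map to the generator would produce a full-frame texture of that single class.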
I actually meant: given an image as input, will it be able to produce similar images, assuming the dataset has only one class label? For example, if we input the image of a lion, will it be able to produce similar images (say, 5 of them) of a lion using the available pre-trained model? Also, will I be able to train and test using Google Colab?
What you describe is in principle possible with our pre-trained models. Note, however, that in order to generate new images, you will still need a label map to input to the generator.
Do you have a label map for your input image?
If yes, then you can sample new images directly with this label map as input, varying the 3D noise. In this case, the generator will produce different versions of the same scene layout, with different-looking objects of the same category.
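The "fixed label map, varied 3D noise" sampling loop can be sketched as below. The `generator` call is hypothetical and only shown as a comment; the noise dimensionality and shapes are assumptions, not the exact OASIS configuration:

```python
import numpy as np

rng = np.random.default_rng(0)
z_dim, h, w = 64, 256, 256  # assumed per-pixel noise channels and image size

# Draw several independent 3D noise tensors; the label map stays fixed.
noises = [rng.standard_normal((z_dim, h, w)).astype(np.float32)
          for _ in range(5)]

# Hypothetical usage with a loaded generator and one-hot label map:
# images = [generator(one_hot_label_map, z) for z in noises]
```

Each noise draw yields a different rendering of the same layout.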
If you have only the image (without the label map), then there is a possibility to infer it via our segmentation-based discriminator. This is described in section B.5 and Figure K in our paper. In this case, you will need to pass the image of interest through the discriminator network, obtain the label map prediction as the argmax among real classes (excluding the zero "fake" class), and then pass this estimate as input to the generator network. By varying the 3D noise, one can obtain different versions of the initial image.
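The argmax-over-real-classes step can be sketched as follows. The logits tensor here is a random stand-in for the discriminator's per-pixel output (channel 0 being the "fake" class is as described above; the class count and spatial size are placeholders):

```python
import numpy as np

rng = np.random.default_rng(1)
num_real_classes, h, w = 182, 8, 8  # small spatial size for the sketch

# Stand-in for per-pixel discriminator logits:
# channel 0 = "fake" class, channels 1..N = real semantic classes.
logits = rng.standard_normal((1 + num_real_classes, h, w)).astype(np.float32)

# Argmax over the real classes only, skipping the fake channel.
label_map = logits[1:].argmax(axis=0)  # per-pixel indices in [0, N)
```

The resulting `label_map` would then be one-hot encoded and fed to the generator as usual.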
As a final note, both described procedures would still work only with classes that were present in the training data. So it would not work out of the box with a lion, but it can be tried, for example, on "cat", "horse", "elephant", or any other class from the COCO-Stuff dataset.
To add to Vadim's answer, here's what it looks like when you generate several new images from a single image. Note that you don't need a label map for that, since the OASIS discriminator can extract labels itself.
From left to right: original image, discriminator label prediction, 3 newly synthesized images.
Thank you very much. It helped me a lot.