This ViT implementation as generative network

Question

MrCorsair3 opened this issue 10 months ago · 1 comments

First of all, thank you for the great work on this repository!

I would like to use the ViT implementation presented in this repository to generate images in a supervised training process.

The ViT implementation presented in this repository provides a very clear API. However, as far as I understand, these implementations are mainly intended for classification.

Let's assume that I would like to use the presented ViT as the generator in a GAN (without going into details) to generate a 224x224 image from another input image of the same size. Can I use the presented API to build an appropriate architecture?

Thank you in advance
Best regards

Answer 1 · 2023-09-04T14:35:06.000Z

@MrCorsair3 there was a lot of interest in introducing attention to GANs, before DDPMs swept the field away

would start with https://arxiv.org/abs/2107.04589
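
For the image-to-image case in the question, the main structural difference from a classifier is the output side: instead of pooling the tokens into one vector for a class head, a generator would typically keep all per-patch tokens and map each one back to a patch of pixels. Below is a minimal, dependency-free sketch of just that bookkeeping. It is not code from this repository or from the linked paper; `patchify`, `unpatchify`, and the 16x16 patch size are hypothetical choices for illustration, and the transformer itself is left out.

```python
def patchify(image, patch_size):
    """Split an H x W x C image (nested lists) into a row-major list of flat patch tokens."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be divisible by patch_size")
    tokens = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            token = []
            for y in range(top, top + patch_size):
                for x in range(left, left + patch_size):
                    token.extend(image[y][x])  # append this pixel's channels
            tokens.append(token)
    return tokens


def unpatchify(tokens, height, width, patch_size, channels):
    """Inverse of patchify: place each flat token back at its grid position."""
    image = [[None] * width for _ in range(height)]
    patches_per_row = width // patch_size
    for index, token in enumerate(tokens):
        top = (index // patches_per_row) * patch_size
        left = (index % patches_per_row) * patch_size
        for offset in range(patch_size * patch_size):
            y = top + offset // patch_size
            x = left + offset % patch_size
            image[y][x] = list(token[offset * channels:(offset + 1) * channels])
    return image


# 224x224 RGB input with 16x16 patches (the patch size is an assumed value).
image = [[[y, x, (y + x) % 256] for x in range(224)] for y in range(224)]
tokens = patchify(image, 16)                    # one token per patch
restored = unpatchify(tokens, 224, 224, 16, 3)  # back to H x W x C
print(len(tokens), len(tokens[0]))
```

In a real generator the tokens would pass through the transformer blocks between the two calls, and a per-token linear layer would project each output embedding to `patch_size * patch_size * channels` values before the `unpatchify` step, so the output image has the same 224x224 size as the input.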