VQGAN-CLIP has been in vogue for generating art using deep learning. Searching the r/deepdream
subreddit for VQGAN-CLIP yields quite a number of results. In essence, VQGAN can generate high-fidelity images, while CLIP can judge how well an image matches a text description. Combined, VQGAN-CLIP can take a text prompt from human input and iterate to generate an image that fits the prompt.
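As a rough illustration of that loop, the sketch below shows the core idea in PyTorch: treat the VQGAN latents as the trainable parameters, decode them to an image, score the image against the prompt embedding with CLIP, and maximize the similarity. The `vqgan_decode` placeholder, learning rate, and step count are assumptions for illustration only; the actual notebooks add cutouts, augmentations, and other refinements.

```python
import torch

def vqgan_decode(latents):
    # Placeholder (assumption): the real VQGAN decoder from taming-transformers
    # maps a latent grid to an RGB image tensor.
    raise NotImplementedError

def clip_guided_generation(clip_model, text_features, latents, steps=300, lr=0.1):
    """Schematic CLIP-guided optimization loop behind VQGAN-CLIP."""
    latents = latents.clone().requires_grad_(True)
    opt = torch.optim.Adam([latents], lr=lr)
    for _ in range(steps):
        image = vqgan_decode(latents)                    # latents -> image
        image_features = clip_model.encode_image(image)  # image -> CLIP embedding
        # Push the image embedding toward the text-prompt embedding.
        loss = -torch.cosine_similarity(image_features, text_features, dim=-1).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return vqgan_decode(latents.detach())
```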
Huge thanks to the creators for sharing detailed, well-explained notebooks on Google Colab; thanks to them, the VQGAN-CLIP technique has become very popular, even in the arts communities.
Originally made by Katherine Crowson. The original BigGAN+CLIP method was by advadnoun. Special thanks to Ahsen Khaliq for the Gradio code.
The original Colab: Notebook Link
Taming Transformers GitHub Repo and CLIP GitHub Repo
This is a simple Gradio app for generating VQGAN-CLIP images in a local environment. A screenshot of the UI is shown below:
The steps for setup are based on the Colab referenced above.
- Git clone this repo: `git clone https://github.com/krishnakaushik25/VQGAN-CLIP`
- After that, `cd` into the repo and run `pip install gradio` to install Gradio.
- Install the required Python libraries: `pip install -r requirements.txt`
- Then run `python app.py` (a minimal sketch of what this app wires together is shown after these steps).
- Wait a few minutes until it starts; once it reports that it is running on localhost, open http://localhost:7860 in a browser.
- You can select any of the examples listed in the app, or enter a text prompt and parameters of your choice; some examples can take 30 minutes to 1 hour when the number of model iterations is high.
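For reference, here is a minimal sketch of how a Gradio app like `app.py` might be wired up. The `generate` function is a placeholder assumption; the real app runs the VQGAN-CLIP optimization and exposes more parameters in the UI.

```python
import gradio as gr

def generate(prompt, iterations):
    # Placeholder (assumption): the real app.py runs `iterations` steps of
    # VQGAN-CLIP guided by `prompt` and returns the resulting image.
    raise NotImplementedError

# The "text" / "number" / "image" shortcuts work across Gradio versions.
demo = gr.Interface(fn=generate, inputs=["text", "number"], outputs="image")

if __name__ == "__main__":
    demo.launch()  # serves the UI at http://localhost:7860 by default
```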
How many steps should you run VQGAN-CLIP? There is no fixed rule for how many steps it takes to produce a good image, and the images generated are not guaranteed to be interesting. Experiment! Too few steps can produce very poor images.
`app_file`: Path to your main application file (which contains either gradio or streamlit Python code).
Here the app file is `app.py`, which is based on Gradio.
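For context, `app_file` is one of the fields in the YAML front matter of a Hugging Face Spaces README. A minimal, illustrative configuration block (field values here are examples, not the repo's actual metadata) might look like:

```yaml
title: VQGAN-CLIP   # illustrative value
sdk: gradio         # this app uses Gradio
app_file: app.py    # path to the main application file
```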
- Long, descriptive prompts can have surprisingly pleasant effects: Reddit post
- Unreal engine trick
- Appending "by James Gurney" for stylization
- Rotation and zoom effect with VQGAN+CLIP and RIFE
- The Streamlit version of VQGAN-CLIP