/diffusioncomics

DIP final project


Style Transfer Using Stable Diffusion

Contributors: Asvin Venkataramanan, Sloke Shrestha, Sundar Sripada V.S.

We won the EE 371Q Digital Image Processing Ram’s Horn best project award! EE 371Q is a famous class taught by Prof. Alan Bovik at the University of Texas at Austin.

Video presentation available on YouTube.

Report available at: Style Transfer Using Stable Diffusion

Code

This code base was used to implement the final project for the course EE 371Q Digital Image Processing.

We used this code base to fine-tune a diffusion model with the LoRA (low-rank adaptation) technique so that it converts images into the style of Calvin and Hobbes comics. Here is a brief overview of the files.

  • code/get_cnh_dataset.ipynb and code/get_cnh_dataset.py were used to download files from the Internet Archive and create the dataset.
  • code/lora_example.sh and code/train_text_to_image_lora.py were used to fine-tune the diffusion model.
  • code/infer.py, code/infertext.py, and code/infervideo.py were used to generate samples with the fine-tuned model.
  • code/utils.py and code/utils_slides.py were used for miscellaneous tasks.
  • environment.yml has the packages required to run this code.
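
For readers unfamiliar with the LoRA technique mentioned above, the following is a minimal sketch of the idea in plain Python. It is not taken from code/train_text_to_image_lora.py. The function names (matmul, lora_merge, trainable_params) and the tiny layer sizes are assumptions chosen for illustration. LoRA keeps a pretrained weight matrix W frozen and trains only two small matrices A and B, so the adapted weight is W + (alpha / r) * B A, where r is the rank.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def lora_merge(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the number of rows of A."""
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)  # low-rank update with the same shape as W
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def trainable_params(d_out, d_in, r):
    """Count the entries of A (r x d_in) and B (d_out x r)."""
    return r * (d_in + d_out)


# Hypothetical layer sizes, far smaller than a real diffusion model layer.
random.seed(0)
d_out, d_in, r, alpha = 6, 8, 2, 4

w = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen
a = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]   # trained
b = [[0.0] * r for _ in range(d_out)]                                  # trained, zero init

# With B initialised to zero, the merged weight equals the pretrained weight,
# so fine-tuning starts from the unmodified model.
merged = lora_merge(w, a, b, alpha)

print("full fine-tuning parameters:", d_out * d_in)
print("LoRA parameters:", trainable_params(d_out, d_in, r))
```

In this toy setting the layer has 48 weights, while LoRA trains 28. The saving grows quickly with layer size, because the LoRA count scales with r * (d_in + d_out) rather than d_in * d_out.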

Samples

Text to Image Samples

Generated samples before and after fine-tuning.

Image to Image Samples

Image to Image Samples with Edge Input