Pokémon concept art generation with Stable Diffusion

This project contains a pipeline that generates Pokémon concept art from text prompts (text2image). It uses Stable Diffusion 1.5 as the base model and trains a LoRA model on Pokémon artwork to adjust its output.

The project serves as a demo project for converting a Jupyter Notebook prototype into a DVC pipeline. The base notebook is located in notebooks/pokemon_generator.ipynb; the DVC pipeline is defined in dvc.yaml.

How to run

The project is configured to run on a Mac with an M1 chip or later. Other hardware will need tweaks: if you are not using an M1 (or later) device, change pipe.to("mps") in src/generate_text_to_image.py to a device available on your machine, such as "cuda" or "cpu".
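
As a rough sketch of how the device choice could be made portable, the helper below picks a PyTorch device name at runtime instead of hard-coding "mps". The function name pick_device is hypothetical and not part of this repository; the sketch assumes the pipeline object in src/generate_text_to_image.py is a PyTorch-backed pipeline named pipe, as the pipe.to("mps") call suggests.

```python
def pick_device() -> str:
    """Return "cuda", "mps", or "cpu" depending on what this machine offers.

    Hypothetical helper, not part of the repository.
    """
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to accelerate; report the CPU.
        return "cpu"

    if torch.cuda.is_available():
        return "cuda"

    # torch.backends.mps is missing on older PyTorch builds, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    return "cpu"


if __name__ == "__main__":
    # In the generation script, this value would replace the hard-coded string:
    #     pipe.to(pick_device())
    print(pick_device())
```

Running Stable Diffusion on "cpu" works but is likely to be much slower than on "mps" or "cuda".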

Requirements (tested)

Python >= 3.11.2
virtualenv >= 20.14.1
A Mac with an M1 chip or later (see above)

Instructions

Clone the repository
Create a new virtual environment with python3 -m venv .venv
Activate the virtual environment with source .venv/bin/activate
Install the requirements with pip install -r requirements.txt
Configure your Kaggle API credentials.
To run the notebook, use jupyter notebook
To run the DVC pipeline, use dvc repro, or use dvc exp run to start a new experiment.
If you would like to mirror your DVC cache to a DVC remote, follow the DVC documentation on remote storage.