Pokémon concept art generation with Stable Diffusion

This project contains a pipeline that generates Pokémon concept art from text prompts (text2image). It uses Stable Diffusion 1.5 as the base model and trains a LoRA model on Pokémon artwork to steer its output toward that style.
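As a rough illustration of how a trained LoRA is applied on top of the base model at generation time, here is a minimal sketch using the diffusers library. It is not the code from this repository: the function name generate_image, the default model ID, and the lora_dir argument are assumptions made for the example, and the model ID in particular may need to be replaced with wherever the Stable Diffusion 1.5 weights are hosted for you.

```python
def generate_image(
    prompt,
    lora_dir,
    base_model="runwayml/stable-diffusion-v1-5",  # assumed model ID, adjust as needed
    device="cpu",
    steps=30,
):
    """Generate one image from a text prompt with LoRA weights applied.

    Heavy dependencies are imported lazily so that defining this function
    does not require diffusers to be installed.
    """
    from diffusers import StableDiffusionPipeline

    # Load the base Stable Diffusion pipeline.
    pipe = StableDiffusionPipeline.from_pretrained(base_model)

    # Apply the LoRA weights (hypothetical directory holding the trained weights).
    pipe.load_lora_weights(lora_dir)

    # Move the pipeline to the target device ("mps", "cuda", or "cpu").
    pipe = pipe.to(device)

    # Run the pipeline and return the first generated PIL image.
    result = pipe(prompt, num_inference_steps=steps)
    return result.images[0]
```

Calling generate_image("a water-type turtle Pokémon", "path/to/lora") would return a PIL image that can be written to disk with its save method; the prompt and path here are placeholders.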

The project serves as a demo of converting a Jupyter Notebook prototype into a DVC pipeline. The original notebook is located in notebooks/pokemon_generator.ipynb; the DVC pipeline is defined in dvc.yaml.

How to run

The project is configured to run on a Mac with an M1 chip, and running it on other hardware requires some changes. If you are not using an Apple Silicon (M1 or later) device, change pipe.to("mps") in src/generate_text_to_image.py to a device your machine supports.
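One way to avoid hard-coding the device is to detect it at runtime. The sketch below assumes PyTorch is the backend; the helper names pick_device and detect_device are hypothetical and do not exist in this repository.

```python
def pick_device(mps_available: bool, cuda_available: bool) -> str:
    """Choose a device name, preferring Apple's MPS, then CUDA, then CPU."""
    if mps_available:
        return "mps"
    if cuda_available:
        return "cuda"
    return "cpu"


def detect_device() -> str:
    """Query PyTorch for available backends; fall back to CPU if torch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"

    # torch.backends.mps is only present in newer PyTorch releases, so guard for it.
    mps = getattr(torch.backends, "mps", None)
    mps_available = bool(mps is not None and mps.is_available())
    return pick_device(mps_available, torch.cuda.is_available())
```

With this in place, the hard-coded call could be replaced by pipe.to(detect_device()). Generation on CPU works but is considerably slower than on a GPU.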

Requirements (tested)

Instructions

  1. Clone the repository
  2. Create a new virtual environment with python3 -m venv .venv
  3. Activate the virtual environment with source .venv/bin/activate
  4. Install the requirements with pip install -r requirements.txt
  5. Configure your Kaggle API credentials.
  6. To run the notebook, use jupyter notebook
  7. To run the DVC pipeline, use dvc repro, or use dvc exp run to run it as a new experiment.
  8. If you would like to mirror your DVC cache to a DVC remote, follow these docs.
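For step 5, the Kaggle API client reads credentials either from the KAGGLE_USERNAME and KAGGLE_KEY environment variables or from a kaggle.json token file in ~/.kaggle. The following sketch checks whether either is present before you start the pipeline. The function name kaggle_credentials_configured is hypothetical, and the check only confirms that credentials exist, not that Kaggle accepts them.

```python
import json
import os
from pathlib import Path


def kaggle_credentials_configured(env=None, config_dir=None) -> bool:
    """Return True if Kaggle credentials are found in the environment or token file."""
    env = os.environ if env is None else env

    # Option 1: credentials supplied through environment variables.
    if env.get("KAGGLE_USERNAME") and env.get("KAGGLE_KEY"):
        return True

    # Option 2: credentials stored in <config_dir>/kaggle.json (default ~/.kaggle).
    config_dir = Path(config_dir) if config_dir else Path.home() / ".kaggle"
    token = config_dir / "kaggle.json"
    if not token.is_file():
        return False
    try:
        data = json.loads(token.read_text())
    except (OSError, json.JSONDecodeError):
        return False
    return isinstance(data, dict) and bool(data.get("username")) and bool(data.get("key"))


if __name__ == "__main__":
    if kaggle_credentials_configured():
        print("Kaggle credentials found.")
    else:
        print("No Kaggle credentials found; set them up before running the pipeline.")
```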

Further reading

If you would like a more detailed rundown on converting a Jupyter Notebook into a DVC pipeline, please take a look at the following materials:

Sources used