Follow the instructions in CompVis stable diffusion to set up the environment. Make sure you are in the proper conda env (`conda activate ldm`).
- Run `create_dataset.py` to create the dataset. It extracts the Kanji images and their corresponding English descriptions from the .xml (source).
data/
├── kanji/
│   ├── kanji_id1.jpg
│   ├── kanji_id2.jpg
│   ├── ...
│   └── descriptions.txt
- Run `./train.sh` to train the model.
- Generate images with `python3 inference.py --prompt="prompt" --num_images=10`
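
The dataset and generation steps can be sketched as a small helper script. This is a rough sketch and not part of the repository: the function names `check_dataset_layout`, `build_inference_command` and `generate_for_prompts` are hypothetical, and it assumes only the `data/kanji/` layout and the `--prompt` / `--num_images` flags shown in this README. It makes no assumption about the format of `descriptions.txt`; it only checks that the file exists.

```python
import subprocess
from pathlib import Path


def check_dataset_layout(root="data/kanji"):
    """Return the number of .jpg images under root; raise if the layout looks wrong.

    Assumes the layout from this README: images and descriptions.txt side by side.
    """
    root = Path(root)
    if not root.is_dir():
        raise FileNotFoundError(f"missing dataset directory: {root}")
    if not (root / "descriptions.txt").is_file():
        raise FileNotFoundError(f"missing {root / 'descriptions.txt'}")
    images = sorted(root.glob("*.jpg"))
    if not images:
        raise FileNotFoundError(f"no .jpg images found in {root}")
    return len(images)


def build_inference_command(prompt, num_images=10):
    """Build the argument list for one inference.py call (flags as in this README)."""
    return [
        "python3",
        "inference.py",
        f"--prompt={prompt}",
        f"--num_images={num_images}",
    ]


def generate_for_prompts(prompts, num_images=10):
    """Run inference.py once per prompt; check=True stops at the first failing call."""
    for prompt in prompts:
        subprocess.run(build_inference_command(prompt, num_images), check=True)
```

A possible use is to call `check_dataset_layout()` after `create_dataset.py` has finished and before `./train.sh`, and then `generate_for_prompts([...])` with your own list of prompts once training is done. Passing the command as a list avoids shell quoting problems for prompts that contain spaces.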