This repository shares pre-trained OpenAI DALLE models and generates images from given text.
All models are trained with lucidrains/DALLE-pytorch and VQGAN (Taming Transformers), but with different training code and a different BPE model.
If you want to train DALLE, please go to lucidrains/DALLE-pytorch and support their effort to reproduce better DALLE models.
- CUB200
- COCO
- Install requirements
$ pip install -r requirements
- Install DeepSpeed
  - DeepSpeed is only necessary for the 'sparse' attention type.
  - Follow the instructions here to install DeepSpeed.
- Download the models below and save them in the pretrained folder
  - Check the link in the Details column for the specifics of each model.
Dataset | Download | Password | Optimizer | Attention type | Size | Details |
---|---|---|---|---|---|---|
CUB200 | link | v9ge | Adam | ('full', 'sparse') | 1.1GB | link |
CUB200 | link | eui3 | Adam | ('full', 'axial_row', 'axial_col', 'conv_like') | 1.1GB | link |
CUB200 | link | 47w1 | AdamW | ('full', 'sparse') | 1.1GB | link |
COCO | link | p3ki | Adam | ('full', 'sparse') | 1.5GB | link |
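
Since DeepSpeed is only needed for the 'sparse' attention type, it can help to check which checkpoints require it before downloading. The following is a minimal sketch, not part of this repository: the `ATTENTION_TYPES` dictionary and the helper names `needs_deepspeed` and `deepspeed_installed` are hypothetical, and the checkpoints are keyed by their download password only because the table gives no other unique identifier.

```python
import importlib.util

# Attention types per checkpoint, copied from the table above.
# Keys are the download passwords (an illustrative choice, not a repo convention).
ATTENTION_TYPES = {
    "v9ge": ("full", "sparse"),
    "eui3": ("full", "axial_row", "axial_col", "conv_like"),
    "47w1": ("full", "sparse"),
    "p3ki": ("full", "sparse"),
}


def needs_deepspeed(attn_types):
    """Return True if any layer uses 'sparse' attention."""
    return "sparse" in attn_types


def deepspeed_installed():
    """Return True if the deepspeed package can be found in this environment."""
    return importlib.util.find_spec("deepspeed") is not None


if __name__ == "__main__":
    for key, attn_types in ATTENTION_TYPES.items():
        if needs_deepspeed(attn_types) and not deepspeed_installed():
            print(f"{key}: uses sparse attention, install DeepSpeed first")
        else:
            print(f"{key}: ready to load")
```

Under these assumptions, only the second CUB200 checkpoint (axial and conv-like attention) can be used without DeepSpeed.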