License: MIT. Primary language: Jupyter Notebook.

Stable Diffusion for Remote Sensing Image Generation

Author: Zhiqiang Yuan @ AIR CAS (send an email)

You are welcome to 👍Fork and Star👍 this repo; we'll let you know when we update it.

-------------------------------------------------------------------------------------

./assets/MAIN.png

A simple project for text-to-image remote sensing (RS) image generation. We will later release code for using multiple text prompts to control regions in super-large RS image generation. Also see our project on image-conditioned fake sample generation in TGRS, 2023.

Environment configuration

Please follow the environment configuration of the original training repo (thanks to its authors).

Pretrained weights

We used the RS image-text dataset RSITMD as training data and fine-tuned Stable Diffusion for 10 epochs on 1 x A100 GPU. With a batch size of 4, GPU memory consumption is about 40+ GB during training and about 20+ GB during sampling. The pretrained weights are released as last-pruned.ckpt.
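As a rough sanity check on these resolutions, Stable Diffusion runs its diffusion in a VAE latent space downsampled 8x from pixel space. A minimal arithmetic sketch (assuming the standard SD v1 VAE with 4 latent channels and fp32 tensors; the 40+ GB is dominated by model weights and activations, so these numbers are illustrative only):

```python
# Rough latent-size arithmetic for 512x512 Stable Diffusion sampling.
# Assumption: standard SD v1 VAE (8x spatial downsampling, 4 latent
# channels) and fp32 (4 bytes per element).
H = W = 512
downsample = 8
latent_channels = 4
batch_size = 4

latent_h, latent_w = H // downsample, W // downsample   # 64 x 64 latents
elements = batch_size * latent_channels * latent_h * latent_w
print(latent_h, latent_w)            # 64 64
print(elements)                      # 65536
print(elements * 4 / 1024, "KiB")    # 256.0 KiB per fp32 latent batch
```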

Usage

Sampling

Download the pretrained weights last-pruned.ckpt to the current directory, and run:

python scripts/txt2img.py \
    --prompt 'Some boats drived in the sea' \
    --outdir 'outputs/RS' \
    --H 512 --W 512 \
    --n_samples 4 \
    --config 'configs/stable-diffusion/RSITMD.yaml' \
    --ckpt './last-pruned.ckpt'
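To sample several captions in one go, the call above can be wrapped in a small loop. A minimal sketch (the prompts and paths are just the ones from this README; swapping print() for subprocess.run() inside the repo would actually execute each command):

```python
# Build the txt2img.py command line for each caption.
# shlex.join quotes arguments (e.g. prompts with spaces) safely.
import shlex

prompts = [
    "Some boats drived in the sea",
    "A lot of cars parked in the airport",
]

for prompt in prompts:
    cmd = shlex.join([
        "python", "scripts/txt2img.py",
        "--prompt", prompt,
        "--outdir", "outputs/RS",
        "--H", "512", "--W", "512",
        "--n_samples", "4",
        "--config", "configs/stable-diffusion/RSITMD.yaml",
        "--ckpt", "./last-pruned.ckpt",
    ])
    print(cmd)
```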

Training

Put the images of RSITMD in data/RSITMD/images, and run:

python main.py \
    -t \
    --base configs/lammbda/RSITMD.yaml \
    --gpus 1 \
    --scale_lr False \
    --num_nodes 1 \
    --check_val_every_n_epoch 10 \
    --finetune_from './last-pruned.ckpt'
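Before launching training, it can help to verify that the image folder is in place. A minimal sketch (the directory name follows this README; the extension list and empty-folder warning are my own additions):

```python
# Check that the RSITMD image folder exists and count usable images.
from pathlib import Path

data_dir = Path("data/RSITMD/images")
data_dir.mkdir(parents=True, exist_ok=True)  # create if missing (demo only)

exts = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}
images = [p for p in data_dir.iterdir() if p.suffix.lower() in exts]
print(f"{len(images)} images found in {data_dir}")
if not images:
    print("Warning: no images found -- copy the RSITMD images here first.")
```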

Examples

Caption: Some boats drived in the sea. ./assets/shows1.png

Caption: A lot of cars parked in the airport. ./assets/shows2.png

Caption: A large number of vehicles are parked in the parking lot, next to the bare desert. ./assets/shows3.png

Caption: There is a church in a dark green forest with two table tennis courts next to it. ./assets/shows4.png