Scripts to use the Stable Diffusion AI model to create images from a textual description.
The scripts are without bells and whistles, but they have some features like:
- Collecting images in a directory and storing accompanying information in a CSV file.
- A companion script for upscaling the created images, also with AI support.
I have used these scripts:
- On a Linux operating system
- With Python 3.9.7
- With an Nvidia RTX 3060 GPU (12 GB GPU RAM)
- And 15 GB of system RAM.
Note: If the image generation process dies, it most likely ran out of memory. Closing some other programs may free enough memory to avoid this. Observe your memory usage with htop or top when necessary.
Before pulling files from obscure internet sources it is worthwhile to follow the instructions of the original Stable Diffusion project. There are only a few steps to accomplish and they are easily done.
- Register at https://huggingface.co/ and confirm your registration with the link sent by email.
- Login to huggingface.co and access the AI model: https://huggingface.co/CompVis/stable-diffusion-v1-4
- Confirm the license of the model. The license ensures that output of the AI model is not used in an irresponsible way.
- Go to your profile settings on huggingface.co and generate a user token. Secure this token locally. For example, use a password manager.
- Optional: Before leaving huggingface.co, you may join a user group (organization) to share resources.
Now you are ready for the command line. This script expects that your computer has a CUDA-capable GPU.
Download this repository from GitHub:
git clone https://github.com/nullpointer-io/stablediffusion.git
Change into the directory:
cd ./stablediffusion
Install the necessary Python libraries. The torch library pulled in by the requirements must match the CUDA version of your GPU. Note: The torch library for CUDA 11.6, for example, is around 2 GB in size.
pip install -r requirements.cuda116.txt # CUDA 11.6
pip install -r requirements.cuda113.txt # CUDA 11.3
pip install -r requirements.cuda102.txt # CUDA 10.2
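After the installation, a quick sanity check (a minimal sketch, assuming torch installed successfully) confirms that the installed torch build matches your CUDA setup:

import torch

print(torch.__version__)          # e.g. 1.12.1+cu116
print(torch.version.cuda)         # CUDA version the wheel was built against
print(torch.cuda.is_available())  # True if the GPU and driver are usable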
Recommendation: To avoid cluttering your computer with Python libraries, use tools like pyenv and/or pipenv.
Log in to huggingface.co via the Hugging Face command line interface.
huggingface-cli login
When prompted, paste the user token you created earlier. Don't be irritated that the pasted token is not visible on the command line.
Be warned: On the initial run, gigabytes of data will be downloaded. Take the opportunity to grab some coffee, water, or tea.
Depending on your amount of GPU RAM, there are two options to execute the script.
a. Create an image with a GPU that has more than 10 GB of RAM.
./text2image.py "a chair in the shape of an avocado"
b. Create an image with a GPU that has less than 10 GB of RAM.
./text2image.py "a chair in the shape of an avocado" --over10gb=False
To upscale the image with AI support, use image2scale.py (see the related chapter below).
Images are stored as PNG files in the subdirectory collected. The parameter set used for the image generation is written to a CSV file, which resides in the same directory as the images. The prefix of an image file corresponds to its index in the CSV file. In this way every image is preserved and never overwritten.
This makes it easy to create series of images:
for i in $(seq 1 50); do ./text2image.py "a chair in the shape of an avocado" --steps=$i; done
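For orientation, a row in the CSV file might look roughly like this; the exact column names and order are defined by the script and may differ:

index,prompt,steps,seed,manualseed,scale
00001,a chair in the shape of an avocado,15,1024,True,7.5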
The image quality and output can be controlled by parameters:
--steps=STEPS (Default: 15) -> Number of Inference Steps
--seed=SEED (Default: 1024) -> Manual Generator Seed
--manualseed=MANUALSEED (Default: True) -> Enable or Disable the Manual Generator Seed
--scale=SCALE (Default: 7.5) -> Guidance Scale
Example:
./text2image.py "a chair in the shape of an avacado" --steps=50 --scale=2.5 --manualseed=False
The parameters are described in detail in reference 1.
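For orientation, a rough sketch of how these parameters typically map onto the Hugging Face diffusers pipeline (the actual script may differ in detail):

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
pipe = pipe.to("cuda")

# --manualseed=True with --seed: a fixed generator seed makes results reproducible
generator = torch.Generator("cuda").manual_seed(1024)

image = pipe(
    "a chair in the shape of an avocado",
    num_inference_steps=15,  # --steps
    guidance_scale=7.5,      # --scale
    generator=generator,
).images[0]
image.save("example.png")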
Use the script image2scale.py to increase the size of images with AI support.
On the first run the models are downloaded. The integrity of the models is verified by hash. The expected hashes and source URLs are configured in share/models.py.
The script image2scale.py is essentially a wrapper around the original script:
https://raw.githubusercontent.com/xinntao/Real-ESRGAN/v0.2.5.0/inference_realesrgan.py
It adds the integrity check and adapts the output paths to this project.
Note: The code taken from the original script is in the functions main and parse_args.
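As an illustration of the integrity check, a minimal sketch of how a downloaded model file can be verified against an expected SHA-256 hash (the actual check lives in the scripts and in share/models.py and may differ):

import hashlib

def sha256_of(path):
    # Hash the file in chunks to keep memory usage low even for large models
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# Hypothetical usage; the file name and hash below are placeholders
expected = "0123abcd..."  # in practice taken from share/models.py
if sha256_of("RealESRGAN_x4plus.pth") != expected:
    raise RuntimeError("Model file is corrupted or incomplete, please re-download.")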
In most cases you only have to execute:
./image2scale.py -i collected/00001-a-chair-in-the-shape-of-an-avocado.png
The result will be stored in the subdirectory scaled.
Execute the script with -h to inspect the possible arguments.
This script is based on information from references 2, 1, 3 and 4.