AI image generator and upscaler tool, created for my university exam

Primary language: Java. License: GNU General Public License v3.0 (GPL-3.0).

diffusion-tool

Image generator and upscaler created for my AI university exam

Description

At its core, it is a JavaFX application that integrates the Python interpreter and uses it to run Stable Diffusion pipelines for image generation and upscaling, along with BSRGAN's degradation model for upscaling arbitrary images.
I initially considered using the Spring framework to manage user registration, but I wanted everyone to be able to use the program offline, so I opted for a local approach instead: user data is saved in the working directory.
It is structured as follows: on the user side, there are the Login and Sign Up pages. Once a user has logged in, they have access to the Home, Profile, Generate and Upscale pages.
The last two are the essential part of the project and act as GUIs for the Python scripts.
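To make the "GUI for the Python scripts" idea concrete, here is a minimal sketch of one common way a Java application can hand work to a Python script: by starting the interpreter from the virtual environment as a child process. This is an assumption for illustration, not necessarily how this project integrates the interpreter. The script name generate.py and its --prompt argument are hypothetical placeholders.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class PythonLauncher {

    /** Path of the interpreter inside the virtual environment at ~/venv. */
    static Path venvPython(String userHome) {
        return Path.of(userHome, "venv", "bin", "python");
    }

    /** Builds the command line: interpreter, script, then the script's arguments. */
    static List<String> buildCommand(String userHome, String script, List<String> args) {
        List<String> command = new ArrayList<>();
        command.add(venvPython(userHome).toString());
        command.add(script);
        command.addAll(args);
        return command;
    }

    /** Runs the command, forwards its output to this process, and returns the exit code. */
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .inheritIO()
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        // "generate.py" and "--prompt" are hypothetical; they are not taken from the project.
        List<String> command = buildCommand(
                System.getProperty("user.home"),
                "generate.py",
                List.of("--prompt", "a lighthouse at dusk"));
        // Only print the command here; call run(command) to actually execute it.
        System.out.println(String.join(" ", command));
    }
}
```

In a JavaFX application, a call like run(command) would normally be made from a background thread (for example a javafx.concurrent.Task) so that the interface stays responsive while the script works.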

Prerequisites

To compile and run the software, you need the following:

  • Open Java Development Kit (OpenJDK) 17 or above
  • Apache Maven (at least version 3.6.3 is recommended)
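As a quick way to confirm the Java requirement above, the following sketch checks the feature version of the running JVM. The class and method names are made up for this example and are not part of the project.

```java
public class JavaVersionCheck {

    // Minimum version stated in the prerequisites above.
    static final int MINIMUM_FEATURE_VERSION = 17;

    /** Returns true when the given feature version (e.g. 17, 21) meets the minimum. */
    static boolean isSupported(int featureVersion) {
        return featureVersion >= MINIMUM_FEATURE_VERSION;
    }

    public static void main(String[] args) {
        // Runtime.version().feature() is available since Java 10.
        int current = Runtime.version().feature();
        if (!isSupported(current)) {
            System.err.println("Java " + MINIMUM_FEATURE_VERSION
                    + " or above is required, found " + current);
            System.exit(1);
        }
        System.out.println("Java " + current + " is supported");
    }
}
```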

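You MUST also create a Python virtual environment in your home directory, in a folder named 'venv', and install the packages listed in requirements.txt into it.
Run the last command below from the directory that contains requirements.txt.

python3 -m venv ~/venv
source ~/venv/bin/activate
pip install -r requirements.txt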

System requirements

Only consumer-level hardware is listed here.
Any machine with a CUDA-capable GPU and enough VRAM should be able to run this software.
ATTENTION: currently, AMD GPUs are not supported as the application relies on CUDA, a technology exclusive to NVIDIA.

| Component | Minimum | Recommended |
|---|---|---|
| OS | Linux x64 | Linux x64 |
| Display server | X.Org's X11 | Wayland |
| CPU | Intel Core i5-7500 / AMD Ryzen 5 1600 | Intel Core i7-9700K / AMD Ryzen 5 3600X |
| RAM | 16 GB | 16 GB |
| GPU | NVIDIA GeForce GTX 1660 SUPER | NVIDIA GeForce RTX 3060 |
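Since the application depends on CUDA, it can be useful to confirm that an NVIDIA GPU is visible before launching it. The sketch below is one possible check, assuming the nvidia-smi utility that ships with the NVIDIA driver is on the PATH. The class and record names are hypothetical, and no VRAM threshold is enforced because none is stated in this document.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class GpuCheck {

    /** GPU name and total memory in MiB, as reported by nvidia-smi. */
    record Gpu(String name, int memoryMiB) {}

    /** Parses one CSV line such as "NVIDIA GeForce RTX 3060, 12288". */
    static Gpu parseLine(String line) {
        int comma = line.lastIndexOf(',');
        if (comma < 0) {
            throw new IllegalArgumentException("Unexpected nvidia-smi output: " + line);
        }
        String name = line.substring(0, comma).trim();
        int memory = Integer.parseInt(line.substring(comma + 1).trim());
        return new Gpu(name, memory);
    }

    /** Runs nvidia-smi and returns the GPUs it reports. */
    static List<Gpu> queryGpus() throws IOException, InterruptedException {
        Process process = new ProcessBuilder(
                "nvidia-smi",
                "--query-gpu=name,memory.total",
                "--format=csv,noheader,nounits")
                .redirectErrorStream(true)
                .start();
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.isBlank()) {
                    lines.add(line);
                }
            }
        }
        // Parse only after a successful exit, so error text is never treated as data.
        if (process.waitFor() != 0) {
            throw new IOException("nvidia-smi exited with an error: " + String.join(" ", lines));
        }
        List<Gpu> gpus = new ArrayList<>();
        for (String l : lines) {
            gpus.add(parseLine(l));
        }
        return gpus;
    }

    public static void main(String[] args) throws InterruptedException {
        try {
            for (Gpu gpu : queryGpus()) {
                System.out.println(gpu.name() + " with " + gpu.memoryMiB() + " MiB of VRAM");
            }
        } catch (IOException e) {
            // Also reached when nvidia-smi is not installed at all.
            System.out.println("No usable NVIDIA GPU detected: " + e.getMessage());
        }
    }
}
```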

Building

Executable packages can be downloaded from Releases, or you can build them yourself.
Building requires that the prerequisites above are already installed.
Once you're in the project directory, type the following in a terminal to download the dependencies and compile all the classes:

mvn clean install

Then, if you also want a runnable .jar archive, type:

mvn package

With these commands, a new folder named 'target' is created containing the compiled project as well as the executable file.

Unlock Stable Diffusion 3

The newest generative model is currently gated, so you first need to sign up on Hugging Face.
Then generate an access token in your account settings and use it to log in with:

huggingface-cli login

Enter your credentials first, then the token when it's needed.
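If the application should warn the user before trying to load the gated model, it could check whether a token is available. The sketch below assumes the usual huggingface_hub conventions as I understand them: an HF_TOKEN environment variable takes precedence, otherwise the token written by the login command lives in a file named token under HF_HOME, which defaults to ~/.cache/huggingface. Other overrides such as XDG_CACHE_HOME are ignored here, and the class name is hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

public class HuggingFaceTokenCheck {

    /** Assumed location of the saved token: $HF_HOME/token or ~/.cache/huggingface/token. */
    static Path tokenPath(Map<String, String> env, String userHome) {
        String hfHome = env.get("HF_HOME");
        Path base = (hfHome != null && !hfHome.isBlank())
                ? Path.of(hfHome)
                : Path.of(userHome, ".cache", "huggingface");
        return base.resolve("token");
    }

    /** True when a token is set in the environment or a token file exists. */
    static boolean hasToken(Map<String, String> env, String userHome) {
        String fromEnv = env.get("HF_TOKEN");
        if (fromEnv != null && !fromEnv.isBlank()) {
            return true;
        }
        return Files.isRegularFile(tokenPath(env, userHome));
    }

    public static void main(String[] args) {
        boolean found = hasToken(System.getenv(), System.getProperty("user.home"));
        System.out.println(found
                ? "Hugging Face token found"
                : "No Hugging Face token found; run huggingface-cli login first");
    }
}
```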

Screenshots

Home

[Screenshot: home-view]

Image Generation

[Screenshot: generate-view]

Image Upscaling

[Screenshot: upscale-view]

Upscaling Comparison

Low-res vs. Upscaled

[Screenshots: UpscalingComparison, UpscalingComparison2]

Credits

As stated before, this project uses BSRGAN's degradation model for upscaling purposes.
BSRGAN is a practical degradation model for Deep Blind Image Super-Resolution, developed by Kai Zhang, Jingyun Liang, Luc Van Gool and Radu Timofte at the Computer Vision Lab, ETH Zurich, Switzerland.
You can check out their repository and find out more here: BSRGAN.
To set up the model, a script written by TGS963 in the public upscayl repository was particularly helpful.
I adapted that script to work with my project and kept the acknowledgments in comments just below the library imports.
The project utilizes Stable Diffusion's generative AI pipelines for image generation and upscaling, in particular:

License