
Diffusion Lab

Create professional AI art and storyboards instantly from text prompts. Python web app powered by Stable Diffusion XL with advanced features like inpainting, ControlNet, and batch generation.


🚀 Quick Start

Option 1: Automated Setup (Recommended)

# Clone repository
git clone https://github.com/arun-gupta/diffusion-lab.git
cd diffusion-lab

# One-command setup and run
./run_webapp.sh
# Open http://localhost:5001 in your browser

Option 2: Fresh Install

# Complete clean setup (removes old environment)
./cleanup_and_reinstall.sh

# Then run the web app
python3 -m diffusionlab.api.webapp
# Open http://localhost:5001 in your browser

Option 3: Manual Setup

# Clone and setup manually
git clone https://github.com/arun-gupta/diffusion-lab.git
cd diffusion-lab
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

# Run the web app
python3 -m diffusionlab.api.webapp
# Open http://localhost:5001 in your browser

✨ Features

  • 📖 Storyboard Generation: Create 5-panel storyboards with AI images and captions
  • 🎨 Single-Image Art: Generate high-quality AI images from text prompts
  • 🔄 Batch Generation: Create multiple variations of the same prompt for creative exploration
  • 🔄 Image-to-Image: Transform sketches/photos into polished artwork
  • 🎯 Inpainting: Remove objects or fill areas with AI-generated content
  • 🔗 Prompt Chaining: Create evolving story sequences with multiple prompts
  • 🎯 ControlNet: Precise control over composition, pose, and structure using reference images
  • 📱 Web Interface: Modern, responsive UI with real-time generation

🤖 AI Models

This application uses state-of-the-art AI models to power its generation capabilities:

Primary Models

Stable Diffusion XL (stabilityai/stable-diffusion-xl-base-1.0)

  • Purpose: Main image generation model
  • Used for: Text-to-image generation, image-to-image transformation
  • Configuration: fp16 variant with safetensors format
  • Performance: Optimized for high-quality 1024×1024 image generation, SDXL's native resolution (see the loading sketch below)
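
For orientation, loading SDXL with exactly this configuration through Hugging Face diffusers typically looks like the sketch below. This illustrates the library API, not the app's own code (the app's setup lives under diffusionlab/):

import torch
from diffusers import StableDiffusionXLPipeline

# Load the base SDXL checkpoint with the fp16 variant and safetensors,
# matching the configuration listed above.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
    use_safetensors=True,
)
pipe = pipe.to("cuda")  # or "mps" on Apple Silicon; see Technical Specifications

image = pipe("a detective walks into a neon-lit alley at midnight").images[0]
image.save("output.png")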

Stable Diffusion XL Inpainting (stabilityai/stable-diffusion-xl-base-1.0)

  • Purpose: Specialized inpainting pipeline for content-aware fill
  • Used for: Object removal, background replacement, creative editing
  • Features: Precise mask-based generation with seamless blending (a minimal sketch follows)
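
A minimal inpainting sketch using diffusers' AutoPipelineForInpainting; the file paths are placeholders, and the app's actual pipeline setup may differ:

import torch
from diffusers import AutoPipelineForInpainting
from PIL import Image

pipe = AutoPipelineForInpainting.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

init_image = Image.open("photo.png").convert("RGB")  # image to edit (placeholder path)
mask_image = Image.open("mask.png").convert("RGB")   # white pixels mark the area to regenerate

result = pipe(
    prompt="natural landscape continuation with trees and sky",
    image=init_image,
    mask_image=mask_image,
).images[0]
result.save("inpainted.png")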

StableLM 3B 4E1T (stabilityai/stablelm-3b-4e1t)

  • Purpose: Text generation and caption creation
  • Used for:
    • Generating scene variations for storyboards
    • Creating descriptive captions for generated images
    • Text processing and prompt enhancement
  • Size: 3 billion parameters optimized for text tasks (see the caption sketch below)
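
As a rough sketch, driving StableLM for caption generation through Hugging Face transformers looks like the following; the prompt wording is illustrative, not the app's actual prompting logic:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/stablelm-3b-4e1t"
# trust_remote_code is needed for this model's custom code on older transformers versions
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Illustrative captioning prompt; the app's prompt templates may differ.
prompt = "Write a one-sentence caption for a storyboard panel of a detective in a rainy alley:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))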

ControlNet Models (Advanced Control)

The app includes configuration for the following ControlNet models, which give precise control over generation (a loading sketch follows the list):

Edge Detection (lllyasviel/control_v11p_sd15_canny)

  • Use case: Line art, architectural drawings, precise outlines
  • Best for: Converting sketches to finished artwork

Depth Map (lllyasviel/control_v11f1p_sd15_depth)

  • Use case: 3D scenes, architectural visualization, spatial control
  • Best for: Creating images with precise depth and perspective

Human Pose (lllyasviel/control_v11p_sd15_openpose)

  • Use case: Character poses, figure drawing, animation
  • Best for: Maintaining specific poses from reference images

Semantic Segmentation (lllyasviel/control_v11p_sd15_seg)

  • Use case: Object placement, scene composition, layout control
  • Best for: Controlling where objects appear in the scene
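
Note that these are SD 1.5 ControlNet checkpoints, so they pair with a Stable Diffusion 1.5 base model rather than SDXL. A hedged loading sketch, where the base checkpoint name and file path are assumptions for illustration:

import torch
from controlnet_aux import CannyDetector
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from PIL import Image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",  # assumed SD 1.5 base checkpoint
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

canny_image = CannyDetector()(Image.open("reference.png"))  # placeholder path
image = pipe(
    "a majestic dragon in a fantasy landscape",
    image=canny_image,
    controlnet_conditioning_scale=1.0,  # the "control strength" knob
).images[0]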

Supporting Libraries

ControlNet Auxiliary Processors

  • CannyDetector: Edge detection preprocessing
  • OpenposeDetector: Human pose estimation
  • MLSDdetector: Line detection for architectural elements
  • HEDdetector: Holistically-nested edge detection (a preprocessing sketch follows this list)
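
These detectors come from the controlnet_aux package; typical preprocessing is only a few lines (paths are placeholders):

from controlnet_aux import CannyDetector, OpenposeDetector
from PIL import Image

source = Image.open("reference.png")  # placeholder path

# Canny edge detection needs no pretrained weights
edges = CannyDetector()(source)

# Pose estimation downloads annotator weights on first use
openpose = OpenposeDetector.from_pretrained("lllyasviel/Annotators")
pose = openpose(source)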

Model Loading Strategy

  • Primary models (SDXL, SDXL Inpainting, StableLM) are loaded at startup for AI mode
  • ControlNet models are loaded on-demand to conserve memory
  • Demo mode uses placeholder generation without loading heavy models
  • Fallback mechanisms ensure the app still runs even if some models fail to load (a sketch of this pattern follows)
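
A minimal sketch of the on-demand-plus-fallback pattern described above; the generator callables are hypothetical stand-ins, not the app's actual functions:

import torch
from diffusers import ControlNetModel

_controlnet_cache = {}

def get_controlnet(model_id):
    # Load a ControlNet checkpoint the first time it is requested, then reuse it.
    if model_id not in _controlnet_cache:
        _controlnet_cache[model_id] = ControlNetModel.from_pretrained(
            model_id, torch_dtype=torch.float16
        )
    return _controlnet_cache[model_id]

def generate(prompt, ai_generate, demo_generate):
    # ai_generate / demo_generate are hypothetical callables standing in for
    # the app's AI-mode and demo-mode paths.
    try:
        return ai_generate(prompt)
    except Exception:
        return demo_generate(prompt)  # fall back if models failed to load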

Technical Specifications

  • Torch dtype: float32 (configurable)
  • Model format: Safetensors for faster loading
  • Variant: fp16 for SDXL (optimized for memory efficiency)
  • Device optimization: CUDA, MPS (Apple Silicon), or CPU fallback (see the selection sketch after this list)
  • Memory management: Attention slicing enabled to reduce peak memory usage
  • Model size: ~10GB total for all models
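
Device selection along these lines usually reduces to a few lines of torch; a sketch (the app's config.py may do this differently):

import torch

def pick_device():
    # Prefer CUDA, then Apple Silicon (MPS), then plain CPU.
    if torch.cuda.is_available():
        return "cuda"
    if torch.backends.mps.is_available():
        return "mps"
    return "cpu"

# Applied to any diffusers pipeline from the earlier sketches:
# pipe = pipe.to(pick_device())
# pipe.enable_attention_slicing()  # lowers peak memory at a small speed cost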

Performance Notes

  • GPU recommended: CUDA-compatible GPU or Apple Silicon for optimal performance
  • Memory requirements: 8GB+ RAM (16GB+ recommended for AI mode)
  • Storage: ~10GB for all models and dependencies
  • Loading time: ~30-60 seconds for initial model loading

📸 Example Outputs

🖥️ Main Interface

Modern web-based interface with intuitive controls and real-time generation


📖 Storyboard Generation

"A detective walks into a neon-lit alley at midnight, rain pouring down" (Cinematic style)


🎨 Single-Image Art

"A spaceship crew encounters an alien artifact on a distant planet" (Pixar style)


🔄 Batch Generation

"A magical forest with glowing mushrooms and fairy lights" (4 variations, Anime style, 0.8 strength)


🔄 Image-to-Image Transformation

"A polished character design for a sci-fi video game protagonist" (Photorealistic style, strength: 0.5)

Input image (1870×2493, resized to 1024×1024) → output image (AI standard size)

🎯 Inpainting (Content-aware Fill)

Object removal: "Natural landscape continuation with trees and sky" (Photorealistic style)


🔗 Prompt Chaining (Story Evolution)

Character Journey: "A young adventurer's quest from village to mountain peak" (Pixar style, 5-step evolution)

Template Steps:

  1. Character introduction and setting
  2. Character faces a challenge or conflict
  3. Character overcomes the challenge
  4. Character learns and grows
  5. Character reaches their goal or destination


🎯 ControlNet (Precise Control)

"A majestic dragon in a fantasy landscape" (Cinematic style, Canny edge control, strength: 1.0)


🎮 Usage

For detailed usage instructions, see the Usage Guide.

Basic Usage

  1. Open http://localhost:5001 in your browser
  2. Select your mode: Storyboard, Single-Image Art, Batch Generation, Image-to-Image, Inpainting, Prompt Chaining, or ControlNet
  3. Toggle between Demo (fast) and AI (full quality) modes
  4. Enter your prompt and choose a style
  5. For Batch Generation: Configure variation count, layout, and strength
  6. For Image-to-Image: Upload an image and adjust transformation strength
  7. For Inpainting: Upload an image, draw masks, and describe what should fill them
  8. For Prompt Chaining: Add multiple prompts for story evolution
  9. For ControlNet: Upload a reference image, select control type, and adjust strength/guidance
  10. Click "Generate" and download your results

Key Features

📖 Storyboard Generation

  • Generate 5-panel storyboards with AI images and captions
  • Choose from multiple visual styles
  • Perfect for storytelling and concept development

🎨 Single-Image Art

  • Create high-quality AI images from text prompts
  • Multiple style presets available
  • Ideal for concept art and illustrations

🔄 Batch Generation

  • Generate multiple variations of the same prompt
  • Configure variation count (2-8) and strength
  • Choose from grid, horizontal, or vertical layouts (a grid-assembly sketch follows)
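
Batch generation maps naturally onto diffusers' num_images_per_prompt argument plus a simple PIL grid; a sketch under that assumption (the app's layout code may differ):

from PIL import Image

# pipe: the SDXL pipeline from the loading sketch in the AI Models section
images = pipe(
    "a magical forest with glowing mushrooms and fairy lights",
    num_images_per_prompt=4,  # the variation count (2-8 in the UI)
).images

# Assemble a 2x2 grid layout with PIL
cols, rows = 2, 2
w, h = images[0].size
grid = Image.new("RGB", (cols * w, rows * h))
for i, img in enumerate(images):
    grid.paste(img, ((i % cols) * w, (i // cols) * h))
grid.save("batch_grid.png")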

🔄 Image-to-Image

  • Upload sketches, photos, or concepts
  • Adjust transformation strength (0.1 = subtle, 1.0 = complete change)
  • Transform into any style or concept (see the example below)
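
The strength slider corresponds to the strength argument of a diffusers img2img pipeline; a self-contained sketch (paths are placeholders):

import torch
from diffusers import StableDiffusionXLImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

init = Image.open("sketch.png").convert("RGB").resize((1024, 1024))  # placeholder path
result = pipe(
    "a polished character design for a sci-fi video game protagonist",
    image=init,
    strength=0.5,  # 0.1 keeps the input mostly intact, 1.0 repaints it entirely
).images[0]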

🎯 Inpainting

  • Draw masks on areas to change
  • Describe what should fill masked areas
  • Remove objects or add new content seamlessly

🔗 Prompt Chaining

  • Create story sequences with multiple prompts
  • Use templates like "Character Journey" or "Environmental Progression"
  • Generate evolving narratives (one possible implementation is sketched below)
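
One plausible way to implement chaining is to feed each step's output back in as the next step's init image; a sketch under that assumption (the app's actual sequencing may differ, and the pipeline names refer to the earlier sketches):

# txt2img_pipe / img2img_pipe: pipelines loaded as in the earlier sketches
steps = [
    "a young adventurer in a quiet mountain village",   # introduction
    "the adventurer battles a storm on the trail",      # conflict
    "the adventurer stands triumphant on the summit",   # resolution
]

image = txt2img_pipe(steps[0]).images[0]  # first panel from text alone
panels = [image]
for prompt in steps[1:]:
    # Feed the previous panel in as the init image so the scene evolves
    # instead of resetting at every step.
    image = img2img_pipe(prompt, image=image, strength=0.6).images[0]
    panels.append(image)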

🎯 ControlNet

  • Upload reference images for precise control over composition
  • Choose from Canny edge detection, depth maps, pose estimation, or segmentation
  • Adjust control strength and guidance timing for fine-tuned results

🛠️ Installation

Requirements

  • Python 3.8+
  • CUDA-compatible GPU (recommended) or Apple Silicon (M1/M2/M3)
  • 8GB+ RAM (16GB+ recommended for AI mode)

Tested System Specifications:

  • Hardware: Apple M1 Max
  • OS: macOS Sequoia 15.5
  • Memory: 64 GB RAM
  • Storage: SSD with sufficient space for AI models (~10GB)

Setup

Automated Setup (Recommended)

# Clone repository
git clone https://github.com/arun-gupta/diffusion-lab.git
cd diffusion-lab

# Use automated setup scripts
./run_webapp.sh              # One-command setup and run
# OR
./cleanup_and_reinstall.sh   # Fresh install (removes old environment)

Manual Setup

# Clone repository
git clone https://github.com/arun-gupta/diffusion-lab.git
cd diffusion-lab

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run web application
python3 -m diffusionlab.api.webapp

Available Automation Scripts

  • run_webapp.sh: One-command setup and web app launch
  • cleanup_and_reinstall.sh: Complete fresh installation
  • setup.py: Cross-platform Python setup script
  • run.sh: Quick launcher with automatic setup detection

Alternative Interfaces

# Gradio Interface
python3 -m diffusionlab.tasks.storyboard

# Demo Version (no AI models)
python3 -m diffusionlab.tasks.demo

📁 Project Structure

diffusion-lab/
├── diffusionlab/
│   ├── api/webapp.py          # Flask web application
│   ├── tasks/                 # Generation tasks
│   ├── static/                # CSS, JS, generated images
│   ├── templates/             # HTML templates
│   └── config.py              # Configuration settings
├── docs/                      # Example images and documentation
├── tests/                     # Test scripts
└── requirements.txt           # Python dependencies

🔧 Troubleshooting

For detailed troubleshooting information, see the Troubleshooting Guide.

Quick Fixes

Static files not loading (404 errors)

# Run from project root
python3 -m diffusionlab.api.webapp

AI mode not available

  • Check all dependencies are installed
  • Ensure running from project root
  • Verify diffusionlab/tasks/ directory exists

Generate button not working

  • Check browser console for JavaScript errors
  • Ensure app.js loads without 404 errors

Getting Help

  • Check browser console (F12) for JavaScript errors
  • Watch Flask logs in terminal for server errors
  • Visit Issues for known problems
  • See Troubleshooting Guide for comprehensive solutions

🚧 Planned Features

  • ✅ Storyboard Generation: Create 5-panel storyboards with AI images and captions (Implemented)
  • ✅ Single-Image Art: Generate high-quality AI images from text prompts (Implemented)
  • ✅ Batch Generation: Create multiple variations of the same prompt for creative exploration (Implemented)
  • ✅ Image-to-Image: Transform sketches/photos into polished artwork (Implemented)
  • ✅ Inpainting: Remove objects or fill areas with AI-generated content (Implemented)
  • ✅ Prompt Chaining: Create evolving story sequences with multiple prompts (Implemented)
  • ✅ ControlNet: Precise control over composition, pose, and structure using reference images (Implemented)
  • 🔄 Outpainting: Extend image borders
  • 🔄 Style Transfer: Apply artistic styles
  • 🔄 Animated Diffusion: Frame interpolation
  • 🔄 Custom Training: DreamBooth integration

📄 License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.


💡 Tip: Start with Demo mode to test features quickly, then switch to AI mode for full-quality generation!