This Python application automates the process of generating blog posts from Stable Diffusion images using AI. It monitors a directory for new PNG images, uses a Language Model (LLM) to create a story and title based on the image, processes the image (resizing and adding a watermark), and then uploads it to a Ghost blog as a new post.
For more details, read the original blog post here: Adding AI to my AI Art Workflow with Ollama and a Multimodal LLM
Read the new updates here [adding Anthropic Claude to the workflow]: AI-Powered Blog Post Generator: Turning Pixels into Prose with Ollama and Claude
- Monitors a directory for new PNG images from Stable Diffusion
- Uses either a local LLM (llava-llama3 via Ollama) or a remote LLM (Claude via Anthropic API) to generate story content and titles
- Processes images: converts PNG to JPEG, resizes, and adds a watermark
- Uploads processed images and generated content to a Ghost blog
- Archives processed images and associated files
- Fallback mechanism: if the remote LLM fails, it falls back to the local LLM
- Transparency: includes information about which LLM model was used in the blog post
- Python 3.7+
- Ghost blog instance
- Ollama with llava-llama3 model installed (for local LLM)
- Anthropic API key (for Claude, if using remote LLM option)
-
Clone the repository:
git clone https://github.com/BitsofJeremy/sd_image_processing_and_upload.git cd sd_image_processing_and_upload
-
Create a virtual environment and activate it:
virtualenv -p python3 venv source venv/bin/activate # On Windows, use `venv\Scripts\activate`
-
Install the required packages:
pip install -r requirements.txt
-
Copy the
.env-example
file to.env
and fill in your configuration details:cp .env-example .env
-
Install Ollama and the llava-llama3 model:
# Install Ollama (if not already installed) curl https://ollama.ai/install.sh | sh # Pull the llava-llama3 model ollama pull llava-llama3
Edit the .env
file with your specific settings. Here's an example of what it should look like:
# Local Directory
INPUT_DIR='Where A1111 Saves files'
OUTPUT_DIR='Temp directory for working with images'
ARCHIVE_DIR='A place for your archived images to end up'
WATERMARK_PATH='Add a transparent PNG for a watermark'
# Sometimes you need a tag line for the end of your posts
TAGLINE="I Like Turtles"
# LLM Source [local or remote]
LLM_SOURCE='local'
# Anthropic Key and Model (for remote option)
ANTHROPIC_API_KEY="Your Anthropic Key"
ANTHROPIC_MODEL="claude-3-5-sonnet-20240620"
# Ollama Model to use for agent
OLLAMA_MODEL='llava-llama3'
# Your Ghost Blog
GHOST_BLOG_URL='https://example-blog.com'
GHOST_ADMIN_API_KEY='super secret'
GHOST_API_KEY='super secret'
Run the script with:
sh run_app.sh
The script will continuously monitor the input directory for new PNG files, process them, generate blog posts, and upload them to your Ghost blog.
The script logs its activities to script_log.txt
in the same directory as the script. You can monitor this file for information about the script's operations and any errors that occur.
There is a bonus script for removing ALL blog posts from your Ghost installation. Use with caution, as it was primarily used for testing:
python remove_posts.py
- Ollama
- llava-llama3: A LLaVA model fine-tuned from Llama 3 Instruct with better scores in several benchmarks
- Ollama Python Client
- Anthropic's Claude API (for remote LLM option)
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the GNU Affero General Public License v3.0 - see the LICENSE file for details.