ha-llmvision

Let Home Assistant see!



Image and video analyzer for Home Assistant using multimodal LLMs

🌟 Features · 📖 Resources · ⬇️ Installation · 🚧 Roadmap · 🪲 How to report Bugs




LLM Vision is a Home Assistant integration to analyze images, videos and camera feeds using the vision capabilities of multimodal LLMs.
Supported providers are OpenAI, Anthropic, Google Gemini, LocalAI, Ollama and any OpenAI compatible API.

Features

  • Compatible with OpenAI, Anthropic Claude, Google Gemini, LocalAI, Ollama and custom OpenAI compatible APIs
  • Takes images and video from camera entities as input
  • Takes local image and video files as input
  • Images can be downscaled for faster processing
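To illustrate how these inputs fit together, an automation could call the analyzer as an action roughly like the sketch below. The action name and field names (`llmvision.image_analyzer`, `image_entity`, `target_width`, and so on) are assumptions for illustration; check the docs linked below for the exact schema and required fields.

```yaml
# Hypothetical sketch: ask the configured provider to describe a camera snapshot.
# Action and field names are assumptions -- consult the LLM Vision docs for the real schema.
action: llmvision.image_analyzer
data:
  image_entity:
    - camera.front_door          # camera entity used as input
  message: "Describe what you see. Is anyone at the front door?"
  target_width: 1280             # downscale the image for faster processing (assumed field)
  max_tokens: 100
```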

Resources

Check the docs for detailed instructions on how to set up LLM Vision and each of the supported providers, get inspiration from examples or join the discussion on the Home Assistant Community.

Installation

Add this repository to the Home Assistant Community Store (HACS), then:

  1. Search for LLM Vision in Home Assistant Settings/Devices & services
  2. Select your provider
  3. Follow the instructions to add your AI provider

Detailed instructions on how to set up LLM Vision and each of the supported providers are available here: https://llm-vision.gitbook.io/getting-started/

Debugging

To enable debugging, add the following to your configuration.yaml:

logger:
  logs:
    custom_components.llmvision: debug

Roadmap

Note

These are planned features and ideas. They are subject to change and may not be implemented in the order listed or at all.

  1. New Provider: NVIDIA ChatRTX
  2. HACS: Include in HACS default
  3. Animation Support: Support for animated GIFs
  4. New Provider: Custom (OpenAI API compatible) Providers
  5. Feature: HTTPS support for LocalAI and Ollama
  6. Feature: Support for video files
  7. Feature: Analyze Frigate recordings using Frigate's event_id

How to report a bug or request a feature

Important

Bugs: If you encounter any bugs and have followed the instructions carefully, feel free to file a bug report.
Feature Requests: If you have an idea for a feature, create a feature request.



Support

You can support this project by starring this GitHub repository. If you want, you can also buy me a coffee.