Bring AI models closer to your PostgreSQL data

Primary language: PLpgSQL. License: PostgreSQL License.

pgai

pgai brings AI workflows to your PostgreSQL database

pgai simplifies the process of building search and Retrieval Augmented Generation (RAG) AI applications with PostgreSQL.

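pgai brings embedding and generation AI models closer to the database. With pgai, you can call these models directly from within PostgreSQL in a SQL query, for example to tokenize, embed, chat complete, moderate, classify, or rerank content, depending on the provider.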

Here's how to get started with pgai:

  • Everyone: Use pgai in your PostgreSQL database.
    1. Install pgai.
    2. Use pgai to integrate AI from your provider:
       • Ollama - configure pgai for Ollama, then use the model to embed, chat complete, and generate.
       • OpenAI - configure pgai for OpenAI, then use the model to tokenize, embed, chat complete, and moderate. This page also includes advanced examples.
       • Anthropic - configure pgai for Anthropic, then use the model to generate content.
       • Cohere - configure pgai for Cohere, then use the model to tokenize, embed, chat complete, classify, and rerank.
  • Extension contributor: Contribute to pgai and improve the project.

Learn more about pgai: To learn about the pgai extension and why we built it, read the blog post "pgai: Giving PostgreSQL Developers AI Engineering Superpowers".

Installation

The fastest way to run PostgreSQL with the pgai extension is to:

  1. Create your database environment. Either:
       • Use a pre-built Docker container.
       • Install from source.
       • Use a Timescale Cloud service.

  2. Enable the pgai extension.

  3. Use pgai.

Use a pre-built Docker container

Run the TimescaleDB Docker image.

Install from source

You can install pgai from source on an existing PostgreSQL server. Ensure you have Python 3 and pip installed system-wide. To check whether they are already installed, run:

python3 --version
pip --version
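
The check above can also be scripted. The following is a minimal sketch, not part of pgai: check_prereqs is a hypothetical helper name, and the tool list passed to it is an assumption based on the commands used in this section.

```shell
# Report whether each named tool is on PATH.
# Exits non-zero if at least one tool is missing.
check_prereqs() (
  rc=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      rc=1
    fi
  done
  exit "$rc"
)

# Example: check_prereqs python3 pip make
```

The function body runs in a subshell, so its variables do not leak into the calling shell.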

Additionally, you will need the plpython3u and pgvector extensions. To check whether the extensions are already available in your database, run the query:

select * from pg_available_extensions where name in ('vector', 'plpython3u');

If both extensions are available, the query returns one row per extension:

-[ RECORD 1 ]-------------------------
name              | plpython3u
default_version   | 1.0
installed_version | 1.0
comment           | PL/Python3U untrusted procedural language
-[ RECORD 2 ]-------------------------
name              | vector
default_version   | 0.7.2
installed_version | 0.7.2
comment           | vector data type and ivfflat and hnsw access methods

If both extensions are listed, create them in your database:

create extension plpython3u;
create extension vector;

If pgvector is not listed, follow the install instructions from the official repository.

If plpython3u is not listed, follow the "How to install Postgres 16 with plpython3u: Recipes for macOS, Ubuntu, Debian, CentOS, Docker" instructions from the postgres-ai repository.
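
To decide which of the two extensions still needs installing, the availability check can be turned into a small filter. This is a sketch under assumptions: missing_extensions is a hypothetical helper, and DATABASE_URL is a placeholder for your own connection string.

```shell
# Print each required extension whose name does not appear on stdin.
# Input: one extension name per line, for example the output of
#   psql -d "$DATABASE_URL" -At -c "select name from pg_available_extensions"
missing_extensions() (
  available=$(cat)
  for ext in "$@"; do
    # -x matches whole lines only, -F treats the name as a fixed string
    printf '%s\n' "$available" | grep -qxF -- "$ext" || printf '%s\n' "$ext"
  done
)

# Example (needs a running server):
#   psql -d "$DATABASE_URL" -At -c "select name from pg_available_extensions" \
#     | missing_extensions vector plpython3u
```

An empty result means both extensions are available and can be created directly.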

Note

For macOS users: unfortunately, the standard postgresql formula in Homebrew is missing the plpython3u extension. The instructions above suggest an alternative brew formula.

If you are installing PostgreSQL using the PostgreSQL plugin for the asdf version manager, set the --with-python option during installation:

POSTGRES_EXTRA_CONFIGURE_OPTIONS=--with-python asdf install postgres 16.3

After installing these prerequisites, run:

make install

Python virtual environment

The extension requires several Python packages. If you prefer working with Python virtual environments, set the PYTHONPATH and VIRTUAL_ENV environment variables when starting your PostgreSQL server.

PYTHONPATH=/path/to/venv/lib/python3.12/site-packages \
VIRTUAL_ENV=/path/to/venv \
pg_ctl -D /path/to/data -l logfile start
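
The two variables are related: PYTHONPATH points at the site-packages directory inside the directory named by VIRTUAL_ENV. A minimal sketch that derives one from the other follows. Both function names are hypothetical, and the lib/pythonX.Y/site-packages layout is an assumption that holds for virtual environments on Linux and macOS.

```shell
# Print the site-packages directory of a virtual environment.
# $1: path to the venv, $2: Python version as X.Y (for example 3.12)
venv_site_packages() (
  venv=${1%/}   # drop one trailing slash, if any
  printf '%s/lib/python%s/site-packages\n' "$venv" "$2"
)

# Start PostgreSQL with both variables set from a single venv path.
# $1: path to the venv, $2: Python version, $3: data directory
start_postgres_in_venv() (
  PYTHONPATH=$(venv_site_packages "$1" "$2") \
  VIRTUAL_ENV=${1%/} \
  pg_ctl -D "$3" -l logfile start
)

# Example: start_postgres_in_venv /path/to/venv 3.12 /path/to/data
```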

Use a Timescale Cloud service

Create a new Timescale Service.

If you want to use an existing service, pgai is added as an available extension on the first maintenance window after the pgai release date.

Enable the pgai extension in your database

  1. Connect to your database with a PostgreSQL client like psql v16 or PopSQL.

    psql -d "postgres://<username>:<password>@<host>:<port>/<database-name>"

  2. Create the pgai extension:

    CREATE EXTENSION IF NOT EXISTS ai CASCADE;

    The CASCADE option automatically installs the pgvector and plpython3u extensions.
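
To confirm that the extension and its dependencies are in place, you can query the pg_extension catalog. The sketch below wraps both steps in shell functions. The function names are hypothetical, and pg_uri does not URL-encode its arguments, so it assumes the username and password contain no reserved characters.

```shell
# Build a connection URI in the form shown in step 1.
# $1: username, $2: password, $3: host, $4: port, $5: database name
pg_uri() {
  printf 'postgres://%s:%s@%s:%s/%s\n' "$1" "$2" "$3" "$4" "$5"
}

# List the installed versions of pgai and its two dependencies.
# $1: connection URI. Needs psql and a running server.
show_pgai_extensions() {
  psql -d "$1" -At -c \
    "select extname, extversion from pg_extension
     where extname in ('ai', 'vector', 'plpython3u') order by extname"
}

# Example:
#   show_pgai_extensions "$(pg_uri myuser mypassword localhost 5432 mydb)"
```

If CREATE EXTENSION succeeded, the query should return one row for each of the three extensions.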

Use pgai

Now, use pgai to integrate AI from Ollama and OpenAI. Learn how to moderate and embed content directly in the database using triggers and background jobs.

Get involved

pgai is still at an early stage. Now is a great time to help shape the direction of this project; we are currently deciding priorities. Have a look at the list of features we're thinking of working on. Feel free to comment, expand the list, or hop on the Discussions forum.

To get started, take a look at how to contribute and how to set up a dev/test environment.

About Timescale

Timescale is a PostgreSQL database company. To learn more, visit timescale.com.

Timescale Cloud is a high-performance, developer-focused cloud platform that provides PostgreSQL services for the most demanding AI, time-series, analytics, and event workloads. Timescale Cloud is ideal for production applications and provides high availability, streaming backups, upgrades over time, roles and permissions, and great security.