Trasa is a simple demo of a Retrieval-Augmented Generation (RAG) system implemented in Elixir, using Postgres and Ecto. This project serves as a technical example and is not intended to be a production-ready library. The goal of Trasa is to demonstrate how to build and integrate a basic RAG system in Elixir.
Retrieval-Augmented Generation (RAG) is a hybrid knowledge retrieval approach that combines retrieval-based methods with generative models (LLMs) to produce more accurate and contextually relevant responses. RAG systems typically involve two main components:
- Retriever: This component retrieves relevant documents or pieces of information from a large dataset or database based on a given query. It ensures that the generative model has access to relevant context.
- Generator: The generative model uses the retrieved documents to generate a coherent and contextually accurate response. This model is typically a sequence-to-sequence transformer or a similar neural network.
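The two components above compose into a simple retrieve-then-generate pipeline, which can be sketched in Elixir as follows (function and module names here are illustrative, not Trasa's actual API):

```elixir
# Illustrative sketch of the RAG flow: retrieve relevant context first,
# then pass it alongside the query to the generator. Not Trasa's real API.
defmodule RagSketch do
  def answer(query, retriever, generator) do
    # 1. Retrieve the chunks most relevant to the query.
    context_chunks = retriever.(query)

    # 2. Hand the query plus the retrieved context to the generative model.
    generator.(query, context_chunks)
  end
end
```

In a real system the retriever would be a vector-similarity query and the generator an LLM call; here both are pluggable functions so the control flow stays visible.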
In Trasa, the retriever component leverages Postgres (with the pgvector extension) and Ecto to efficiently query and retrieve relevant information. The generator component can be integrated with any suitable Elixir-compatible generative model. This demo uses the OpenAI API for speed and simplicity, but it can easily be swapped for other APIs or for models running locally.
To use Trasa, clone the repository and set up the dependencies:
```
git clone https://github.com/jonklein/trasa.git
cd trasa
mix deps.get
```
Make sure you have the pgvector Postgres extension installed. On macOS, you can install it with `brew install pgvector`.

The default database is `trasa_dev`, with Postgres running locally. You can modify the configuration in `config.exs` if desired. Once configured, create and migrate the database:
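Assuming the repo module follows the standard Ecto convention of `Trasa.Repo` (an assumption worth checking against the project's actual `config.exs`), the database settings would look roughly like this:

```elixir
# config/config.exs — standard Ecto repo options; values shown are examples
config :trasa, Trasa.Repo,
  database: "trasa_dev",
  hostname: "localhost",
  username: "postgres",
  password: "postgres"
```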
```
mix ecto.create
mix ecto.migrate
```
Because Trasa uses the OpenAI API for the generation step, you'll need to set an API key in your environment:
```
OPENAI_API_KEY=xxxxxxx
```
If you wish to use a different LLM for the generation step, you can implement an alternative `Trasa.Rag.LLMProvider` behaviour and configure it via `config.exs`:
```elixir
config :trasa, llm: Trasa.CustomLLMProvider
```
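A custom provider might look like the following. Note this is a self-contained sketch: the callback name and signature are assumptions, so the stand-in behaviour is defined inline here — in the real project you would implement the actual callbacks declared by `Trasa.Rag.LLMProvider` instead:

```elixir
# Stand-in behaviour so this sketch compiles on its own; the real contract
# is whatever Trasa.Rag.LLMProvider declares.
defmodule LLMProviderSketch do
  @callback generate(prompt :: String.t()) :: {:ok, String.t()} | {:error, term()}
end

defmodule CustomLLMProviderSketch do
  @behaviour LLMProviderSketch

  @impl true
  def generate(prompt) do
    # Call out to your locally hosted model here; this stub just echoes.
    {:ok, "local model answer for: " <> prompt}
  end
end
```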
This repository contains a set of demo documents for trying out the system. To ensure that the demo answers are retrieved from the dataset context rather than recalled from the LLM's training data, the dataset consists of three short stories generated by ChatGPT:
- a story about a boy getting lost in the forest
- a sci-fi story about a space explorer landing on a mysterious planet
- a detective story about an unusual household mystery
Once you've set up the environment, you can load the demo document data into `Trasa.Rag.Document`:
```
iex -S mix
iex> documents = Trasa.DemoData.load
```
Next, you'll need to index the documents to create the embeddings. Each document is split into chunks (1024 characters by default), and an embedding is computed and stored for each chunk. This step can take a minute or so to complete:
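The chunking step described above amounts to fixed-size splitting, roughly like the sketch below (a simplification — Trasa's actual indexing also computes and stores an embedding for each chunk, and its chunking may differ in detail):

```elixir
# Simplified sketch of fixed-size chunking (1024 characters by default).
# Splitting on graphemes avoids breaking multi-byte characters mid-codepoint.
defmodule ChunkSketch do
  def chunk(text, size \\ 1024) do
    text
    |> String.graphemes()
    |> Enum.chunk_every(size)
    |> Enum.map(&Enum.join/1)
  end
end
```

Each resulting chunk would then be sent to the embedding model, and the vector stored in Postgres via pgvector for similarity search at query time.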
```
iex> context = Trasa.Rag.Context.new
iex> Trasa.Rag.Indexing.index_all(context)
```
Finally, with the embeddings created, you can ask questions of the data:
```
iex> Trasa.Rag.query(context, "who did the boy meet in the forest?")
"In the forest, the little boy named Arttu met a small, wise-looking owl named Onni. Onni offered to help Arttu find his way home and also gathered other animal friends to assist in the journey, including Vesa the rabbit, Väinö the tortoise, Lumi the deer, Kettu the fox, and other friendly animals along the way."
```