/rag

Repo for Retrieval Augmented Generation samples, guidance, experimentation, and more

Retrieval Augmented Generation samples and guidance

Maintainer

Purpose

This repo serves as the center of gravity for resources I’ve created and/or collected that demonstrate, experiment with, guide use of, or expound upon current best practices for Retrieval Augmented Generation.

The samples directory in particular consists of foundational proofs of concept: actual code implementations of varying complexities, but starting from the very simple of course. Currently this consists of the following sample projects using Java and Spring Boot:

  • azureopenai_azureaisearch/rag-demo - Azure OpenAI for chat and embeddings, Azure AI Search as vector store

  • openai_chroma/rag-demo - OpenAI for chat and embeddings, Chroma as vector store

  • llama_chroma/rag-demo - Llama 3 model running locally in ollama, Chroma as vector store

Azure OpenAI and Azure AI Search sample specifics

Before running the Azure version, you’ll need to create the appropriate resources in Azure, of course. Please refer to this repo for sample scripts that can be used to create those resources.

Important
Before creating and running resources on any cloud platform, check estimated costs for the resources you’re about to create and run those resources only as long as you need to do so; this will maximize your results while minimizing your costs. For Azure, please use the Azure Pricing Calculator to determine your needs and anticipated costs.

OpenAI and Chroma sample specifics

Before running the OAI+Chroma version, you’ll need to have Chroma running locally. One way to do so is just to run it via local Docker like so:

docker run -d --name chroma -p 8000:8000 chromadb/chroma

Llama and Chroma sample specifics

Before running the Llama+Chroma version, you’ll need to have Chroma running locally. One way to do so is just to run it via local Docker like so:

docker run -d --name chroma -p 8000:8000 chromadb/chroma

I ran the Llama 3 model using Ollama on my Mac, but it too can be run using Docker, of course. To run the model using ollama, here is the command I used:

ollama run llama3

Note
In this sample’s application.properties file, I’ve commented out the line to initialize the Chroma schema. This line needs to be uncommented the first time the application is run on a fresh database, otherwise a runtime error will result. For subsequent executions, you will likely want to comment it out again or remove it, but it does serve as a useful reminder. :)

Essential information for all versions

Before running any of the versions above, you’ll also need to provide certain expected values via variables, in one form or another, to your Spring AI-based application per the Spring Boot docs reference on Externalized Configuration. Please refer to the Spring AI documentation for the variables applicable to your choice of models/implementations.

TODO: I’ll be polishing and publishing additional scripts soon. For now, please refer to those above (for Azure OpenAI) and the references immediately above.

In conclusion

Please star+watch to be notified of updates, as there will be many.