Prepare your Data for AI Using Aiven for OpenSearch and LangChain

This workshop takes unprepared data and makes it usable in a Retrieval-Augmented Generation (RAG) pattern for a chatbot.

In this workshop, we'll be using Aiven for OpenSearch and LangChain to:

Chunk transcription data and generate embeddings
Configure our OpenSearch index for k-nearest neighbors (k-NN) and perform a similarity search
Connect our search responses to a large language model (LLM) to generate informed answers using LangChain
Compare the performance of multiple LLMs
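
As a rough illustration of the first two steps above, here is a minimal pure-Python sketch. It is not the workshop's notebook code: the helper names (chunk_text, knn_index_body, top_k), the word-based chunk sizes, and the field names "text" and "embedding" are all assumptions made for this example. The index body uses the OpenSearch k-NN plugin's "index.knn" setting and "knn_vector" field type. In the workshop itself, LangChain and OpenSearch handle chunking, embedding, and search; the cosine ranking below only shows the idea behind a similarity search.

```python
import math


def chunk_text(text, chunk_size=50, overlap=10):
    """Split text into word-based chunks that overlap by `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


def knn_index_body(dimension):
    """Build an index body with k-NN enabled (field names are hypothetical)."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "text": {"type": "text"},
                "embedding": {"type": "knn_vector", "dimension": dimension},
            }
        },
    }


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, doc_vecs, k=2):
    """Return the indices of the k vectors most similar to query_vec."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]
```

The overlap keeps a sentence that straddles a chunk boundary present in both neighboring chunks, which tends to help retrieval on transcripts. The value passed to knn_index_body must match the output dimension of whichever embedding model you choose.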

Getting Started

Our instructions and notebooks are in the workshop folder.

This workshop is licensed under the Apache License, Version 2.0. The full license text is available in the LICENSE file.

Please note that the project explicitly does not require a CLA (Contributor License Agreement) from its contributors.

Bug reports and patches are very welcome. Please post them as GitHub issues and pull requests at https://github.com/Aiven-Labs/preparing-data-for-opensearch-and-rag

To report any possible vulnerabilities or other serious issues, please see our security policy.

Please report Code of Conduct issues according to our policy.