Prepare your Data for AI Using Aiven for OpenSearch and LangChain

This workshop takes unprepared data and makes it usable with a Retrieval-Augmented Generation (RAG) pattern for a chatbot.

Learn more about regularly occurring Aiven workshops

Click to Get Started

In this workshop, we'll be using Aiven for OpenSearch and LangChain to:

  • Chunk transcription data and generate embeddings
  • Configure our OpenSearch index for k-Nearest Neighbors (k-NN) and perform a similarity search
  • Connect our search responses to a Large Language Model (LLM) to generate informed answers using LangChain
  • Compare the performance of multiple LLMs
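
The first three steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code from the workshop notebooks (which use LangChain): the helper names, the `text` and `embedding` field names, the word-based chunk sizes, and the prompt wording are all hypothetical. The request bodies follow the general shape OpenSearch expects for a `knn_vector` field and a `knn` query.

```python
from __future__ import annotations

from typing import Any


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping windows of words (hypothetical helper)."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def knn_index_body(dimension: int, field: str = "embedding") -> dict[str, Any]:
    """Index settings and mappings that enable k-NN on a vector field.

    The field names are assumptions; `dimension` must match the
    output size of whichever embedding model is used.
    """
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "text": {"type": "text"},
                field: {"type": "knn_vector", "dimension": dimension},
            }
        },
    }


def knn_query_body(vector: list[float], k: int = 3, field: str = "embedding") -> dict[str, Any]:
    """Search body asking for the k stored vectors nearest to `vector`."""
    return {"size": k, "query": {"knn": {field: {"vector": vector, "k": k}}}}


def build_prompt(question: str, passages: list[str]) -> str:
    """Combine retrieved passages and the question into one LLM prompt."""
    context = "\n\n".join(passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

With a client such as opensearch-py, the two bodies would typically be passed to `client.indices.create(index=..., body=...)` and `client.search(index=..., body=...)`. The text of the returned hits can then be fed to `build_prompt` before calling the LLM.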

Getting Started

Our instructions and notebooks are in the workshop folder.

License

This workshop is licensed under the Apache License, version 2.0. The full license text is available in the LICENSE file.

Please note that the project explicitly does not require a CLA (Contributor License Agreement) from its contributors.

Conduit Podcast Transcripts by Jay Miller and Kathy Campbell, with original downloads from Whisper work done by Pilix, are licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International.

Contact

Bug reports and patches are very welcome. Please post them as GitHub issues and pull requests at https://github.com/Aiven-Labs/preparing-data-for-opensearch-and-rag

To report any possible vulnerabilities or other serious issues, please see our security policy.

Report Code of Conduct issues according to our policy.