This is a Retrieval-Augmented Generation (RAG) implementation built on an open-source stack. The app uses BioMistral 7B as the language model, PubMedBERT as the embedding model, Qdrant as a self-hosted vector database, and LangChain and Llama.cpp as the orchestration and inference frameworks.
- Build a Medical Retrieval-Augmented Generation (RAG) application using a set of technologies tailored for the medical domain.
- BioMistral 7B, a large language model designed specifically for medical applications, which generates the answers to medical queries.
- Qdrant, a self-hosted vector database that we run inside a Docker container. It stores and retrieves the high-dimensional vectors produced by our embedding model.
- PubMedBERT embeddings, an embedding model built specifically for the medical domain, which I use to encode medical texts and queries. This helps the application capture the nuances of medical literature and retrieve more precise and relevant passages.
- A crucial component of our setup is Llama.cpp, a library that enables inference of large language models on CPU machines. Running a quantized version of the model this way keeps deployment efficient and cost-effective without requiring a GPU.
- For orchestrating our application components, I use LangChain, an orchestration framework that connects the embedding model, the vector database, and the language model into a single pipeline.
- On the backend, I leverage FastAPI, a modern, fast (high-performance) web framework for building APIs with Python 3.7+. FastAPI provides the speed and ease of use needed to create a responsive and efficient backend for our medical RAG application.
- Finally, for the web UI, I employ Bootstrap 5.3, the latest version of the world’s most popular front-end open-source toolkit. This enables us to create a sleek, intuitive, and mobile-responsive user interface that makes our medical RAG application accessible and easy to use.
- We set up the environment to integrate these technologies into a cohesive and functional medical RAG application.
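
The way the components above fit together can be sketched in plain Python. This is a minimal illustration of the retrieve-then-prompt flow, not the app's actual code: `embed` is a hypothetical bag-of-words stand-in for PubMedBERT, `InMemoryIndex` is a hypothetical stand-in for a Qdrant collection, `build_prompt` and the prompt wording are assumptions, and the sample texts are illustrative only. In the real app, the resulting prompt would be passed to BioMistral 7B through Llama.cpp.

```python
import math
from collections import Counter


def embed(text):
    """Hypothetical stand-in for PubMedBERT: a bag-of-words count vector."""
    tokens = (t.strip(".,;:?!()").lower() for t in text.split())
    return Counter(t for t in tokens if t)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryIndex:
    """Hypothetical stand-in for a Qdrant collection."""

    def __init__(self):
        self.items = []  # list of (vector, text) pairs

    def add(self, text):
        # Embed the text once at indexing time and keep it with its vector.
        self.items.append((embed(text), text))

    def search(self, query, k=2):
        # Rank stored texts by similarity to the query and keep the top k.
        query_vec = embed(query)
        ranked = sorted(
            self.items,
            key=lambda item: cosine(query_vec, item[0]),
            reverse=True,
        )
        return [text for _, text in ranked[:k]]


def build_prompt(question, passages):
    """Assemble the retrieved passages and the question into one prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Use only the context below to answer.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Illustrative sample texts, not taken from the app's document set.
index = InMemoryIndex()
for doc in [
    "Metformin is a first-line drug for type 2 diabetes.",
    "Aspirin inhibits platelet aggregation.",
    "Amoxicillin is a penicillin-class antibiotic.",
]:
    index.add(doc)

question = "Which drug is first-line for type 2 diabetes?"
passages = index.search(question, k=1)
prompt = build_prompt(question, passages)
print(prompt)
```

Swapping the stand-ins for the real components should leave the flow unchanged: embed the documents, store the vectors, embed the query, fetch the nearest passages, and hand the assembled prompt to the model.
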
Distributed under the MIT License. See LICENSE for more information.