Welcome to my exciting journey in the world of Language Models and Dentistry! This repository is part of my quest to find the perfect LLM that knows everything about dentistry, a crucial component of the Periospot AI project.
This project focuses on evaluating various Large Language Models (LLMs), specifically different implementations of LLaMA 3.1 70B, to determine their effectiveness in understanding and generating dental knowledge. It's a fun and intriguing exploration into the intersection of artificial intelligence and dentistry.
This repository contains:
rag_evaluation_test.ipynb
: A Jupyter notebook for testing Retrieval-Augmented Generation (RAG) using OpenAI and Groq APIs.run_evals.ipynb
: A notebook for running evaluations on different LLaMA 3.1 70B implementations across various providers.- Custom evaluation metrics and prompts tailored for dental scenarios.
To dive into this dental AI adventure:
- Clone this repository
- Install the required dependencies:
pip install -r requirements.txt
- Set up environment variables:
Copy the
.env.example
file to a new file named.env
:Then, open thecp .env.example .env
.env
file and add your API keys:- Add your Weights & Biases API key to
WANDB_API_KEY
- Add your OpenRouter API key to
OPENROUTER_API_KEY
- Add your Groq API key to
GROQ_API_KEY
- Add your Together API key to
TOGETHER_API_KEY
- Add your OpenAI API key to
OPENAI_API_KEY
- Add your Weights & Biases API key to
- Run the Jupyter notebooks to see the evaluations in action
Dentistry is a complex field with vast amounts of specialized knowledge. By leveraging the power of LLMs, we aim to create an AI assistant that can:
- Assist dental professionals in diagnosis and treatment planning
- Provide instant access to up-to-date dental research
- Help patients understand complex dental procedures
Our journey involves:
- Evaluating LLaMA 3.1 70B implementations from various providers:
- OctoAI
- Novita
- DeepInfra
- Fireworks
- Groq
- Together
- Creating dental-specific prompts and test cases
- Analyzing model performance on dental terminology and concepts
- Comparing the performance of different LLaMA 3.1 70B implementations
Stay tuned for exciting discoveries and dental AI breakthroughs!
(You can add a brief summary of your findings here, or link to a more detailed results page)
- Weights & Biases for experiment tracking and visualization
- Python for scripting and data processing
- Jupyter Notebooks for interactive development
- OpenAI API for embeddings and baseline comparisons
- OpenRouter API for accessing various LLaMA 3.1 70B implementations
- Groq API for fast inference
- Together API for additional LLaMA 3.1 70B implementation
A huge thank you to the dental community for their expertise and to the AI researchers pushing the boundaries of what's possible with language models. Special thanks to the teams behind LLaMA, OpenAI, Groq, Together, and all the providers offering LLaMA 3.1 70B implementations.
This project is open source and available under the MIT License.
If you're as excited about the intersection of AI and dentistry as I am, let's connect!
- Name: Francisco Teixeira Barbosa
- Email: cisco@periospot.com
- Personal Portfolio: https://franciscodds.framer.ai/
- GitHub: https://github.com/Tuminha
- Twitter/X: @Cisco_research
Let's revolutionize dentistry with AI! ๐ฆท๐ค