medplexity: A Jupyter Notebook repository from MaksymPetyak

Medplexity

Medplexity explorer • Frontend GitHub repository • Substack

Medplexity is a Python library for evaluating LLMs in medical applications.

It is designed to help with the following tasks:

Evaluating the performance of LLMs on existing medical datasets and benchmarks, such as MedQA and PubMedQA.
Comparing performance of different prompts, models, and architectures.
Exporting results of evaluation for visualisation and further analysis.

The goal is to help answer questions like "How much better would GPT-4 perform given a vector database to load certain resources?".
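To make the tasks above concrete, here is a minimal sketch of the general workflow: score a model on multiple-choice questions, compare two prompt templates, and export the results as JSON. It deliberately does not use Medplexity's API. Every name in it (`Question`, `build_prompt`, `accuracy`, `stub_model`, the templates, and the two toy questions) is a hypothetical stand-in chosen for illustration; see the documentation and the "Getting Started" notebook for the library's actual interface.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class Question:
    """Hypothetical multiple-choice item; not a Medplexity type."""
    text: str
    options: list[str]
    answer: int  # index of the correct option


def build_prompt(template: str, q: Question) -> str:
    # Label options A, B, C, ... and fill them into the template.
    lettered = "\n".join(f"{chr(65 + i)}. {opt}" for i, opt in enumerate(q.options))
    return template.format(question=q.text, options=lettered)


def accuracy(model: Callable[[str], str], template: str, questions: list[Question]) -> float:
    # Fraction of questions where the first letter of the reply matches the answer.
    correct = 0
    for q in questions:
        reply = model(build_prompt(template, q)).strip().upper()
        predicted = ord(reply[0]) - 65 if reply else -1
        correct += predicted == q.answer
    return correct / len(questions)


def stub_model(prompt: str) -> str:
    # Placeholder standing in for a real LLM call; it always answers "A".
    return "A"


# Toy questions for illustration only, not drawn from any benchmark.
QUESTIONS = [
    Question("Which vitamin deficiency causes scurvy?", ["Vitamin C", "Vitamin D", "Vitamin K"], 0),
    Question("Which organ produces insulin?", ["Liver", "Pancreas", "Kidney"], 1),
]

# Two hypothetical prompt variants to compare.
TEMPLATES = {
    "plain": "{question}\n{options}\nAnswer with a single letter.",
    "expert": "You are a medical expert.\n{question}\n{options}\nAnswer with a single letter.",
}

# Evaluate each prompt variant and serialise the scores for later analysis.
results = {name: accuracy(stub_model, t, QUESTIONS) for name, t in TEMPLATES.items()}
report = json.dumps(results, indent=2)
print(report)
```

Swapping `stub_model` for a function that calls a real model, and the toy questions for a loaded benchmark, gives the kind of side-by-side comparison the library is meant to support.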

🔧 Quick install

pip install medplexity

📖 Documentation

Documentation can be found here.

Example

See our "Getting Started" notebook for a full example with the MedMCQA dataset.

Contributions

Contributions are welcome! Check out the todos below, and feel free to open a pull request. Remember to install pre-commit to be compliant with our standards:

pre-commit install

Feel free to raise any questions on Discord.

Explorer

In addition to the library, we are building a web app for exploring evaluation results. The explorer is available at medplexityai.com. It is also open source; see the frontend repository.

📜 License

Medplexity is licensed under the MIT License. See the LICENSE file for more details.