
Backend library for conversational AI in biomedicine

Primary language: Python. License: MIT.

BioChatter

Project status: Active. The project has reached a stable, usable state and is being actively developed. Contributions: PRs welcome. Code of conduct: Contributor Covenant.

Description

Generative AI models have shown tremendous usefulness in increasing accessibility and automation of a wide range of tasks. Yet, their application to the biomedical domain is still limited, in part due to the lack of a common framework for deploying, testing, and evaluating the diverse models and auxiliary technologies that are needed. This repository contains the biochatter Python package, a generic backend library for the connection of biomedical applications to conversational AI.

The library is described in this preprint and showcased in various demo applications.

BioChatter is part of the BioCypher ecosystem, connecting natively to BioCypher knowledge graphs. The BioChatter paper is being written here.

Installation

To use the package, install it from PyPI, for instance using pip (pip install biochatter) or Poetry (poetry add biochatter).
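To confirm that the installation succeeded, you can query the installed distribution's metadata. The following is a minimal sketch using only the Python standard library; the helper name installed_version is hypothetical and not part of biochatter.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent.

    Hypothetical helper for illustration; it is not part of biochatter.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("biochatter")
    if found is None:
        print("biochatter is not installed; try: pip install biochatter")
    else:
        print(f"biochatter {found} is installed")
```

Reading the metadata avoids importing the package itself, so the check still works when an optional dependency is missing.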

Extras

The package has some optional dependencies that can be installed using the following extras (e.g. pip install biochatter[xinference]):

  • xinference: support for querying open-source LLMs through Xorbits Inference

  • podcast: support for podcast text-to-speech (for the free Google TTS; the paid OpenAI TTS can be used without this extra)

  • streamlit: support for streamlit UI functions (used in ChatGSE)
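Several extras can be combined in one requirement by separating them with commas inside the brackets. In some shells (zsh, for example) the brackets are interpreted as a glob pattern, so the requirement may need quoting. The sketch below builds a quoted pip command for a chosen set of extras; the helper names are hypothetical, and the list of known extras is simply copied from the list above.

```python
import shlex
import sys
from typing import Iterable, List

# Extras named in the list above; anything else is rejected.
KNOWN_EXTRAS = ("xinference", "podcast", "streamlit")


def requirement_with_extras(extras: Iterable[str]) -> str:
    """Build a requirement string such as 'biochatter[podcast,xinference]'.

    Hypothetical helper for illustration; it is not part of biochatter.
    """
    chosen = sorted(set(extras))
    unknown = [e for e in chosen if e not in KNOWN_EXTRAS]
    if unknown:
        raise ValueError(f"unknown extras: {unknown}")
    if not chosen:
        return "biochatter"
    return f"biochatter[{','.join(chosen)}]"


def pip_install_command(extras: Iterable[str]) -> List[str]:
    """Return the argument list for installing with the current interpreter's pip."""
    return [sys.executable, "-m", "pip", "install", requirement_with_extras(extras)]


if __name__ == "__main__":
    # shlex.join quotes the bracketed requirement so it is safe to paste into a shell.
    print(shlex.join(pip_install_command(["xinference", "podcast"])))
```

Passing the argument list directly to subprocess.run avoids shell quoting altogether; the printed form is only for copying into a terminal.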

Usage

Check out the documentation for examples, use cases, and more information. Many common functionalities covered by BioChatter can be seen in use in the ChatGSE code base.

More information about LLMs

Check out this repository for more info on computational biology usage of large language models.

Troubleshooting

If you're on Apple Silicon, you may encounter issues with the grpcio dependency (the gRPC library, which is used by pymilvus). If so, remove the installed package from your virtual environment and reinstall grpcio built from source:

pip uninstall grpcio
export GRPC_PYTHON_LDFLAGS=" -framework CoreFoundation"
pip install grpcio==1.53.0 --no-binary :all:
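Before rebuilding, it may help to confirm that the interpreter is actually running natively on Apple Silicon and to see which grpcio version is currently installed. The following is a minimal sketch using only the standard library; the function names are hypothetical and not part of biochatter.

```python
import platform
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def is_apple_silicon(system: Optional[str] = None,
                     machine: Optional[str] = None) -> bool:
    """Return True when running natively on macOS with an ARM64 CPU.

    The arguments default to the current platform and exist only to make
    the function easy to test. Note that a Python interpreter running
    under Rosetta reports 'x86_64' and is therefore not detected.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "Darwin" and machine == "arm64"


def grpcio_version() -> Optional[str]:
    """Return the installed grpcio version, or None if grpcio is absent."""
    try:
        return version("grpcio")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(f"Apple Silicon (native): {is_apple_silicon()}")
    print(f"Installed grpcio version: {grpcio_version()}")
```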