Gene set interpretation and more, assisted by large language models. Find the deployed app at https://chat.biocypher.org. The project is still very much a work in progress, but we are committed to open source and very open to comments, criticisms, and contributions! Read the preprint here!
This repository contains only the frontend code of our Streamlit app. The code base used for communication with the LLMs, vector databases, and other components of our project is developed at https://github.com/biocypher/biochatter. Check there if you have your own UI and are looking for a way to connect it to the world of LLMs!
To stay up to date with the project, please star the repository and watch the
Zulip community chat (free to join) at https://biocypher.zulipchat.com.
ChatGSE-related discussion happens in the #chatgse stream.
We are very happy about contributions from the community, large and small! If you would like to contribute to the platform's development, please refer to our contribution guidelines. :)
Importantly, you don't need to be an expert on any of the technical aspects of the project! As long as you are interested and would like to help make this platform a great open-source tool, you're good. 🙂
Imposter syndrome disclaimer: We want your help. No, really. There may be a little voice inside your head that is telling you that you're not ready, that you aren't skilled enough to contribute. We assure you that the little voice in your head is wrong. Most importantly, there are many valuable ways to contribute besides writing code.
This disclaimer was adapted from the Pooch project.
You can discuss your favourite prompt setups and share the corresponding JSON files in the discussion here! You can go here to find inspiration for things the model can do, such as creating formatted markdown output to create mindmaps or other visualisations.
You can use the document summarisation feature to upload documents and use similarity search to inject context into your prompts. This feature is currently only available on local builds of ChatGSE (see below) and requires a connection to a vector database (currently only Milvus is supported). We follow these instructions to run Milvus in Docker on your machine (using the standard ports). We provide a Docker Compose setup that starts the Milvus containers and the ChatGSE container together:
git clone https://github.com/biocypher/ChatGSE.git
cd ChatGSE
docker compose up -d
This command creates three containers for Milvus and one for ChatGSE. After a short startup time, you can access the ChatGSE app at http://localhost:8501.
To deploy/develop the app locally, we recommend using VS Code with the included
devcontainer setup. This requires Docker and the Remote Containers extension.
After cloning the repository, open the folder in VS Code and click the
Reopen in Container button that appears in the bottom right corner (or use the
command palette to find the command). This will build a Docker image of the app
and open it in VS Code. You can then run the app by adding a configuration
similar to this one to your launch.json:
{
    "name": "Streamlit",
    "type": "python",
    "request": "launch",
    "program": "/usr/local/bin/streamlit",
    "console": "integratedTerminal",
    "justMyCode": true,
    "cwd": "${workspaceFolder}",
    "args": [
        "run",
        "app.py"
    ]
}
Note that if you want to use the document summarisation feature or other
connected services, you will still need to start these separately. For the
vector DB component of the docker-compose.yml file, you can do it like so:
docker compose up -d standalone
Once the other Docker containers are running, they should be discoverable from
within the devcontainer. If you add your own containers, make sure that they
use the same network as your devcontainer (e.g. milvus).
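To confirm that the vector database is actually discoverable from inside the devcontainer, a plain TCP check is usually enough. The following is a minimal sketch, not part of ChatGSE: the helper name `is_reachable` is hypothetical, the host name `standalone` assumes that the Compose service name resolves on the shared network, and port 19530 assumes Milvus is using its default port.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host name and opens a TCP socket;
        # DNS failures, refused connections, and timeouts all raise OSError.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example call (assumed host and port, adjust to your setup):
#   is_reachable("standalone", 19530)
```

If the call returns False while the container is running, the likely cause is that the devcontainer and the Milvus container are not attached to the same Docker network.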
Using Docker, run the following commands to deploy a local browser app (without the additional containers for the vector database):
git clone https://github.com/biocypher/ChatGSE.git
cd ChatGSE
docker build -t chatgse .
docker run -p 8501:8501 chatgse
Note that the community key feature is not available locally, so you need to provide your own API key (either in the app or as an environment variable).
Instead of manually entering the key, you can provide it to the Docker run
command as an environment variable. With a text file (e.g. local.env) that
contains the keys:
OPENAI_API_KEY=sk-...
...
you can run the following command:
docker run --env-file local.env -p 8501:8501 chatgse
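Because a missing or misspelled key only shows up once the app is running, it can help to check the env file before starting the container. The sketch below is an illustration under stated assumptions, not part of ChatGSE: the helper names `read_env_file` and `missing_keys` are hypothetical, and the parser only handles simple `KEY=value` lines, blank lines, and `#` comment lines.

```python
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    env = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # lines without "=" are ignored in this sketch
            env[key.strip()] = value
    return env


def missing_keys(path, required=("OPENAI_API_KEY",)):
    """Return the required keys that are absent or empty in the env file."""
    env = read_env_file(path)
    return [key for key in required if not env.get(key)]
```

For example, `missing_keys("local.env")` returns an empty list when `OPENAI_API_KEY` is set to a non-empty value, and `["OPENAI_API_KEY"]` otherwise.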
Local installation can be performed using Poetry (or other package managers
that can work with a pyproject.toml file):
git clone https://github.com/biocypher/ChatGSE.git
cd ChatGSE
poetry install
For Apple Silicon machines, this must be followed by the following commands
(inside the activated environment using poetry shell):
pip uninstall grpcio
mamba install grpcio # alternatively, conda
This step is necessary due to incompatibilities in the standard ARM grpcio package. Currently, only conda-forge provides a compatible version. To avoid this issue, you can work in a devcontainer (see above).
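If you are unsure whether the grpcio step applies to your environment, you can ask the Python interpreter what platform it reports. This is a small sketch with a hypothetical helper name (`needs_conda_grpcio`); it assumes that an ARM-native Python on macOS reports the system as `Darwin` and the machine as `arm64`. An x86 Python running under Rosetta reports `x86_64` instead and is treated as not affected here.

```python
import platform


def needs_conda_grpcio(system=None, machine=None):
    """Return True when the interpreter reports macOS on ARM (Apple Silicon)."""
    # Fall back to the values reported by the running interpreter.
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "Darwin" and machine == "arm64"
```

Calling `needs_conda_grpcio()` inside the activated Poetry environment indicates whether the reinstall above is likely relevant.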