Cohere Toolkit

Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications.

Try Toolkit
About Toolkit
Deploy Toolkit
Develop and troubleshoot
Component Guides
What's on our roadmap
How to contribute
Try Coral Showcase

Quick start

Try the default Toolkit application yourself by deploying it in a container locally. You will need to have Docker and Docker-compose >= 2.22 installed.

docker run -e COHERE_API_KEY='>>YOUR_API_KEY<<' -p 8000:8000 -p 4000:4000 ghcr.io/cohere-ai/cohere-toolkit:latest

Go to localhost:4000 in your browser and start chatting with the model. This will use the model hosted on Cohere's platform. If you want to add your own tools or use another model, follow the instructions below to fork the repository.

Building and running locally

Clone the repo and run

make first-run

Follow the instructions to configure the model: either AWS SageMaker, Azure, or Cohere's platform. You can also do this by running make setup (see Option 2 below), which generates a .env file for you, or by manually creating a .env file, copying in the contents of the provided .env-template, and replacing the values with the correct ones.

Environment variables

Cohere Platform

COHERE_API_KEY: If your application will interface with Cohere's API, you will need to supply an API key. Not required if using AWS Sagemaker or Azure. Sign up at https://dashboard.cohere.com/ to create an API key.
NEXT_PUBLIC_API_HOSTNAME: The backend URL which the frontend will communicate with. Defaults to http://localhost:8000
DATABASE_URL: Your PostgreSQL database connection string for SQLAlchemy. It should follow the format postgresql+psycopg2://USER:PASSWORD@HOST:PORT.
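
A connection string in the format above can be assembled with the Python standard library. This is a minimal sketch and not part of the Toolkit: the helper name build_database_url and the sample credentials are hypothetical, and the credentials are percent-encoded on the assumption that they may contain reserved characters such as @ or /.

```python
from urllib.parse import quote


def build_database_url(user: str, password: str, host: str, port: int) -> str:
    """Assemble a SQLAlchemy URL in the USER:PASSWORD@HOST:PORT format.

    Hypothetical helper for illustration; it is not part of the Toolkit.
    """
    # Percent-encode the credentials so reserved characters such as '@' or '/'
    # do not break the URL structure.
    safe_user = quote(user, safe="")
    safe_password = quote(password, safe="")
    return f"postgresql+psycopg2://{safe_user}:{safe_password}@{host}:{port}"


if __name__ == "__main__":
    # Sample values only; substitute your own.
    print(build_database_url("postgres", "p@ss/word", "localhost", 5432))
```

The resulting string can be pasted into the DATABASE_URL entry of your .env file.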

AWS Sagemaker

To use the Toolkit with AWS SageMaker, you first need the Cohere model (a Command version) that powers chat deployed in SageMaker. Follow Cohere's guide and notebooks to deploy a Command model and create an endpoint, which can then be used with the Toolkit.

Then set up authorization; see more details here. The default Toolkit setup uses the configuration file (after running aws configure sso) with the following environment variables:

SAGE_MAKER_REGION_NAME: The region you configured for the model.
SAGE_MAKER_ENDPOINT_NAME: The name of the endpoint which you created in the notebook.
SAGE_MAKER_PROFILE_NAME: Your AWS profile name
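
Before starting the containers, it can help to confirm that none of these settings are blank. The following is a minimal sketch, assuming the variables have been exported into the current process environment (it does not read the .env file); the helper name missing_vars is hypothetical and not part of the Toolkit.

```python
import os
from typing import Iterable, List, Mapping

# Variable names taken from the list above.
SAGEMAKER_VARS = (
    "SAGE_MAKER_REGION_NAME",
    "SAGE_MAKER_ENDPOINT_NAME",
    "SAGE_MAKER_PROFILE_NAME",
)


def missing_vars(env: Mapping[str, str], required: Iterable[str]) -> List[str]:
    """Return the required variable names that are unset or blank in env."""
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_vars(os.environ, SAGEMAKER_VARS)
    if missing:
        print("Missing SageMaker settings:", ", ".join(missing))
    else:
        print("All SageMaker settings are present.")
```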

Hosted tools

PYTHON_INTERPRETER_URL: URL to the python interpreter container. Defaults to http://localhost:8080.
TAVILY_API_KEY: If you want to enable internet search, you will need to supply a Tavily API Key. Not required.

Deploy locally

Once your environment variables are set, you're ready to deploy the Toolkit locally! Pull the Docker images from the GitHub Artifact Registry or build from source. See the Makefile for all available commands.

Requirements:

Option 1 - Install locally with Docker:

Ensure your shell is authenticated with GHCR.

Pull the Single Container Image from GitHub's Artifact Registry:

docker pull ghcr.io/cohere-ai/cohere-toolkit:latest

Run the images locally:

docker run --name=cohere-toolkit -itd -e COHERE_API_KEY='Your Cohere API key here' -p 8000:8000 -p 4000:4000 ghcr.io/cohere-ai/cohere-toolkit

Option 2 - Build locally from scratch:

Option 2.1 - Run everything at once

Run make first-run to start the CLI, which will generate a .env file for you. This will also run all the DB migrations and start the containers.

make first-run

Option 2.2 - Run each command separately

Run make setup to start the CLI, which will generate a .env file for you:

make setup

Then run:

make migrate
make dev

If you did not change the default port, visit http://localhost:4000/ in your browser to chat with the model.
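
If the page does not load, a quick TCP check shows whether the containers are listening at all. This is a minimal sketch using only the standard library; the helper name is_port_open is hypothetical, and the ports are the ones published in the docker run command above (4000 for the frontend, 8000 for the backend).

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Adjust the ports if you changed the defaults.
    for name, port in (("frontend", 4000), ("backend", 8000)):
        state = "reachable" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

A successful connection only shows that something is listening on the port; it does not confirm that the application started correctly.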

What is included in Toolkit?

Components in this repo include:

src/interfaces/coral_web - A web app built in Next.js. Includes a simple SQL database out of the box to store conversation history in the app.
src/backend - Contains preconfigured data sources and retrieval code to set up RAG on custom data sources (called "Retrieval Chains"). Users can also configure which model to use, selecting from Cohere's models hosted on Cohere's platform, Azure, or AWS SageMaker. By default, we have configured a Langchain data retriever to test RAG on Wikipedia and your own uploaded documents.

Deployment Guides

Looking to serve your application in production? Deploy the Toolkit to your preferred cloud provider by following our guides below:

Other deployment options

Single Container Setup: Useful as a quickstart to run the Toolkit, or to deploy to AWS on an EC2 instance.
AWS ECS Fargate Deployment: Deploy the Toolkit single container to AWS ECS (Fargate).
AWS ECS EC2 Deployment: Deploy the Toolkit single container to AWS ECS (EC2).
Google Cloud Platform: Helps you set up your Cloud SQL instance, then build, push, and deploy the backend and frontend containers to Cloud Run.

Deploying to Azure

You can deploy Toolkit with one click to Microsoft Azure Platform:

Setup for Development

Setting up Poetry

Poetry is used when configuring and adding new retrieval chains.

Install your dependencies:

poetry install

Run linters:

poetry run black .
poetry run isort .

Setting up Your Local Database

The docker-compose file should spin up a local db container with a PostgreSQL server. The first time you set up this project, and whenever new migrations are added, you will need to run:

make migrate

This will apply all existing database migrations and ensure your DB schema is up to date.

If you ever run into issues with Alembic, such as migrations being out of sync, and your DB does not contain any data you'd like to preserve, you can run:

make reset-db
make migrate
make dev

This will delete the existing db container volumes, restart the containers and reapply all migrations.

Testing the Toolkit

Run:

make dev

This spins up the test_db service for you. Afterwards, you can run:

make run-tests

Making Database Model Changes

When making changes to any of the database models, such as adding new tables, modifying or removing columns, you will need to create a new Alembic migration. You can use the following Make command:

make migration

Important: If adding a new table, make sure to add the import to the model/__init__.py file! This will allow Alembic to import the models and generate migrations accordingly.

This should generate a migration in the Docker container and copy it to your local /alembic folder. Check that the new migration file was created.

Then you can migrate the changes to the PostgreSQL Docker instance using:

make migrate

Troubleshooting

Community features are not accessible

Make sure you add USE_COMMUNITY_FEATURES=True to your .env file.

Multiple errors after running make dev for the first time

Make sure you run the following command before running make dev:

make migrate

Error: pg_config executable not found.

Make sure that all requirements including postgres are properly installed.

If you're using macOS, run:

brew install postgresql

For other operating systems, you can check the postgres documentation.

Debugging locally

To debug any of the backend logic while the Docker containers are running, you can run:

make dev

This will run the Docker containers with reloading enabled, then in a separate shell window, run:

make attach

This will attach an interactive shell to the running backend. Now, when your backend code hits any

import pdb; pdb.set_trace()

it will allow you to debug.

Component Guides

How to use community features

By default, the toolkit runs without community tools or deployments. If you want to enable them, add the following to the .env file or use make setup to set this variable:

USE_COMMUNITY_FEATURES=True
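
If you prefer to script the change rather than edit the file by hand, the following is a minimal sketch of setting a KEY=VALUE entry in a .env file. The helper name set_env_var is hypothetical and not part of the Toolkit, and the sketch assumes a simple file with one uncommented assignment per line.

```python
from pathlib import Path


def set_env_var(env_path: Path, key: str, value: str) -> None:
    """Set KEY=VALUE in a .env file, replacing an existing entry if present."""
    lines = env_path.read_text().splitlines() if env_path.exists() else []
    entry = f"{key}={value}"
    for i, line in enumerate(lines):
        # Match only uncommented assignments to this exact key.
        if line.split("=", 1)[0].strip() == key:
            lines[i] = entry
            break
    else:
        # No existing entry was found, so append a new one.
        lines.append(entry)
    env_path.write_text("\n".join(lines) + "\n")
```

For example, set_env_var(Path(".env"), "USE_COMMUNITY_FEATURES", "True") enables the flag in the .env file of the current directory.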

How to add your own model deployment

A model deployment is a running version of one of the Cohere Command models. The Toolkit currently supports the following model deployments:

Cohere Platform (model_deployments/cohere_platform.py)
- This model deployment option calls the Cohere Platform with the Cohere Python SDK. You will need a Cohere API key. When you create an account with Cohere, we automatically create a trial API key for you. You can find it here.
Azure (model_deployments/azure.py)
- This model deployment calls into your Azure deployment. To get an Azure deployment, follow these steps. Once you have a model deployed, you will need to get the endpoint URL and API key from Azure AI Studio: https://ai.azure.com/build/ -> Project -> Deployments -> Click your deployment -> You will see your URL and API Key. Note that to use the Cohere SDK you need to add /v1 to the end of the URL.
SageMaker (model_deployments/sagemaker.py)
- This deployment option calls into your SageMaker deployment. To create a SageMaker endpoint follow the steps here, alternatively follow a command notebook here. Note your region and endpoint name when executing the notebook as these will be needed in the environment variables.
To add your own deployment:
1. Create a deployment file, add it to the /community/model_deployments folder, and implement the function calls from BaseDeployment, similar to the other deployments.
2. Add the deployment to src/community/config/deployments.py
3. Add the environment variables required to the env template.
To add a Cohere private deployment, follow the steps above: copy the Cohere Platform implementation, change the base_url to point to your private deployment, and add any custom auth steps.

How to call the backend as an API

It is possible to run only the backend service and call it in the same manner as the Cohere API. Note that, unlike the Cohere API, streaming and non-streaming requests go to separate endpoints: 'http://localhost:8000/chat-stream' and 'http://localhost:8000/chat' respectively. For example, to stream:

curl --location 'http://localhost:8000/chat-stream' \
--header 'User-Id: me' \
--header 'Content-Type: application/json' \
--data '{
    "message": "Tell me about the aya model"
}
'
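
The same request can be made from Python with only the standard library. This is a minimal sketch mirroring the curl example above; the helper names build_chat_request and print_response_lines are hypothetical, and because the format of the streamed events is not described here, the response lines are printed as-is rather than parsed.

```python
import json
import urllib.request

BACKEND_URL = "http://localhost:8000"  # default backend address


def build_chat_request(
    message: str, user_id: str = "me", stream: bool = True
) -> urllib.request.Request:
    """Build a POST request mirroring the curl example above."""
    endpoint = "/chat-stream" if stream else "/chat"
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        BACKEND_URL + endpoint,
        data=body,
        headers={"User-Id": user_id, "Content-Type": "application/json"},
        method="POST",
    )


def print_response_lines(request: urllib.request.Request) -> None:
    """Send the request and print the body line by line as it arrives."""
    with urllib.request.urlopen(request) as response:
        for raw_line in response:
            # The event format is not assumed here; lines are printed as-is.
            print(raw_line.decode("utf-8", errors="replace").rstrip())


# Usage, with the backend running:
# print_response_lines(build_chat_request("Tell me about the aya model"))
```

Passing stream=False targets the /chat endpoint instead.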

How to add your own chat interface

Currently the core chat interface is the Coral frontend. To add your own interface, call the backend as an API from your implementation (see the steps above) and add it in src/community/interfaces/.

How to add a connector to the Toolkit

If you have already created a connector, it can be used in the Toolkit with ConnectorRetriever. Add your configuration, then add the definition in community/config/tools.py, similar to the Arxiv implementation, with the category Category.DataLoader. You can now use the Coral frontend and API with the connector.

How to set up web search with the Toolkit

To use Coral with web search, simply use the Tavily_Internet_Search tool by adding your API key to the env file. Alternatively, you can use any search provider of your choosing, either with your own implementation or an integration implementation (such as LangChain), by following the steps below.

How to set up PDF Upload with the Toolkit

To use Coral with document upload, simply use the File_Upload_LlamaIndex or File_Upload_Langchain tool (this needs a Cohere API key in the .env file). Alternatively, you can use any document uploader of your choosing, either with your own implementation or an integration implementation (such as LangChain), by following the steps below.

How to create your own tools and retrieval sources

Toolkit includes some sample tools that you can copy to configure your own data sources:

File loaders - Parse a PDF file and perform RAG. Enable users to upload a PDF in the Toolkit UI. Users can choose either Langchain or LlamaIndex, whichever is preferred. Langchain is used by default.
Data loaders - This tool queries a data source and then performs RAG on extracted documents. We used Langchain's Wikiretriever as the sample data source.
Functions - Python interpreter and calculator tools.

To create your own tools or add custom data sources, see our guide: tools and retrieval sources overview

Experimental Features

Please note that these are experimental features.

Langchain Multihop

Chatting with multihop tool usage through Langchain is enabled by setting the experimental feature flag to True in .env.

USE_EXPERIMENTAL_LANGCHAIN=True

When this flag is set to True, only tools that have a Langchain implementation can be used. These exist under LANGCHAIN_TOOLS and require a to_lanchain_tool() function on the tool implementation, which returns a Langchain-compatible tool. The Python interpreter and Tavily Internet search tools are provided in the Toolkit by default once the environment is set up.

Example API call:

curl --location 'http://localhost:8000/langchain-chat' \
--header 'User-Id: me' \
--header 'Content-Type: application/json' \
--data '{
    "message": "Tell me about the aya model",
    "tools": [{"name": "Python_Interpreter"},{"name": "Internet Search"}]
}'
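
When calling this endpoint from code, the request body can be generated rather than written by hand. This is a minimal sketch that only builds the JSON body with the same shape as the example above; the helper name langchain_chat_body is hypothetical, and the tool names are the ones used in that example.

```python
import json
from typing import List


def langchain_chat_body(message: str, tool_names: List[str]) -> str:
    """Serialize a request body with the same shape as the curl example above."""
    payload = {
        "message": message,
        "tools": [{"name": name} for name in tool_names],
    }
    # json.dumps always emits strict JSON, which avoids hand-editing mistakes
    # such as stray commas.
    return json.dumps(payload)


if __name__ == "__main__":
    print(
        langchain_chat_body(
            "Tell me about the aya model",
            ["Python_Interpreter", "Internet Search"],
        )
    )
```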

Currently, citations are not supported in Langchain multihop.

Roadmap

Set env variables in UI
Include citations for multi hop tools
Display images for python interpreter tool
Add a slack bot as an available interface
White labelling: Changing fonts, logos, and colours.
User management and authentication system: Toolkit is currently configured with one user role and no authentication.

Contributing

Contributions are what drive an open source community, and any contributions made are greatly appreciated. To get started, check out our documentation.
