lang-memgpt

A bot with memory, built on LangGraph Cloud.


Lang-MemGPT

This repo provides a simple example of a memory service you can build and deploy using LangGraph.

Inspired by papers like MemGPT and distilled from our own works on long-term memory, the graph extracts memories from chat interactions and persists them to a database. This information can later be read or queried semantically to provide personalized context when your bot is responding to a particular user.

Process

The memory graph deduplicates memory processing across conversation threads, and it supports both continuous updates to a single "memory schema" and "event-based" memories that can be queried semantically.
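The two memory types can be illustrated with a toy sketch (this is not the repo's implementation; in particular, real deployments embed the event text, while word overlap stands in for semantic similarity here):

```python
# 1. A single "memory schema": one document per user, patched continuously.
def update_schema(profile: dict, patch: dict) -> dict:
    """Merge newly extracted facts into the user's profile."""
    merged = dict(profile)
    for key, value in patch.items():
        if isinstance(value, list):
            merged[key] = sorted(set(merged.get(key, []) + value))
        else:
            merged[key] = value
    return merged

# 2. "Event-based" memories: append-only records queried semantically.
events: list[str] = []

def add_event(text: str) -> None:
    events.append(text)

def query_events(query: str, k: int = 2) -> list[str]:
    """Return the k events most similar to the query (toy word overlap)."""
    def overlap(event: str) -> int:
        return len(set(event.lower().split()) & set(query.lower().split()))
    return sorted(events, key=overlap, reverse=True)[:k]

profile = update_schema({"name": None, "interests": []},
                        {"name": "Ada", "interests": ["chess"]})
add_event("User mentioned they are planning a trip to Kyoto")
add_event("User asked about vegan restaurants")
print(query_events("trip to Kyoto", k=1))
```

The schema gives the bot a stable, always-current profile; the event log preserves individual moments that can be recalled by relevance.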

Memory Diagram

Project Structure

├── langgraph.json # LangGraph Cloud Configuration
├── lang_memgpt
│   ├── __init__.py
│   └── graph.py # Define the agent w/ memory
├── poetry.lock
├── pyproject.toml # Project dependencies
└── tests # Add testing + evaluation logic
    └── evals
        └── test_memories.py

Quickstart

This quickstart will get your agent with long-term memory deployed on LangGraph Cloud. Once created, you can interact with it via its API.

Prerequisites

This example defaults to using Pinecone for its memory database and nomic-ai/nomic-embed-text-v1.5 as the text encoder (hosted on Fireworks). For the LLM, we will use accounts/fireworks/models/firefunction-v2, a fine-tuned variant of Meta's Llama 3.

Before starting, make sure your resources are created.

  1. Create a Pinecone index with a dimension size of 768. Note down your Pinecone API key, index name, and namespace for the next step.
  2. Create an API Key to use for the LLM & embeddings models served on Fireworks.
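Step 1 can also be done from the Pinecone Python client (v3+ serverless API). The index name, cloud, and region below are placeholders; the network call only runs when PINECONE_API_KEY is set:

```python
import os

def index_spec(name: str) -> dict:
    """Arguments for Pinecone's create_index; the dimension must match
    the 768-dim vectors produced by nomic-embed-text-v1.5."""
    return {"name": name, "dimension": 768, "metric": "cosine"}

spec = index_spec("lang-memgpt-index")  # becomes your PINECONE_INDEX_NAME

if os.environ.get("PINECONE_API_KEY"):  # only touch the network with a key
    from pinecone import Pinecone, ServerlessSpec
    pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
    pc.create_index(spec=ServerlessSpec(cloud="aws", region="us-east-1"), **spec)
```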

Deploy to LangGraph Cloud

Note: (Closed Beta) LangGraph Cloud is a managed service for deploying and hosting LangGraph applications. It is currently (as of 26 June, 2024) in closed beta. If you are interested in applying for access, please fill out this form.

To deploy this example on LangGraph Cloud, fork the repo.

Next, navigate to the 🚀 deployments tab on LangSmith.

If you have not deployed to LangGraph Cloud before, you will see a button labeled Import from GitHub. Follow that flow to connect LangGraph Cloud to your GitHub account.

Once you have set up your GitHub connection, select +New Deployment. Fill out the required information, including:

  1. Your GitHub username (or organization) and the name of the repo you just forked.
  2. You can leave the defaults for the config file (langgraph.json) and branch (main).
  3. Environment variables (see below)

The default required environment variables can be found in .env.example and are copied below:

# .env
PINECONE_API_KEY=...
PINECONE_INDEX_NAME=...
PINECONE_NAMESPACE=...
FIREWORKS_API_KEY=...

# You can add other keys as appropriate, depending on
# the services you are using.

You can fill these out locally, copy the .env file contents, and paste them into the environment variables section of the form, starting with the first Name field.
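As a sanity check, a small (hypothetical) parser shows how lines in the .env file map to Name/Value pairs, with comments and blank lines ignored:

```python
def parse_env(text: str) -> dict:
    """Parse .env-style text into a {name: value} dict."""
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip comments and blank lines
        name, _, value = line.partition("=")
        pairs[name.strip()] = value.strip()
    return pairs

sample = """
# .env
PINECONE_API_KEY=pk-placeholder
PINECONE_INDEX_NAME=lang-memgpt
"""
print(parse_env(sample))
```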

Assuming you've followed the steps above, in just a couple of minutes, you should have a working memory service deployed!

Now let's try it out.
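One way to talk to the deployment is with the langgraph_sdk client. The graph name "memgpt" below is a placeholder; substitute your deployment's URL and graph ID. The network call only runs when LANGGRAPH_API_URL is set:

```python
import asyncio
import os

async def chat(url: str, message: str) -> None:
    """Create a thread and stream one run against the deployment."""
    from langgraph_sdk import get_client

    client = get_client(url=url)
    thread = await client.threads.create()
    async for chunk in client.runs.stream(
        thread["thread_id"],
        "memgpt",  # placeholder graph/assistant id
        input={"messages": [{"role": "user", "content": message}]},
    ):
        print(chunk.event, chunk.data)

if os.environ.get("LANGGRAPH_API_URL"):
    asyncio.run(chat(os.environ["LANGGRAPH_API_URL"], "Hi! I love hiking."))
```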

Part 2: Setting up a Slack Bot

The LangGraph Cloud deployment exposes a general-purpose stateful agent via an API. You can connect to it from a notebook, UI, or even a Slack or Discord bot.

In this repo, we've included an event_server to listen for Slack message events so you can talk with your bot from Slack.

The server is a simple FastAPI app that uses Slack Bolt to interact with Slack's API.

In the next step, we will show how to deploy this on GCP's Cloud Run.
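The shape of such a server can be sketched as follows, assuming fastapi and slack_bolt are installed; the tokens, route, and handler names are placeholders, not the repo's actual code:

```python
def format_reply(agent_text: str) -> str:
    """Hypothetical formatting of the agent's answer for Slack."""
    return agent_text.strip() or "(no response)"

try:
    from fastapi import FastAPI, Request
    from slack_bolt import App
    from slack_bolt.adapter.fastapi import SlackRequestHandler

    bolt = App(
        token="xoxb-placeholder",          # SLACK_BOT_TOKEN in production
        signing_secret="placeholder",      # SLACK_SIGNING_SECRET
        token_verification_enabled=False,  # skip auth.test for this sketch
    )

    @bolt.event("message")
    def on_message(event, say):
        # The real server forwards the text to the LangGraph deployment
        # and replies with the agent's answer; this sketch just echoes.
        say(format_reply(event.get("text", "")))

    api = FastAPI()
    handler = SlackRequestHandler(bolt)

    @api.post("/slack/events")
    async def slack_events(req: Request):
        return await handler.handle(req)
except ImportError:
    pass  # fastapi / slack_bolt not installed; the sketch still shows the shape
```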

How to deploy as a Discord bot

Now that you've deployed the API, how do you turn it into an app?

Check out the event server README for instructions on how to set up a Discord connector on Cloud Run.

How to evaluate

Memory management can be challenging to get right. To make sure your schemas suit your application's needs, we recommend starting from an evaluation set and adding to it over time as you find and address common errors in your service.

We have provided a few example evaluation cases in the test file here. As you can see, the metrics themselves don't have to be terribly complicated, especially not at the outset.
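A metric in this spirit can be as simple as a containment check (a hedged stand-in for the kind of assertion in tests/evals/test_memories.py, not the repo's actual code):

```python
def contains_fact(memories: list, expected: str) -> bool:
    """Naive check that some stored memory mentions the expected fact.
    Real evals might use embedding similarity or an LLM judge instead."""
    return any(expected.lower() in m.lower() for m in memories)

memories = ["User's dog is named Charlie", "User lives in Berlin"]
print(contains_fact(memories, "charlie"))  # the fact was retained
print(contains_fact(memories, "cat"))      # this fact was not
```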

We use LangSmith's @test decorator to sync all the evaluations to LangSmith so you can better optimize your system and identify the root cause of any issues that may arise.