letta-ai/letta

🚀 MemGPT Q1 2024 Developer Roadmap

cpacker opened this issue

See our Q2 2024 roadmap!


Q1 2024 Roadmap

🚀 Link to GitHub project board tracking all of the roadmap items
👋 Looking for smaller things to work on? Check the community contributions or bug tracker project boards
✍️ Leave comments or message us on Discord to suggest changes to the roadmap


Developer API [v1 end of February]

Link to API documentation

MemGPT API

  • Support for authentication and user identification (generating keys / tokens)
  • Deprecate passing user_id to user-specific API calls
  • #1005
  • Production-ready stable API
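As a rough illustration of the key/token item above, a common pattern is to hand the user a random token once and persist only its hash server-side. This is a minimal sketch under that assumption; the function names and key format are hypothetical, not MemGPT's actual implementation:

```python
# Hypothetical sketch of API-key generation and verification; the
# "sk-" prefix and function names are assumptions, not MemGPT's API.
import hashlib
import secrets

def generate_api_key() -> tuple[str, str]:
    """Return (plaintext_key, stored_hash); only the hash is persisted."""
    key = "sk-" + secrets.token_urlsafe(32)
    return key, hashlib.sha256(key.encode()).hexdigest()

def verify_api_key(key: str, stored_hash: str) -> bool:
    """Hash the presented key and compare in constant time."""
    candidate = hashlib.sha256(key.encode()).hexdigest()
    return secrets.compare_digest(candidate, stored_hash)
```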

OpenAI Assistants API

  • Allow MemGPT to serve as a drop-in replacement for developers using the OpenAI Assistants API (api.openai.com/v1/assistants)
  • Tracking: #892

Python Client

  • Create a Python RESTClient for users to interact with a memgpt server REST endpoint
  • Create an Admin client for service administrators to manage keys and users
  • Tracking: #1042
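To make the client item concrete, a REST client of this kind typically wraps a base URL plus bearer-token headers. A minimal sketch, assuming that shape (the class and method names here are hypothetical, not the planned client's interface):

```python
# Hypothetical sketch of a Python REST client for a MemGPT server;
# the class name and URL/header conventions are assumptions.
class RESTClient:
    def __init__(self, base_url: str, token: str):
        self.base_url = base_url.rstrip("/")
        self.token = token

    def _headers(self) -> dict:
        # Bearer-token auth is an assumption about the server's scheme
        return {
            "Authorization": f"Bearer {self.token}",
            "Content-Type": "application/json",
        }

    def _url(self, path: str) -> str:
        return f"{self.base_url}/{path.lstrip('/')}"
```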

Chat UI [v1 end of March]

  • Support a full-fledged chat UI that uses the MemGPT API
    • Chatting with agents, viewing + editing agent’s memories, attaching documents to agents, …
  • Distribute MemGPT as a one-click install packaged executable that includes the ChatUI + MemGPT server
    • Bundle pymemgpt (including necessary DBs) with React frontend inside of Electron / Tauri

Hosted service [v1 end of March]

  • Release a hosted MemGPT server so that developers can directly interact with the API
    • Allows developers to use the MemGPT API without any self-hosting (just API keys)
  • Release a hosted chat UI app (with guest + login modes) to allow easy use / experimentation with MemGPT via the chat UI only
    • Accounts are shared with the hosted API server (allows interacting with the same agents via both the hosted API and the hosted chat UI)

Miscellaneous features (beyond Q1)

👥 Split thread agent

  • Support alternate “split-thread” MemGPT agent architecture
  • SplitThreadAgent-v0.1 runs two “prompt” / “context” threads
    • DialogueThread that generates conversations and calls utility functions (e.g. run_google_search(...) or call_smart_home(...))
    • MemoryThread that is a passive reader of the ongoing conversation, and is responsible for memory edits, insertion, and search
      • core_memory_replace, core_memory_append
      • archival_memory_search, archival_memory_insert
      • conversation_search, conversation_search_date
      • Question: should these be usable by the DialogueThread too?
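The split-thread design above can be sketched in miniature: a DialogueThread that appends to a shared message log, and a MemoryThread that passively reads it and queues memory operations. Everything below is illustrative; the reply stub and the memory heuristic stand in for real LLM calls:

```python
# Illustrative sketch of the split-thread architecture: the thread
# names come from the roadmap, but all logic here is a stand-in.
class DialogueThread:
    """Generates conversation turns (an LLM call in the real design)."""
    def __init__(self, log: list):
        self.log = log

    def reply(self, user_message: str) -> str:
        self.log.append({"role": "user", "content": user_message})
        response = f"(reply to: {user_message})"  # stand-in for LLM output
        self.log.append({"role": "assistant", "content": response})
        return response

class MemoryThread:
    """Passive reader of the log; responsible for memory edits."""
    def __init__(self, log: list):
        self.log = log
        self.cursor = 0          # how far into the log we have read
        self.archival: list[str] = []

    def step(self) -> None:
        # Trivial heuristic for the sketch: archive every user message
        for msg in self.log[self.cursor:]:
            if msg["role"] == "user":
                self.archival.append(msg["content"])
        self.cursor = len(self.log)
```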

⚡ Streaming (token-level) support

  • Allow streaming back of POST requests (to MemGPT server)
  • In MemGPT function calling setup, this likely means:
    • Stream back inner thoughts first
    • Then stream back function call
    • Then attempt to parse function call (validate if final full streamed response was OK)
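The three-step ordering above (inner thoughts, then function call, then parse) can be sketched as a consumer that accumulates chunks and only validates the function-call JSON once the stream ends. The chunk format here is an assumption for illustration:

```python
# Hypothetical sketch of the streaming order described above; the
# {"type": ..., "text": ...} chunk format is an assumption.
import json

def consume_stream(chunks):
    thoughts, call_buf = [], []
    for chunk in chunks:
        if chunk["type"] == "inner_thoughts":
            thoughts.append(chunk["text"])   # could be displayed live
        elif chunk["type"] == "function_call":
            call_buf.append(chunk["text"])   # accumulate raw JSON text
    # only after the stream ends can the full call be validated
    call = json.loads("".join(call_buf))
    return "".join(thoughts), call
```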

🦙 Specialized MemGPT models

  • Release (on HuggingFace) and serve (via the free endpoint) models that have been fine-tuned specifically for MemGPT
    • “MemGPT-LM-8x7B-v0.1” (e.g. Mixtral 8x7B fine-tuned on MemGPT data w/ DPO)
    • Goal is to bridge the gap between open models and GPT-4 for MemGPT performance

👁️ Multi-modal support

  • Start with gpt-4-vision support first to work out the necessary refactors required
    • Will require modifications to the current data_types stored in the database
  • Work backwards to LLaVA

👾 Make MemGPT a better coding assistant

  • Coding requires some coding-specific optimizations
    • Better handling of generated code blocks that contain parsing errors
    • Add specific grammars / model wrappers for coding
  • Add support for block-level code execution
    • CodeInterpreter style
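Block-level execution in the CodeInterpreter style reduces to running a code block in a scratch namespace and capturing its output. A minimal sketch; a real tool would need sandboxing and resource limits, which this deliberately omits:

```python
# Illustrative sketch of block-level code execution: run a code
# block and capture stdout. NOTE: exec() without a sandbox is
# unsafe; a real implementation needs isolation and limits.
import contextlib
import io

def execute_block(code: str) -> str:
    buf = io.StringIO()
    namespace: dict = {}  # scratch namespace, discarded after the run
    with contextlib.redirect_stdout(buf):
        exec(code, namespace)
    return buf.getvalue()
```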

📄 Make MemGPT a better document research assistant

  • Add more complex out-of-the-box archival_memory_search replacements
    • e.g. using LlamaIndex RAG pipelines

🔧 Better default functions

  • E.g. better out-of-the-box internet search

⏱️ Asynchronous tool use support

  • Support non-blocking tool use (#1062)
    • E.g. image generation that takes ~10s+ should not block the main conversation thread
    • Implement by returning a placeholder "TBD" tool response immediately, then injecting the full response later
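The placeholder pattern above can be sketched with a background thread: append the "TBD" message immediately, then patch in the real result when the worker finishes. The message structure and function names are assumptions:

```python
# Hypothetical sketch of non-blocking tool use: return a "TBD"
# placeholder right away and inject the full response when the
# slow tool finishes. Message fields are assumptions.
import threading

def call_tool_async(messages: list, slow_tool, *args):
    placeholder = {"role": "tool", "content": "TBD"}
    messages.append(placeholder)  # conversation can continue meanwhile

    def worker():
        # mutate the placeholder in place once the result is ready
        placeholder["content"] = slow_tool(*args)

    t = threading.Thread(target=worker)
    t.start()
    return t  # caller may join() or keep chatting
```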

🧠 Better memory systems

  • core_memory_v2, archival_memory_v2, etc.
    • e.g. core_memory_v2
      • add more structure (key/value-style insertions only)
    • e.g. archival_memory_v2
      • add more metadata tagging at insertion time
        • type: [memory, feeling, observation, reflection, …]
      • add an asynchronous “memory consolidation” loop
        • every N minutes (or once per day), a task runner starts that tries to consolidate all the archival memories
      • add more structure in the storage
        • not just a vector database
        • knowledge graph?
        • hierarchical storage?