cpacker/MemGPT

🚀 MemGPT Q2 2024 Developer Roadmap

cpacker opened this issue · 1 comments

Q2 2024 Roadmap

🚀 Link to GitHub project board tracking all of the roadmap items
👋 Looking for smaller things to work on? Check the community contributions or bug tracker project boards
✍️ Leave comments or message us on Discord to suggest changes to the roadmap


More MemGPT LLM backends / API supported [early April]

  • Groq
  • Claude
  • Cohere
  • Gemini
  • Together
  • Mistral
  • Add unit tests for all the officially supported API providers (with reference configs contained in main) to avoid regression
    • OpenAI, Anthropic, Groq, Together, Mistral, Gemini
  • Consolidate inference backend options into a single config style:
    • i.e.: openai-chat-completions (= OpenAI, Azure, APIs with function calling), openai-completions (= vLLM, lmstudio, ollama), ...

MemGPT developer examples

  • Native multi-agent interaction (via multiple MemGPT agents running on a MemGPT server process)
    • Example "agent orchestrator" meta-agent where one agent controls send_message calls to other agents
      • in this example, all the agents share a groupchat state similar to AutoGen
    • Example free-chat where each agent can freely broadcast send_message calls to other agents
      • need to handle "race conditions" where agent is busy (the calling agent should receive an informative message reply)
      • in this example, the only shared state between agents is via communication on send_message calls
  • Discord / Slack chatbot examples (connecting send_message to external APIs)
    • Example Discord + Slack + Twilio send_message tool + message listen hook (needs a dedicated section in dev portal)
  • GitHub / Discord support chatbot examples (e.g. Dosubot)
    • Example read_issue, list_issues + comment-posted-to-API-call hooks

MemGPT server

  • One-click deployment container [early April]
  • Instructions / tutorial for deploying MemGPT server to Azure/GCP/AWS/... [mid April]

MemGPT API

  • Token streaming support [mid-April]
  • Production-ready stable API [mid-April]

OpenAI Assistants API

  • Continued improvements to OpenAPI Assistants API support: #892

MemGPT Client / SDK

  • Javascript/Typescript client

Developer portal / Chat UI

  • Alpha release [early April]
  • Addition of missing features for beta release in MemGPT v0.4 [early April]
    • Preset creation / editing
    • Custom function / tool creation (via the UI)
    • Message (user+system+assistant) editing + rerunning / regenerating messages
    • Custom prompt formatting
  • Cron job scheduling inside of dev portal
    • Make it easy to schedule automated jobs that hit the MemGPT server
    • Cron-style custom functions: MemGPT can schedule one off or recurring messages to itself, ie ‘Scheduled Inner Monologue’

Hosted service

  • Hosted service [end of April]
    • Release hosted MemGPT server so that developers can directly interact with the API
  • Allows developers to use the MemGPT API without requiring any self-hosting (just API keys)
    • Release hosted chat UI app (with guest + login modes) to allow easy use / experimentation with MemGPT via chat UI only
    • Accounts are shared with the hosted API server (allows interacting with the same agents via hosted API + hosted chat UI)

⚡ Streaming (token-level) support

  • Add streaming support for CLI interface with OpenAI-compatible endpoints
  • Allow streaming back of POST requests (to MemGPT API / server)
    • #1280
    • In MemGPT function calling setup, this likely means:
      • Stream back inner thoughts first
      • Then stream back function call
      • Then attempt to parse function call (validate if final full streamed response was OK)

Miscellaneous features (Q2+)

👥 Split thread agent

  • Support alternate “split-thread” MemGPT agent architecture
  • SplitThreadAgent-v0.1 runs two “prompt” / “context” threads
    • DialogueThread that generates conversations and calls utility functions (e.g. run_google_search(...) or call_smart_home(...))
    • MemoryThread that is a passive reader of the ongoing conversation, and is responsible for memory edits, insertion, and search
      • core_memory_replace , core_memory_append
      • archival_memory_search, archival_memory_insert
      • conversation_search, conversation_search_date
      • Question: should these be usable by the DialogueThread too?

🦙 Specialized MemGPT models

  • Release (on HuggingFace) and serve (via the free endpoint) models that have been fine-tuned specifically for MemGPT
    • “MemGPT-LM-8x7B-v0.1” (e.g. Mixtral 8x7B fine-tuned on MemGPT data w/ DPO)
    • Goal is to bridge the gap between open models and GPT-4 for MemGPT performance

👁️ Multi-modal support

  • Start with gpt-4-vision support first to work out the necessary refactors required
    • Will require modifications to the current data_types stored in the database
  • Work backwards to LLaVA

👾 Make MemGPT a better coding assistant

  • Coding requires some coding-specific optimizations
    • Better support for generating coding blocks with parsing errors
    • Add specific grammars / model wrappers for coding
  • Add support for block-level code execution
    • CodeInterpreter style

📄 Make MemGPT a better document research assistant

  • Add more complex out-of-the-box archival_memory_search replacements
    • e.g. using LlamaIndex RAG pipelines

🔧 Better default functions

  • E.g. better out-of-the-box internet search

⏱️ Asynchronous tool use support

  • Support non-blocking tool use (#1062)
    • E.g. image generation that takes ~10s+ should not block the main conversation thread
    • Implement by returning the "TBD" tool response immediately, then inject full response later

🧠 Better memory systems

  • core_memory_v2, archival_memory_v2, etc.
    • e.g. core_memory_v2
      • add more structure (insertions are key, value style only)
    • e.g. archival_memory_v2
      • add more metadata tagging at insertion time
        • type: [memory, feeling, observation, reflection, …]
      • add an asynchronous “memory consolidation” loop
        • every N minutes (or once per day), a task runner starts that tries to consolidate all the archival memories
      • add more structure in the storage
        • not just a vector database
        • knowledge graph?
        • hierarchical storage?

Hi, just wanted to say this is an awesome project, and that the ability to have a fully featured ui to get started is really important to me. Nice to see it on the roadmap. That along with non-trivial examples makes it an easy choice to go with memGPT.