🚀 MemGPT Q2 2024 Developer Roadmap
cpacker opened this issue · 2 comments
cpacker commented
Q2 2024 Roadmap
🚀 Link to GitHub project board tracking all of the roadmap items
👋 Looking for smaller things to work on? Check the community contributions or bug tracker project boards
✍️ Leave comments or message us on Discord to suggest changes to the roadmap
More MemGPT LLM backends / APIs supported [early April]
- Groq
- Claude
- Cohere
- Gemini
- Together
- Mistral
- Add unit tests for all the officially supported API providers (with reference configs contained in `main`) to avoid regressions
  - OpenAI, Anthropic, Groq, Together, Mistral, Gemini
- Consolidate inference backend options into a single config style (see the sketch below):
  - i.e.: `openai-chat-completions` (= OpenAI, Azure, APIs with function calling), `openai-completions` (= vLLM, LM Studio, ollama), ...
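As a rough illustration of that consolidated config style, a sketch like the following could distinguish backends purely by endpoint type; the class and field names are hypothetical, not the actual MemGPT config schema:

```python
# Hypothetical sketch of a consolidated backend config; class/field names are
# illustrative and not the actual MemGPT config schema.
from dataclasses import dataclass
from typing import Optional


@dataclass
class LLMEndpointConfig:
    # "openai-chat-completions" = OpenAI, Azure, other APIs with native function calling
    # "openai-completions"      = vLLM, LM Studio, ollama (OpenAI-compatible /completions)
    endpoint_type: str
    base_url: str
    model: str
    api_key: Optional[str] = None


gpt4 = LLMEndpointConfig(
    endpoint_type="openai-chat-completions",
    base_url="https://api.openai.com/v1",
    model="gpt-4",
    api_key="sk-...",
)

local_llm = LLMEndpointConfig(
    endpoint_type="openai-completions",
    base_url="http://localhost:1234/v1",  # e.g. an LM Studio server
    model="dolphin-2.2.1-mistral-7b",
)
```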
MemGPT developer examples
- Native multi-agent interaction (via multiple MemGPT agents running on a MemGPT server process)
- Example "agent orchestrator" meta-agent where one agent controls
send_message
calls to other agents- in this example, all the agents share a groupchat state similar to AutoGen
- Example free-chat where each agent can freely broadcast
send_message
calls to other agents- need to handle "race conditions" where agent is busy (the calling agent should receive an informative message reply)
- in this example, the only shared state between agents is via communication on
send_message
calls
- Example "agent orchestrator" meta-agent where one agent controls
- Discord / Slack chatbot examples (connecting
send_message
to external APIs)- Example Discord + Slack + Twilio
send_message
tool + message listen hook (needs a dedicated section in dev portal)
- Example Discord + Slack + Twilio
- GitHub / Discord support chatbot examples (e.g. Dosubot)
- Example
read_issue
,list_issues
+ comment-posted-to-API-call hooks
- Example
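For the orchestrator example above, a minimal sketch of the pattern might look like the following. The client interface (`create_agent` / `send_message`) is a stand-in used only to illustrate the shape of the pattern, not the real MemGPT client API; the shared `groupchat` list plays the role of the AutoGen-style shared state.

```python
# Minimal sketch of an "agent orchestrator" meta-agent. The client interface
# (create_agent / send_message) is hypothetical and only illustrates the shape
# of the pattern, not the real MemGPT Python client API.
from typing import Dict, List


class FakeMemGPTClient:
    """Stand-in for a client talking to a MemGPT server process."""

    def create_agent(self, name: str) -> str:
        return name  # pretend the name is the agent id

    def send_message(self, agent_id: str, message: str) -> str:
        # A real client would return the agent's reply from the server.
        return f"[{agent_id}] ack: {message}"


class Orchestrator:
    """Meta-agent that routes messages to worker agents and keeps a shared
    groupchat transcript (similar to AutoGen's group chat state)."""

    def __init__(self, client: FakeMemGPTClient, worker_names: List[str]):
        self.client = client
        self.workers: Dict[str, str] = {n: client.create_agent(n) for n in worker_names}
        self.groupchat: List[str] = []  # shared state visible to every turn

    def broadcast(self, sender: str, message: str) -> None:
        self.groupchat.append(f"{sender}: {message}")
        for name, agent_id in self.workers.items():
            if name == sender:
                continue
            reply = self.client.send_message(agent_id, message)
            self.groupchat.append(reply)


if __name__ == "__main__":
    orch = Orchestrator(FakeMemGPTClient(), ["researcher", "writer"])
    orch.broadcast("user", "Summarize the Q2 roadmap.")
    print("\n".join(orch.groupchat))
```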
MemGPT server
- One-click deployment container [early April]
- Instructions / tutorial for deploying MemGPT server to Azure/GCP/AWS/... [mid April]
MemGPT API
- Token streaming support [mid-April]
- Production-ready stable API [mid-April]
OpenAI Assistants API
- Continued improvements to OpenAI Assistants API support: #892
MemGPT Client / SDK
- JavaScript/TypeScript client
Developer portal / Chat UI
- Alpha release [early April]
- Addition of missing features for beta release in MemGPT `v0.4` [early April]
  - Preset creation / editing
  - Custom function / tool creation (via the UI)
  - Message (user+system+assistant) editing + rerunning / regenerating messages
  - Custom prompt formatting
- Cron job scheduling inside of dev portal
  - Make it easy to schedule automated jobs that hit the MemGPT server (see the sketch below)
  - Cron-style custom functions: MemGPT can schedule one-off or recurring messages to itself, i.e. "Scheduled Inner Monologue"
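A minimal sketch of such a scheduled job, assuming a locally running MemGPT server; the REST route and payload shown here are assumptions for illustration and may not match the actual API:

```python
# Minimal sketch of a recurring job that sends a message to an agent on a
# running MemGPT server. The endpoint path and payload are assumptions for
# illustration; the real REST API may differ.
import time

import requests

MEMGPT_SERVER = "http://localhost:8283"   # assumed server address
AGENT_ID = "agent-1234"                   # hypothetical agent id
INTERVAL_SECONDS = 60 * 60                # run once per hour


def send_scheduled_message() -> None:
    resp = requests.post(
        f"{MEMGPT_SERVER}/api/agents/{AGENT_ID}/messages",  # illustrative route
        json={"message": "Scheduled check-in: review your task list.", "role": "user"},
        timeout=30,
    )
    resp.raise_for_status()
    print("server replied:", resp.json())


if __name__ == "__main__":
    while True:
        send_scheduled_message()
        time.sleep(INTERVAL_SECONDS)
```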
Hosted service
- Hosted service [end of April]
  - Release hosted MemGPT server so that developers can directly interact with the API
    - Allows developers to use the MemGPT API without requiring any self-hosting (just API keys)
  - Release hosted chat UI app (with guest + login modes) to allow easy use / experimentation with MemGPT via the chat UI only
    - Accounts are shared with the hosted API server (allows interacting with the same agents via hosted API + hosted chat UI)
⚡ Streaming (token-level) support
- Add streaming support for the CLI interface with OpenAI-compatible endpoints
- Allow streaming back of responses to POST requests (to the MemGPT API / server)
  - #1280
- In the MemGPT function calling setup, this likely means (as sketched below):
  - Stream back inner thoughts first
  - Then stream back the function call
  - Then attempt to parse the function call (validate whether the final full streamed response was OK)
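A rough sketch of that ordering, modeled loosely on OpenAI-style streaming deltas; the chunk format is an assumption for illustration, not MemGPT's actual wire format:

```python
# Rough sketch of streaming inner thoughts first, then the function call, then
# validating the fully-streamed call at the end. The chunk format is modeled
# loosely on OpenAI chat-completion deltas; details here are illustrative.
import json
from typing import Iterable, Optional, Tuple


def consume_stream(chunks: Iterable[dict]) -> Tuple[str, Optional[dict]]:
    inner_thoughts = []
    func_name = ""
    func_args = []

    for chunk in chunks:
        delta = chunk.get("delta", {})
        if "content" in delta:                       # 1) inner monologue tokens
            print(delta["content"], end="", flush=True)
            inner_thoughts.append(delta["content"])
        if "function_call" in delta:                 # 2) function call tokens
            func_name += delta["function_call"].get("name", "")
            func_args.append(delta["function_call"].get("arguments", ""))

    # 3) only after the stream ends can the full function call be validated
    parsed_call = None
    if func_name:
        try:
            parsed_call = {"name": func_name, "arguments": json.loads("".join(func_args))}
        except json.JSONDecodeError:
            print("\n[warning] streamed function call was not valid JSON")
    return "".join(inner_thoughts), parsed_call
```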
Miscellaneous features (Q2+)
👥 Split thread agent
- Support alternate "split-thread" MemGPT agent architecture (see the sketch after this list)
  - `SplitThreadAgent-v0.1` runs two "prompt" / "context" threads:
    - `DialogueThread` that generates conversations and calls utility functions (e.g. `run_google_search(...)` or `call_smart_home(...)`)
    - `MemoryThread` that is a passive reader of the ongoing conversation, and is responsible for memory edits, insertion, and search
      - `core_memory_replace`, `core_memory_append`
      - `archival_memory_search`, `archival_memory_insert`
      - `conversation_search`, `conversation_search_date`
      - Question: should these be usable by the `DialogueThread` too?
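A toy sketch of the split-thread idea, where both threads read the same transcript; the class and tool names mirror the roadmap items above, but the code is purely illustrative, not `SplitThreadAgent-v0.1` itself:

```python
# Toy sketch of the split-thread architecture: a DialogueThread that produces
# replies / utility calls, and a MemoryThread that passively reads the same
# transcript and performs memory edits. Entirely illustrative.
from typing import Callable, Dict, List


class DialogueThread:
    def __init__(self, tools: Dict[str, Callable[[str], str]]):
        self.tools = tools

    def step(self, transcript: List[str], user_msg: str) -> str:
        transcript.append(f"user: {user_msg}")
        if "weather" in user_msg:
            reply = self.tools["run_google_search"](user_msg)  # utility function call
        else:
            reply = f"echo: {user_msg}"
        transcript.append(f"assistant: {reply}")
        return reply


class MemoryThread:
    """Passive reader responsible for memory edits / insertion / search."""

    def __init__(self):
        self.archival: List[str] = []

    def step(self, transcript: List[str]) -> None:
        # e.g. archival_memory_insert on every new user turn
        for line in transcript:
            if line.startswith("user:") and line not in self.archival:
                self.archival.append(line)


if __name__ == "__main__":
    transcript: List[str] = []
    dialogue = DialogueThread(tools={"run_google_search": lambda q: f"search results for {q!r}"})
    memory = MemoryThread()

    dialogue.step(transcript, "what's the weather in SF?")
    memory.step(transcript)   # memory thread runs after (or alongside) each dialogue turn
    print(memory.archival)
```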
🦙 Specialized MemGPT models
- Release (on HuggingFace) and serve (via the free endpoint) models that have been fine-tuned specifically for MemGPT
- “MemGPT-LM-8x7B-v0.1” (e.g. Mixtral 8x7B fine-tuned on MemGPT data w/ DPO)
- Goal is to bridge the gap between open models and GPT-4 for MemGPT performance
👁️ Multi-modal support
- Start with `gpt-4-vision` support first to work out the necessary refactors
  - Will require modifications to the current `data_types` stored in the database
- Work backwards to LLaVA
👾 Make MemGPT a better coding assistant
- Coding requires some coding-specific optimizations
  - Better handling of generated code blocks that contain parsing errors
  - Add specific grammars / model wrappers for coding
  - Add support for block-level code execution
    - CodeInterpreter style
📄 Make MemGPT a better document research assistant
- Add more complex out-of-the-box `archival_memory_search` replacements (see the sketch below)
  - e.g. using LlamaIndex RAG pipelines
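A small sketch of what a LlamaIndex-backed replacement could look like, assuming llama-index >= 0.10 import paths; the wrapper name `archival_memory_search_v2` is hypothetical, not an existing MemGPT function:

```python
# Small sketch of an archival search replacement backed by a LlamaIndex RAG
# pipeline. Import paths assume llama-index >= 0.10; the wrapper function name
# is hypothetical and not an existing MemGPT function.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Build an index once over the documents that should be searchable.
documents = SimpleDirectoryReader("./docs").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()


def archival_memory_search_v2(query: str) -> str:
    """Drop-in style replacement: return a synthesized answer from the RAG pipeline."""
    response = query_engine.query(query)
    return str(response)


if __name__ == "__main__":
    print(archival_memory_search_v2("What does the Q2 roadmap say about streaming?"))
```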
🔧 Better default functions
- E.g. better out-of-the-box internet search
⏱️ Asynchronous tool use support
- Support non-blocking tool use (#1062)
  - E.g. image generation that takes ~10s+ should not block the main conversation thread
  - Implement by returning a "TBD" tool response immediately, then injecting the full response later (see the sketch below)
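One way to sketch the "return TBD now, inject later" idea with `asyncio`; all names here are illustrative, not MemGPT internals:

```python
# Sketch of non-blocking tool use: the tool call returns a placeholder ("TBD")
# immediately, and the real result is injected into the conversation once the
# slow task finishes. All names here are illustrative.
import asyncio


async def slow_image_generation(prompt: str) -> str:
    await asyncio.sleep(10)                     # stand-in for a ~10s+ API call
    return f"image_url_for({prompt})"


async def agent_loop() -> None:
    pending: set = set()

    # Tool call arrives: kick it off and answer immediately with a placeholder.
    task = asyncio.create_task(slow_image_generation("a cat astronaut"))
    pending.add(task)
    print("tool response: TBD (image is being generated)")

    # Conversation keeps going while the tool runs.
    for turn in range(3):
        print(f"user/assistant turn {turn} proceeds without blocking")
        await asyncio.sleep(1)

    # Once finished, inject the full result as a delayed tool/system message.
    done, pending = await asyncio.wait(pending)
    for t in done:
        print("injected tool result:", t.result())


if __name__ == "__main__":
    asyncio.run(agent_loop())
```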
🧠 Better memory systems
- `core_memory_v2`, `archival_memory_v2`, etc. (see the sketch after this list)
  - e.g. `core_memory_v2`
    - add more structure (insertions are key, value style only)
  - e.g. `archival_memory_v2`
    - add more metadata tagging at insertion time
      - `type: [memory, feeling, observation, reflection, …]`
    - add an asynchronous "memory consolidation" loop
      - every N minutes (or once per day), a task runner starts that tries to consolidate all the archival memories
    - add more structure in the storage
      - not just a vector database
      - knowledge graph?
      - hierarchical storage?
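A rough sketch of the key/value core memory and metadata-tagged archival ideas above; everything here is illustrative, not the actual v2 design:

```python
# Rough sketch of the v2 memory ideas: core memory insertions restricted to
# key/value pairs, and archival insertions that carry metadata tags (including
# a `type` field). Entirely illustrative; not the actual v2 design.
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List


class CoreMemoryV2:
    """Insertions are key/value style only."""

    def __init__(self):
        self._store: Dict[str, str] = {}

    def core_memory_replace(self, key: str, value: str) -> None:
        self._store[key] = value

    def core_memory_append(self, key: str, value: str) -> None:
        self._store[key] = (self._store.get(key, "") + " " + value).strip()


@dataclass
class ArchivalEntryV2:
    text: str
    type: str                      # one of: memory, feeling, observation, reflection, ...
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class ArchivalMemoryV2:
    def __init__(self):
        self.entries: List[ArchivalEntryV2] = []

    def archival_memory_insert(self, text: str, type: str = "memory") -> None:
        self.entries.append(ArchivalEntryV2(text=text, type=type))

    def consolidate(self) -> None:
        # Placeholder for the asynchronous consolidation loop a task runner
        # would invoke every N minutes / once per day.
        pass
```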
atljoseph commented
Hi, just wanted to say this is an awesome project, and that the ability to have a fully featured UI to get started is really important to me. Nice to see it on the roadmap. That, along with non-trivial examples, makes it an easy choice to go with MemGPT.