Implement Experimental `RepositoryMemoryEmbedding`
Issue:
To help the agent learn from past interactions, we are considering a "RepositoryMemoryEmbedding" feature. The idea is to vectorize each GitHub commit or event so that the system can treat the GitHub repository as a "memory". Past changes, comments, issues, and pull requests could then be indexed and searched, giving the system more context to work with.
Implemented well, this repository memory would let the agent draw on earlier interactions with the code and offer better assistance and recommendations.
Implementation:
A new class RepositoryMemoryEmbeddingHandler could be created that inherits from MemoryEmbeddingHandler. The handler needs an instance of GithubAPIHandler and an EmbeddingProvider so that it can vectorize GitHub events and store the resulting vectors in a suitable database.
The RepositoryMemoryEmbeddingHandler class should at minimum contain the following methods:
get_embedding(self, repo_id: str) -> Any: This method retrieves a GitHub event using the repo_id, converts it to a string, generates an embedding with the EmbeddingProvider, and returns that embedding.
update_embedding(self, repo_id: str, new_event: GithubEvent) -> None: This method adds a new GitHub event and updates the corresponding event embedding.
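The two methods above could be sketched roughly as follows. This is a minimal illustration, not the project's actual API: the `embed` and `fetch_events` method names, the fields on `GithubEvent`, the empty `MemoryEmbeddingHandler` placeholder, and the in-memory `store` dictionary are all assumptions made so that the example is self-contained.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Protocol


@dataclass
class GithubEvent:
    """Hypothetical stand-in for the project's GithubEvent type."""
    event_type: str
    body: str

    def to_text(self) -> str:
        # Flatten the event into a string that can be embedded.
        return f"{self.event_type}: {self.body}"


class EmbeddingProvider(Protocol):
    """Assumed interface: turns text into a vector."""
    def embed(self, text: str) -> List[float]: ...


class GithubAPIHandler(Protocol):
    """Assumed interface: returns the events recorded for a repository."""
    def fetch_events(self, repo_id: str) -> List[GithubEvent]: ...


class MemoryEmbeddingHandler:
    """Placeholder for the existing base class named in this issue."""


class RepositoryMemoryEmbeddingHandler(MemoryEmbeddingHandler):
    def __init__(self, github: GithubAPIHandler, provider: EmbeddingProvider) -> None:
        self.github = github
        self.provider = provider
        # In-memory store for illustration only; a real database would replace it.
        self.store: Dict[str, List[List[float]]] = {}

    def get_embedding(self, repo_id: str) -> Any:
        # Fetch the repository's events, serialize them, and embed the text.
        events = self.github.fetch_events(repo_id)
        text = "\n".join(event.to_text() for event in events)
        return self.provider.embed(text)

    def update_embedding(self, repo_id: str, new_event: GithubEvent) -> None:
        # Embed the new event and record its vector under the repository.
        vector = self.provider.embed(new_event.to_text())
        self.store.setdefault(repo_id, []).append(vector)
```

Typing the two collaborators as protocols keeps the handler easy to exercise with fakes in unit tests, which may help with the testing tasks listed further down.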
Error Handling: Please include comprehensive error handling, especially when interacting with the GitHub API and while creating embeddings.
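One possible shape for this error handling is a small retry wrapper around GitHub API and embedding calls. The sketch below is only a suggestion: `call_with_retry` and `RepositoryMemoryError` are hypothetical names, and the choice of `ConnectionError` and `TimeoutError` as retryable failures is an assumption that would need to match whatever exceptions the real clients raise.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


class RepositoryMemoryError(Exception):
    """Hypothetical error raised when an event cannot be fetched or embedded."""


def call_with_retry(
    operation: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    attempts: int = 3,
    base_delay: float = 0.5,
) -> T:
    """Run `operation`, retrying transient failures with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on as exc:
            if attempt == attempts:
                # Out of retries: surface one error type and keep the cause.
                raise RepositoryMemoryError(
                    f"gave up after {attempts} attempts"
                ) from exc
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

A caller could then wrap a fetch as `call_with_retry(lambda: github.fetch_events(repo_id))`, where `fetch_events` is again an assumed method name. Exceptions outside `retry_on` propagate unchanged, so programming errors are not hidden behind retries.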
Performance: Consider how the system performs as the number of GitHub events grows. We may need to periodically prune old GitHub events that are no longer needed, or move to a more scalable storage solution.
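As a starting point for discussion, pruning could be as simple as dropping embeddings past an age limit and capping the total count. The `StoredEmbedding` record and the `prune_embeddings` function below are hypothetical, and both limits are placeholder policy choices rather than anything decided in this issue.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class StoredEmbedding:
    """Hypothetical record: one embedded GitHub event."""
    event_id: str
    created_at: datetime
    vector: List[float]


def prune_embeddings(
    records: List[StoredEmbedding],
    now: datetime,
    max_age: timedelta,
    max_records: int,
) -> List[StoredEmbedding]:
    """Keep at most `max_records` of the newest records not older than `max_age`."""
    cutoff = now - max_age
    fresh = [r for r in records if r.created_at >= cutoff]
    # Newest first, so the slice below drops the oldest survivors.
    fresh.sort(key=lambda r: r.created_at, reverse=True)
    return fresh[:max_records]
```

Passing `now` in explicitly keeps the function deterministic, which makes the pruning policy straightforward to unit test.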
Testing: Comprehensive testing is needed to ensure the system's robustness, particularly since this is an experimental feature. Both the functionality of RepositoryMemoryEmbeddingHandler and any changes made to GithubAPIHandler need to be thoroughly tested.
Tasks:
- Create a new class RepositoryMemoryEmbeddingHandler.
- Implement get_embedding and update_embedding methods in RepositoryMemoryEmbeddingHandler.
- Update GithubAPIHandler to support event embeddings.
- Implement unit tests for RepositoryMemoryEmbeddingHandler.
- Implement integration tests for RepositoryMemoryEmbeddingHandler and GithubAPIHandler.
For inspiration, look at how Auto-GPT uses memories. Please feel free to ask questions or seek clarification. Your input on this project is greatly appreciated!