LoLLMsVectorDB: A modular text database manager for retrieval-augmented generation (RAG) that integrates with the LoLLMs ecosystem. It supports several vectorization methods and can bind to a directory to keep the vector store in sync with the text files it contains.
- Flexible Vectorization: Supports multiple vectorization methods including TF-IDF and Word2Vec.
- Directory Binding: Automatically updates the vector store with text data from a specified directory.
- Efficient Search: Provides fast and accurate search results with metadata to locate the original text chunks.
- Modular Design: Easily extendable to support new vectorization methods and functionalities.
pip install lollmsvectordb
from lollmsvectordb import TFIDFVectorizer, VectorDatabase, DirectoryBinding
# Initialize the vectorizer
tfidf_vectorizer = TFIDFVectorizer()
tfidf_vectorizer.fit(["This is a sample text.", "Another sample text."])
# Create the vector database
db = VectorDatabase("vector_db.sqlite", tfidf_vectorizer)
# Bind a directory to the vector database
directory_binding = DirectoryBinding("path_to_your_directory", db)
# Update the vector store with text data from the directory
directory_binding.update_vector_store()
# Search for a query in the vector database
results = directory_binding.search("This is a sample text.")
print(results)
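To make the search step more concrete, here is a minimal stand-alone sketch of how TF-IDF retrieval can rank text chunks by cosine similarity. It does not use lollmsvectordb and is not a description of its internals. The helper names (`tokenize`, `build_idf`, `vectorize`, `cosine`, `search`) and the smoothed IDF formula are assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase, split on whitespace, and strip basic punctuation.
    tokens = (w.strip(".,!?;:").lower() for w in text.split())
    return [t for t in tokens if t]


def build_idf(chunks):
    # Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1.
    n = len(chunks)
    df = Counter()
    for chunk in chunks:
        df.update(set(tokenize(chunk)))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}


def vectorize(text, idf):
    # Sparse TF-IDF vector as a dict; terms unseen in the corpus are ignored.
    tf = Counter(t for t in tokenize(text) if t in idf)
    return {term: count * idf[term] for term, count in tf.items()}


def cosine(a, b):
    # Cosine similarity between two sparse vectors; 0.0 if either is empty.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, chunks, top_k=2):
    # Return (chunk_index, chunk_text, score) tuples, best match first.
    idf = build_idf(chunks)
    vectors = [vectorize(c, idf) for c in chunks]
    q = vectorize(query, idf)
    scored = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [(i, chunks[i], score) for score, i in scored[:top_k]]


if __name__ == "__main__":
    corpus = ["This is a sample text.", "Another sample text.", "Cats chase mice."]
    for index, text, score in search("This is a sample text.", corpus):
        print(index, text, round(score, 3))
```

Returning the chunk index alongside the score mirrors the idea of search results carrying metadata that points back to the original text chunk.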
To add a new vectorization method, create a subclass of the Vectorizer class and implement the vectorize method.
from lollmsvectordb import Vectorizer
class CustomVectorizer(Vectorizer):
    def vectorize(self, data):
        # Implement your custom vectorization logic here
        pass
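As an illustration of what the body of such a method might look like, the sketch below implements a hashed bag-of-words vectorizer. It is written as a plain class so that it runs without the package installed. To plug it in, it would inherit from `Vectorizer`, on the assumption (not confirmed here) that `vectorize` receives a list of strings and returns one vector per string. The class name `HashedBagOfWordsVectorizer` and the `dim` parameter are hypothetical.

```python
import hashlib


class HashedBagOfWordsVectorizer:  # hypothetical; would subclass Vectorizer
    def __init__(self, dim=64):
        # Fixed output dimensionality; every token is hashed into one bucket.
        self.dim = dim

    def _bucket(self, token):
        # hashlib gives a hash that is stable across runs, unlike built-in hash().
        digest = hashlib.md5(token.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % self.dim

    def vectorize(self, data):
        # Assumed contract: data is a list of strings, result is a list of vectors.
        vectors = []
        for text in data:
            vec = [0.0] * self.dim
            for token in text.lower().split():
                vec[self._bucket(token)] += 1.0
            # L2-normalize so that dot products behave like cosine similarity.
            norm = sum(x * x for x in vec) ** 0.5
            vectors.append([x / norm for x in vec] if norm else vec)
        return vectors


if __name__ == "__main__":
    vectorizer = HashedBagOfWordsVectorizer(dim=8)
    print(vectorizer.vectorize(["This is a sample text.", "Another sample text."]))
```

Hashing avoids a separate fitting step, at the cost of occasional collisions between unrelated tokens. A larger `dim` reduces collisions.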
Contributions are welcome! Please fork the repository and submit a pull request.
This project is licensed under the MIT License.
For any questions or suggestions, feel free to reach out to the author:
- Twitter: @ParisNeo_AI
- Discord: Join our Discord
- Sub-Reddit: r/lollms
- Instagram: spacenerduino
Special thanks to the LoLLMs community for their continuous support and contributions.