SuperDuperDB/superduperdb

[REL-CLT] High level description of changes and key-features


  • Added graph mode
  • Fixed bugs in vector search
  • Fixed bugs in training of LLMs
  • Optimised LLM training
  • Added ray support for model serving
  • Improved vector search with optimisations
  • Added REST server
  • Fixed miscellaneous bugs
  • Improved testing suite
  • Allowed developers to write Listeners and Graphs in a single formalism
  • Simplified developer contracts around models
  • Enabled lazy loading of artifacts for a low memory footprint

We need a complete description of all those items @kartik4949

Description:

  • Graph Mode Support:

    • Build complex model graphs by connecting superduperdb Model components. These graphs can be saved in the database and used for inference in the same way as Models. This feature lets users represent and manage the relationships between components explicitly, which makes their data models easier to understand and organize.
  • REST Server Support:

    • Access superduperdb core via REST. Perform database queries, create components, upload artifacts, etc., using the REST interface as an alternative to the Python client. With this feature, users can interact with superduperdb from any programming language or platform that supports RESTful communication, expanding accessibility and integration capabilities.
  • Local Cluster Creation with tmux Utility:

    • Easily set up a debuggable cluster environment using the tmux cluster utility. Users automatically receive a tmux session in which each window runs a superduperdb service, which makes debugging straightforward. This feature streamlines creating and debugging cluster environments for testing and troubleshooting applications.
  • Custom Usecase Creation:

    • With our updated documentation, users can create their own unique use cases. These can be downloaded as notebooks, enabling a wide range of possibilities with different databases and modalities from a single documentation source. This feature empowers users to tailor superduperdb to their specific needs and workflows, fostering innovation and flexibility in application development.
  • Bulk Query Execution:

    • Create and execute bulk queries in one go. This feature allows users to combine various types of queries (insert, update, delete, etc.) into a single bulk query for efficient backend execution. By minimizing the number of round trips to the database, users can significantly improve performance and optimize resource utilization in data-intensive applications.
  • Lazy File Datatype:

    • Use the lazy file datatype to create file references that can be retrieved later. This feature simplifies file management by allowing users to load files only when needed. Deferring file loading until necessary reduces memory usage and improves responsiveness, especially when dealing with large files or datasets.
  • Auto Decorator for Models:

    • Apply the objectmodel and torchmodel decorators directly to callables to create basic superduperdb model objects. This streamlined interface speeds up model creation, so users can focus on building and refining their models.
  • Pandas Directory Support:

    • Create a pandas datalayer instance from a directory. The directory holds data inputs and, later, model output tables. Preexisting CSV files in the directory are treated as data tables and can be referenced after datalayer creation. This feature simplifies data management by integrating pandas directly into superduperdb, so users can work with structured data using pandas within their workflows.
  • Simplified Model Prediction Interface:

    • Access a simplified model interface with predict_one and predict APIs for single and multi-datapoint prediction tasks. This feature provides a user-friendly and intuitive way to perform model predictions, simplifying the integration of machine learning models into applications and workflows.
  • File Datatype Support:

    • Add a file datatype to support saving and reading files and folders in artifact_store. This feature makes managing files and folders within the system more flexible. With the file datatype, superduperdb can handle unstructured data such as documents, images, and multimedia files alongside structured data.
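
To make the Lazy File Datatype item above more concrete, here is a minimal sketch of the deferred-loading idea in plain Python. It is only an illustration: `LazyFileRef` and its `read` method are hypothetical names and are not the superduperdb datatype or its API. The point is that creating the reference is cheap, and the bytes are read only on first access.

```python
import tempfile
from pathlib import Path


class LazyFileRef:
    """Hypothetical stand-in for a lazy file reference (not superduperdb API)."""

    def __init__(self, path):
        self.path = Path(path)
        self._data = None      # nothing is read at construction time
        self.loaded = False

    def read(self):
        # Load the bytes on first access only, then reuse the cached copy.
        if not self.loaded:
            self._data = self.path.read_bytes()
            self.loaded = True
        return self._data


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "example.bin"
    target.write_bytes(b"payload")

    ref = LazyFileRef(target)      # cheap: only the path is stored
    loaded_before = ref.loaded     # no file content in memory yet
    content = ref.read()           # the file is read here
    loaded_after = ref.loaded
```

Holding many such references costs little memory until `read` is called on the ones that are actually needed, which is the behaviour the item describes.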
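
The Auto Decorator and Simplified Model Prediction Interface items above could be pictured roughly as follows. This is a sketch under stated assumptions, not the library's implementation: `sketch_model` and `SketchModel` are hypothetical names standing in for the decorator and the model object, and only the method names `predict_one` and `predict` are taken from the description.

```python
from typing import Any, Callable, List


class SketchModel:
    """Hypothetical model wrapper (not a superduperdb class)."""

    def __init__(self, fn: Callable[[Any], Any], identifier: str):
        self.fn = fn
        self.identifier = identifier

    def predict_one(self, x: Any) -> Any:
        # Single-datapoint prediction: call the wrapped function once.
        return self.fn(x)

    def predict(self, xs: List[Any]) -> List[Any]:
        # Multi-datapoint prediction: apply predict_one to each item.
        return [self.predict_one(x) for x in xs]


def sketch_model(fn: Callable[[Any], Any]) -> SketchModel:
    # Hypothetical decorator: turns a plain callable into a model object,
    # using the function name as the identifier.
    return SketchModel(fn, identifier=fn.__name__)


@sketch_model
def double(x: int) -> int:
    return 2 * x


single = double.predict_one(3)
batch = double.predict([1, 2, 3])
```

Decorating the callable replaces it with a model object, so the same name exposes both the single-item and the batch entry points.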

@jieguangzhou to provide comments. @kartik4949 to provide latest version.

thgnw commented

@blythed don't we want to use this to write a blog post? This is too low-level for the website and press release. It may be good for the new README?