data-version-control
There are 38 repositories under the data-version-control topic.
dolthub/dolt
Dolt – Git for Data
iterative/dvc
🦉 ML Experiments and Data Management with Git
activeloopai/deeplake
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
treeverse/lakeFS
lakeFS - Data version control for your data lake | Git for data
quiltdata/quilt
Quilt is a data mesh for connecting people with actionable data
splitgraph/sgr
sgr (command line client for Splitgraph) and the splitgraph Python library
data-drift/data-drift
Metrics Observability & Troubleshooting
daefresh/awesome-data-temporality
A curated list to help you manage temporal data across many modalities 🚀.
ropensci/gittargets
Data version control for reproducible analysis pipelines in R with {targets}.
NtreevSoft/Crema
Meta data server & client tools for game development
GitDataAI/jiaozifs
A Git-like version control file system for data lineage & data collaboration.
zincware/ZnTrack
Create, visualize, run & benchmark DVC pipelines in Python & Jupyter notebooks.
wrgl/wrgl
Git-like data versioning.
aws-samples/amazon-sagemaker-experiments-dvc-demo
SageMaker Experiments and DVC
martysai/artificial-text-detection
Python framework for artificial text detection: NLP approaches to compare natural text against text generated by neural networks.
datopian/ckanext-versions
A CKAN extension for data versioning.
Ezzaldin97/Batch-Serving-ML-Pipeline
Create a robust, simple, efficient, and modern end-to-end ML batch serving pipeline using a set of modern open-source/free platforms and tools.
data-mill-cloud/mastro
Metadata management in Go
datopian/ckanext-versioning
Deprecated. See https://github.com/datopian/ckanext-versions. ⏰ CKAN extension providing data versioning (metadata and files) based on git and github.
zensors/droplet
A JSON-based format for working with machine learning data, with a focus on data interoperability.
ClimateImpactLab/DataFS
An abstraction layer for data storage systems
data-as-code/dac
Python Data as Code core implementation
ejhusom/d2m
A machine learning pipeline taking you from raw data to fully trained machine learning model - from data to model (d2m).
mlrepa/dvc-2-data-versioning
Lesson 2 tutorial: Versioning Data and Model for the ML REPA School course: Machine Learning experiments reproducibility and engineering with DVC
RuiFilipeCampos/git-datasets
Declaratively create, transform, manage and version ML datasets.
KalyanM45/Data-Version-Control-Demo
The provided demo project demonstrates the practical implementation and advantages of using DVC. It showcases how DVC simplifies data versioning and model versioning while working in tandem with Git to create a cohesive version control system tailored for data science projects.
Michael95-m/simple_demo_dvc
Demonstration of how to use DVC (Data Version Control)
VineetKT/ML_fastapi_on_Heroku_CI-CD
Deploying a Machine Learning Model on Heroku with FastAPI using CI/CD tools such as GitHub Actions and Heroku Automatic Deployment.
DiegoBiagini/NatuReddit
Personal project aimed at developing an ML service that resembles a production system
joseph-nagel/dvc-playground
Playground for learning DVC
MArpogaus/dvc-stage
Stop programming common dvc stages. Configure them.
abdmuffid/DVC-Basics
In this repository, an ML-Ops task is undertaken to practice configuring and storing data using DVC on GitHub. The goal is to explore how DVC seamlessly integrates for efficient data management, enhancing reproducibility and scalability in machine learning workflows.
blaz-cerpnjak/dvc-git-example
DVC - Data Version Control Basics
ericdasse28/dvc-test
Just to try out DVC
MartinKalema/dvc-test
Data Versioning with DVC
thedatanerdz/MLP-80
Using DVC for Data Versioning