data-version-control
There are 50 repositories under the data-version-control topic.
dolthub/dolt
Dolt – Git for Data
iterative/dvc
🦉 Data Versioning and ML Experiments
treeverse/lakeFS
lakeFS - Data version control for your data lake | Git for data
quiltdata/quilt
Quilt is a data mesh for connecting people with actionable data
data-drift/data-drift
Metrics Observability & Troubleshooting
splitgraph/sgr
sgr (command line client for Splitgraph) and the splitgraph Python library
daefresh/awesome-data-temporality
A curated list to help you manage temporal data across many modalities 🚀.
GitDataAI/jzfs
Git-based version control file system for joint management of code, data, models, and their relationships.
ropensci/gittargets
Data version control for reproducible analysis pipelines in R with {targets}.
NtreevSoft/Crema
Meta data server & client tools for game development
git-lfs-fuse/git-lfs-fuse
Mount remote repositories, models and datasets managed by Git LFS instantly.
zincware/ZnTrack
Create, visualize, run & benchmark DVC pipelines in Python & Jupyter notebooks.
wrgl/wrgl
Git-like data versioning.
aws-samples/amazon-sagemaker-experiments-dvc-demo
SageMaker Experiments and DVC
martysai/artificial-text-detection
Python framework for artificial text detection: NLP approaches to compare natural text against text generated by neural networks.
Ezzaldin97/Batch-Serving-ML-Pipeline
Create a robust, simple, efficient, and modern end-to-end ML batch serving pipeline using a set of modern open-source/free platforms and tools.
data-as-code/dac
Python Data as Code core implementation
datopian/ckanext-versions
A CKAN extension for data versioning.
data-mill-cloud/mastro
Metadata management in Go
datopian/ckanext-versioning
Deprecated. See https://github.com/datopian/ckanext-versions. ⏰ CKAN extension providing data versioning (metadata and files) based on git and github.
ejhusom/d2m
A machine learning pipeline taking you from raw data to a fully trained machine learning model: from data to model (d2m).
zensors/droplet
A JSON-based format for working with machine learning data, with a focus on data interoperability.
Shuyib/data-version-ctrl
Data version control with Makefile and DVC for a regression task to estimate insurance costs for certain individuals.
ClimateImpactLab/DataFS
An abstraction layer for data storage systems
mlrepa/dvc-2-data-versioning
Lesson 2 tutorial: Versioning Data and Model for the ML REPA School course: Machine Learning experiments reproducibility and engineering with DVC
RuiFilipeCampos/git-datasets
Declaratively create, transform, manage and version ML datasets.
KalyanM45/Data-Version-Control-Demo
The provided demo project demonstrates the practical implementation and advantages of using DVC. It showcases how DVC simplifies data versioning and model versioning while working in tandem with Git to create a cohesive version control system tailored for data science projects.
Michael95-m/simple_demo_dvc
Demonstration of how to use DVC (Data Version Control)
RaedAddala/AI-Engineering-Project-Template
ML project template containing data, data collection, feature engineering, model training, model config files, tests, and serialized models.
VineetKT/ML_fastapi_on_Heroku_CI-CD
Deploying a machine learning model on Heroku with FastAPI, using CI/CD tools such as GitHub Actions and Heroku Automatic Deployment.
AbhaySingh71/MLops-emotion-detection
An end-to-end MLOps pipeline for emotion detection. Features data versioning with DVC + AWS S3, model training and evaluation with MLflow, CI/CD via GitHub Actions, FastAPI serving, Docker containerization, AWS EC2 deployment, and experiment tracking on DagsHub.
HarshStats/End-to-End-Deep-Learning-Project-Chicken-Disease-Classification
The Chicken Disease Classification Using MLOps DVC Pipeline project utilizes the VGG16 architecture to analyze images of chicken fecal matter, enabling early disease detection and reducing economic losses in poultry farming.
Md-Emon-Hasan/DVC-Turotial
📂 Comprehensive guide on using DVC for efficient and reproducible machine learning projects, covering essential commands and workflows.
aliyzd95/project-dnn-ser-pipeline
This repository contains a complete machine learning pipeline for Speech Emotion Recognition (SER) using Deep Neural Networks (DNNs).
Faisal-AlDhuwayhi/Deploying-ML-Model-to-Cloud
Deploying an ML model to a cloud platform with FastAPI, applying CI/CD practices
Roshankikani/food-delivery-eta-prediction-system
A full-stack machine learning architecture for food delivery ETA prediction, leveraging a DVC-driven pipeline, automated CI/CD workflows, cloud artifact management, and an LGBM-based stacked regression ensemble for high-fidelity time estimates.