data-versioning
There are 42 repositories under the data-versioning topic.
dolthub/dolt
Dolt – Git for Data
wandb/wandb
🔥 A tool for visualizing and tracking your machine learning experiments. This repo contains the CLI and Python API.
treeverse/lakeFS
lakeFS - Data version control for your data lake | Git for data
quiltdata/quilt
Quilt is a data mesh for connecting people with actionable data
iusztinpaul/energy-forecasting
🌀 The Full Stack 7-Steps MLOps Framework | Learn MLE & MLOps for free by designing, building and deploying an end-to-end ML batch system ~ source code + 2.5 hours of reading & video materials
Renumics/awesome-open-data-centric-ai
Curated list of open source tooling for data-centric AI on unstructured data.
koordinates/kart
Distributed version-control for geospatial and tabular data
RecallGraph/RecallGraph
A versioning data store for time-variant graph data.
BemiHQ/bemi
Automatic data change tracking for PostgreSQL
leeper/data-versioning
Collecting thoughts about data versioning
daefresh/awesome-data-temporality
A curated list to help you manage temporal data across many modalities 🚀.
layerai-archive/sdk
Metadata store for Production ML
ropensci/gittargets
Data version control for reproducible analysis pipelines in R with {targets}.
BemiHQ/bemi-prisma
Automatic data change tracking for Prisma
GitDataAI/jiaozifs
A Git-like version control file system for data lineage & data collaboration.
wrgl/wrgl
Git-like data versioning.
jomariya23156/full-stack-on-prem-cv-mlops
"1 config, 1 command from Jupyter Notebook to serve Millions of users", Full-stack On-Premises MLOps system for Computer Vision from Data versioning to Model monitoring and drift detection.
aws/amazon-finspace-examples
This repo contains sample code and sample notebooks to illustrate how to work with Amazon FinSpace
BemiHQ/bemi-typeorm
Automatic data change tracking for TypeORM
martysai/artificial-text-detection
Python framework for artificial text detection: NLP approaches to compare natural text against text generated by neural networks.
pier4all/mongoose-versioned
Document versioning library for MongoDB using the mongoose package.
d-lowl/bunny-party
A demonstration of how DVC and MLflow can be used in the task of data relabeling
datopian/ckanext-versions
A CKAN extension for data versioning.
datopian/ckanext-versioning
Deprecated. See https://github.com/datopian/ckanext-versions. ⏰ CKAN extension providing data versioning (metadata and files) based on git and github.
zensors/droplet
A JSON-based format for working with machine learning data, with a focus on data interoperability.
data-as-code/dac
Python Data as Code core implementation
dolthub/kedro-dolt
Kedro-Dolt Hook Plugin
KalyanM45/Data-Version-Control-Demo
The provided demo project demonstrates the practical implementation and advantages of using DVC. It showcases how DVC simplifies data versioning and model versioning while working in tandem with Git to create a cohesive version control system tailored for data science projects.
NewronAI/newron-sdk
Newron is a data-centric ML platform to easily build, manage, deploy and continuously improve models through data driven development.
VineetKT/ML_fastapi_on_Heroku_CI-CD
Deploying a Machine Learning Model on Heroku with FastAPI using CI/CD tools such as GitHub Actions and Heroku Automatic Deployment.
OElesin/modeldb-aws
Verta ai ModelDB on AWS Cloud with integration into Amazon SageMaker for ML training data versioning and experiment tracking
pier4all/data-versioning
Repository for evaluating the different approaches to data versioning
tahonick/MLOps-Data-versioning-with-ClearML
Learning data and model versioning with ClearML while cleaning and modeling happiness by country with a Kaggle dataset
albagc/auto-data-version
Obtain data versioning tag using ML models
cs-uche/Car-Prices-Prediction
Advanced Machine Learning Regression: Predicting Car Prices
ksm26/LLMOps
This course navigates through the LLMOps pipeline, enabling you to preprocess training data for supervised fine-tuning and deploy custom Large Language Models (LLMs).