synthetic-data
There are 514 repositories under the synthetic-data topic.
stefan-jansen/machine-learning-for-trading
Code for Machine Learning for Algorithmic Trading, 2nd edition.
lk-geimfari/mimesis
Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.
DLR-RM/BlenderProc
A procedural Blender pipeline for photorealistic training image generation
nucleuscloud/neosync
Open source data anonymization and synthetic data orchestration for developers. Create high-fidelity synthetic data and sync it across your environments.
sdv-dev/SDV
Synthetic data generation for tabular data
synthetichealth/synthea
Synthetic Patient Population Simulator
unrealcv/unrealcv
UnrealCV: Connecting Computer Vision to Unreal Engine
shuttle-hq/synth
The Declarative Data Generator
ydataai/ydata-synthetic
Synthetic data generators for tabular and time-series data
sdv-dev/CTGAN
Conditional GAN for generating synthetic tabular data.
argilla-io/distilabel
⚗️ distilabel is a framework for synthetic data and AI feedback for AI engineers that require high-quality outputs, full data ownership, and overall efficiency.
datadreamer-dev/DataDreamer
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤
Renumics/awesome-open-data-centric-ai
Curated list of open source tooling for data-centric AI on unstructured data.
nicolas-hbt/pygraft
Configurable Generation of Synthetic Schemas and Knowledge Graphs at Your Fingertips
SciPhi-AI/synthesizer
A multi-purpose LLM framework for RAG and data creation.
paulbricman/thisrepositorydoesnotexist
A curated list of awesome projects which use Machine Learning to generate synthetic content.
gretelai/gretel-synthetics
Synthetic data generators for structured and unstructured text, featuring differentially private learning.
BatsResearch/bonito
A lightweight library for generating synthetic instruction tuning datasets for your data without GPT.
sdv-dev/Copulas
A library to model multivariate data using copulas.
GreenmaskIO/greenmask
PostgreSQL database anonymization tool
plaitpy/plaitpy
plait.py - a fake data modeler
vanderschaarlab/synthcity
A library for generating and evaluating synthetic tabular data for privacy, fairness and data augmentation.
GeorgeCazenavette/mtt-distillation
Official code for our CVPR '22 paper "Dataset Distillation by Matching Training Trajectories"
wenbowen123/iros20-6d-pose-tracking
[IROS 2020] se(3)-TrackNet: Data-driven 6D Pose Tracking by Calibrating Image Residuals in Synthetic Domains
Unity-Technologies/SynthDet
SynthDet - An end-to-end object detection pipeline using synthetic data
yandex-research/tab-ddpm
[ICML 2023] The official implementation of the paper "TabDDPM: Modelling Tabular Data with Diffusion Models"
Nicholasli1995/EvoSkeleton
Official project website for the CVPR 2020 paper (Oral Presentation) "Cascaded deep monocular 3D human pose estimation with evolutionary training data"
BMW-InnovationLab/BMW-Labeltool-Lite
This repository provides an easy-to-use labeling tool for preparing training data for state-of-the-art deep learning models. It supports auto-labeling.
sparkfish/augraphy
Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes
Data-Centric-AI-Community/awesome-data-centric-ai
Open-Source Software, Tutorials, and Research on Data-Centric AI 🤖
nickkunz/smogn
Synthetic Minority Over-Sampling Technique for Regression
LinkedAi/flip
Synthetic Image generation with Flip. Generate thousands of new 2D images from a small batch of objects and backgrounds.
tirthajyoti/pydbgen
Random dataframe and database table generator
microsoft/genalog
Genalog is an open source, cross-platform Python package for generating synthetic document images with custom degradations and text alignment capabilities.
Unity-Technologies/PeopleSansPeople
Unity's privacy-preserving human-centric synthetic data generator
ZumoLabs/zpy
Synthetic data for computer vision. An open source toolkit using Blender and Python.