synthetic-data
There are 568 repositories under the synthetic-data topic.
stefan-jansen/machine-learning-for-trading
Code for Machine Learning for Algorithmic Trading, 2nd edition.
lk-geimfari/mimesis
Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.
nucleuscloud/neosync
Open source data anonymization and synthetic data orchestration for developers. Create high fidelity synthetic data and sync it across your environments.
DLR-RM/BlenderProc
A procedural Blender pipeline for photorealistic training image generation
sdv-dev/SDV
Synthetic data generation for tabular data
synthetichealth/synthea
Synthetic Patient Population Simulator
unrealcv/unrealcv
UnrealCV: Connecting Computer Vision to Unreal Engine
argilla-io/distilabel
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
ydataai/ydata-synthetic
Synthetic data generators for tabular and time-series data
shuttle-hq/synth
The Declarative Data Generator
sdv-dev/CTGAN
Conditional GAN for generating synthetic tabular data.
jofpin/synthBTC
A tool that uses advanced Monte Carlo simulations and Turbit parallel processing to create possible Bitcoin prediction scenarios.
datadreamer-dev/DataDreamer
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤
Renumics/awesome-open-data-centric-ai
Curated list of open source tooling for data-centric AI on unstructured data.
GreenmaskIO/greenmask
PostgreSQL database anonymization tool
nicolas-hbt/pygraft
Configurable Generation of Synthetic Schemas and Knowledge Graphs at Your Fingertips
BatsResearch/bonito
A lightweight library for generating synthetic instruction tuning datasets for your data without GPT.
SciPhi-AI/synthesizer
A multi-purpose LLM framework for RAG and data creation.
paulbricman/thisrepositorydoesnotexist
A curated list of awesome projects which use Machine Learning to generate synthetic content.
gretelai/gretel-synthetics
Synthetic data generators for structured and unstructured text, featuring differentially private learning.
sdv-dev/Copulas
A library to model multivariate data using copulas.
plaitpy/plaitpy
plait.py - a fake data modeler
vanderschaarlab/synthcity
A library for generating and evaluating synthetic tabular data for privacy, fairness and data augmentation.
magpie-align/magpie
Official repository for "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing". Your efficient and high-quality synthetic data generation pipeline!
GeorgeCazenavette/mtt-distillation
Official code for our CVPR '22 paper "Dataset Distillation by Matching Training Trajectories"
wenbowen123/iros20-6d-pose-tracking
[IROS 2020] se(3)-TrackNet: Data-driven 6D Pose Tracking by Calibrating Image Residuals in Synthetic Domains
yandex-research/tab-ddpm
[ICML 2023] The official implementation of the paper "TabDDPM: Modelling Tabular Data with Diffusion Models"
Unity-Technologies/SynthDet
SynthDet - An end-to-end object detection pipeline using synthetic data
sparkfish/augraphy
Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes
Nicholasli1995/EvoSkeleton
Official project website for the CVPR 2020 paper (Oral Presentation) "Cascaded deep monocular 3D human pose estimation with evolutionary training data"
BMW-InnovationLab/BMW-Labeltool-Lite
This repository provides you with an easy-to-use labeling tool for State-of-the-art Deep Learning training purposes. It supports Auto-Labeling.
Data-Centric-AI-Community/awesome-data-centric-ai
Open-Source Software, Tutorials, and Research on Data-Centric AI 🤖
databrickslabs/dbldatagen
Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for tests, POCs, and other uses in Databricks environments, including Delta Live Tables pipelines.
nickkunz/smogn
Synthetic Minority Over-Sampling Technique for Regression
microsoft/genalog
Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and text alignment capabilities.
LinkedAi/flip
Synthetic Image generation with Flip. Generate thousands of new 2D images from a small batch of objects and backgrounds.