ydata-synthetic

Open repository with GAN architectures for tabular data, implemented using TensorFlow 2.0.

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).


Join us on Slack

What is Synthetic Data?

Synthetic data is artificially generated data that is not collected from real-world events. It replicates the statistical properties of real data without containing any identifiable information, which helps protect individuals' privacy.
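To make the idea concrete, here is a minimal sketch that is not a GAN and does not use this package: it fits a simple multivariate Gaussian to a toy "real" table and samples new rows from it. The data, column meanings, and parameter values are made up for illustration, and NumPy is assumed to be installed. The sampled rows are newly generated rather than copied from the real table, yet they roughly preserve the means and the correlation between columns. This alone is not a formal privacy guarantee.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Toy "real" table (hypothetical): 1,000 rows, two correlated numeric columns.
real = rng.multivariate_normal(
    mean=[50.0, 30.0],
    cov=[[100.0, 40.0], [40.0, 25.0]],
    size=1000,
)

# Estimate the statistics we want the synthetic data to preserve.
mean = real.mean(axis=0)
cov = np.cov(real, rowvar=False)

# Sample brand-new rows from the fitted distribution.
synthetic = rng.multivariate_normal(mean=mean, cov=cov, size=1000)

# Compare column means and the correlation between the two columns.
real_corr = np.corrcoef(real, rowvar=False)[0, 1]
synthetic_corr = np.corrcoef(synthetic, rowvar=False)[0, 1]
print("real mean:     ", real.mean(axis=0).round(2))
print("synthetic mean:", synthetic.mean(axis=0).round(2))
print("real corr:     ", round(float(real_corr), 3))
print("synthetic corr:", round(float(synthetic_corr), 3))
```

A Gaussian fit only captures means and linear correlations. GANs, such as the architectures in this repository, aim to learn richer, non-linear structure from the data.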

Why Synthetic Data?

Synthetic data can be used for many applications:

  • Privacy
  • Remove bias
  • Balance datasets
  • Augment datasets

ydata-synthetic

This repository contains material related to Generative Adversarial Networks (GANs) for synthetic data generation, in particular regular tabular data and time series. It consists of a set of different GAN architectures developed using TensorFlow 2.0. An example Jupyter Notebook is included to show how to use the different architectures.

Quickstart

pip install git+https://github.com/ydataai/ydata-synthetic.git

Examples

Here you can find usage examples of the package and models to synthesize tabular data.

  • Credit Fraud dataset (Open in Colab)
  • Stock dataset (Open in Colab)

Project Resources

In this repo you can find the following GAN architectures:

Tabular data

Sequential data