This repository contains material spread across multiple folders. For the time being, it relates to my new book, Synthetic Data and Generative AI, published by Elsevier and available here.
It also includes:
- NoGAN code: a tabular data synthesizer that runs 1000x faster than neural-network-based GenAI methods and consistently delivers better results regardless of the evaluation metric (including new state-of-the-art quality metrics that capture far more than traditional distances), whether the features are categorical, numerical, or a mix of both. For details, see technical paper #29, available here.
- DeepResampling code: another fast NoGAN method, based on resampling and distribution-free Hierarchical Bayesian Models, with hyperparameter auto-tuning. For details, see technical paper #31, available here.
- NoGAN_Hellinger code (two scripts): a blend of NoGAN and DeepResampling, with the loss function replaced by the Hellinger model evaluation metric. For details, see section 2.4 in the project textbook, here.
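
For readers unfamiliar with the Hellinger metric mentioned above, here is a minimal sketch of the standard Hellinger distance between a real and a synthetic 1-D sample, estimated on shared histogram bins. It is an illustration only and not the code from the NoGAN_Hellinger scripts: the function name `hellinger_distance`, the `bins` parameter, and the equal-width binning are assumptions made for this example. The exact formulation used in the repository may differ; see section 2.4 in the project textbook.

```python
import numpy as np

def hellinger_distance(real, synth, bins=20):
    """Hellinger distance between two 1-D samples, estimated on shared
    equal-width histogram bins (hypothetical helper, for illustration)."""
    real = np.asarray(real, dtype=float)
    synth = np.asarray(synth, dtype=float)

    # Use the same bin edges for both samples so the histograms are comparable.
    lo = min(real.min(), synth.min())
    hi = max(real.max(), synth.max())
    if hi == lo:
        hi = lo + 1.0  # avoid a zero-width range when all values are identical
    edges = np.linspace(lo, hi, bins + 1)

    # Convert bin counts to empirical probabilities.
    p, _ = np.histogram(real, bins=edges)
    q, _ = np.histogram(synth, bins=edges)
    p = p / p.sum()
    q = q / q.sum()

    # H(P, Q) = sqrt( 0.5 * sum_i (sqrt(p_i) - sqrt(q_i))^2 ), a value in [0, 1].
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(0.0, 1.0, size=5000)   # stand-in for a real feature
    synth = rng.normal(0.2, 1.1, size=5000)  # stand-in for a synthesized feature
    print(hellinger_distance(real, synth))
```

The distance is 0 when the two binned distributions coincide and 1 when they share no bin, which makes it convenient both as an evaluation metric and as a quantity to minimize. For tabular data with several features, one simple option is to compute it per feature and aggregate, though that choice is also an assumption of this sketch.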