Model Comparison and Evaluation

This section compares the two models used for synthesizing tabular data, a GAN and a VAE, each in a sigmoid-activation and a softmax-activation variant. We use KL (Kullback-Leibler) divergence for the statistical comparison and cosine similarity as the similarity measure.

1. Statistical Comparison - KL Divergence

1.1 GAN with Sigmoid Activation vs. GAN with Softmax Activation

  • File: tickets-gans.ipynb, Sections 1 and 2

KL Divergence Results:

  • Describe the process and results of comparing the synthesized data from GAN with Sigmoid Activation and GAN with Softmax Activation using KL divergence.
  • Interpret the KL divergence values and discuss the statistical significance.

1.2 VAE with Sigmoid Activation vs. VAE with Softmax Activation

  • File: tickets-gans.ipynb, Sections 3 and 4

KL Divergence Results:

  • Describe the process and results of comparing the synthesized data from VAE with Sigmoid Activation and VAE with Softmax Activation using KL divergence.
  • Interpret the KL divergence values and discuss the statistical significance.
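
As a reference point for the KL divergence comparisons in this section, the following is a minimal sketch of one way to estimate KL(P || Q) for a single numeric column, assuming both samples are binned into shared histograms. The function name `kl_divergence` and the parameters `bins` and `eps` are illustrative assumptions and are not taken from tickets-gans.ipynb; the notebook may compute the measure differently.

```python
import numpy as np


def kl_divergence(p_samples, q_samples, bins=20, eps=1e-10):
    """Estimate KL(P || Q) for one numeric column using shared-bin histograms.

    Hypothetical helper for illustration; not the notebook's implementation.
    """
    p_samples = np.asarray(p_samples, dtype=float)
    q_samples = np.asarray(q_samples, dtype=float)

    # Use the same bin edges for both samples so the histograms are comparable.
    lo = min(p_samples.min(), q_samples.min())
    hi = max(p_samples.max(), q_samples.max())
    edges = np.linspace(lo, hi, bins + 1)

    p_counts, _ = np.histogram(p_samples, bins=edges)
    q_counts, _ = np.histogram(q_samples, bins=edges)

    # Add a small constant to avoid log(0) and division by zero, then normalize.
    p = (p_counts + eps) / (p_counts + eps).sum()
    q = (q_counts + eps) / (q_counts + eps).sum()

    return float(np.sum(p * np.log(p / q)))
```

A value near 0 means the two binned distributions are nearly identical, and larger values mean they differ more. KL divergence is not symmetric, so the order of the arguments matters, and the estimate depends on the chosen number of bins.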

2. Similarity Measures - Cosine Similarity

2.1 GAN with Sigmoid Activation vs. GAN with Softmax Activation

  • File: tickets-gans.ipynb, Sections 1 and 2

Cosine Similarity Results:

  • Describe the process and results of comparing the synthesized data from GAN with Sigmoid Activation and GAN with Softmax Activation using cosine similarity.
  • Discuss the implications of the cosine similarity values on the similarity between the datasets.

2.2 VAE with Sigmoid Activation vs. VAE with Softmax Activation

  • File: tickets-gans.ipynb, Sections 3 and 4

Cosine Similarity Results:

  • Describe the process and results of comparing the synthesized data from VAE with Sigmoid Activation and VAE with Softmax Activation using cosine similarity.
  • Discuss the implications of the cosine similarity values on the similarity between the datasets.
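
For the cosine similarity comparisons in this section, the sketch below shows one possible way to reduce two numeric tables to a single score, assuming both tables share the same columns and that comparing their column-mean vectors is an acceptable summary. The names `cosine_similarity` and `dataset_cosine_similarity` are hypothetical and are not taken from tickets-gans.ipynb; a row-wise comparison would be an equally valid alternative.

```python
import numpy as np


def cosine_similarity(u, v):
    """Cosine of the angle between two 1-D vectors; 0.0 if either is all zeros."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    if denom == 0.0:
        return 0.0
    return float(np.dot(u, v) / denom)


def dataset_cosine_similarity(table_a, table_b):
    """Compare two numeric tables (rows x columns) via their column-mean vectors.

    Assumption for illustration: both tables have the same columns in the
    same order. Not the notebook's implementation.
    """
    table_a = np.asarray(table_a, dtype=float)
    table_b = np.asarray(table_b, dtype=float)
    return cosine_similarity(table_a.mean(axis=0), table_b.mean(axis=0))
```

Values close to 1 indicate that the two mean vectors point in nearly the same direction. Cosine similarity ignores magnitude, so two datasets whose values differ by a constant scale factor would still score close to 1.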

Conclusion

  • Summarize the findings from both statistical comparison and similarity measures.
  • Provide insights into the strengths and limitations of each model.
  • Discuss potential areas for improvement or further exploration.

Together, these comparisons show how closely the synthesized data aligns with the original data and how the two models differ in achieving this goal.

Models link: https://drive.google.com/drive/u/2/folders/1eEw2LhuK3aMJziK-6VmmvY3kVS-44vxg