Sampling method parameters are unclear and it crashes Google Colab
tirthajyoti opened this issue · 2 comments
Hi,
I am trying out this excellent implementation of tabular-GAN.
While the parameters for the `fit` method are well illustrated in the docs and the Jupyter notebook example, the `sample` method parameters are not.
Currently, the `sample` method runs very slowly. That may be fine for a GAN to generate data, but is there a way to control how many iterations it will run before generating the data? Can some more documentation be added for the `sample` method?
For example, I was just trying out your example notebook in Google Colab and the instance crashed after using all the memory!
Fitting was fine. I changed the number of steps/epochs to 1000. Crash happened after 44 minutes of GPU compute for sampling. Output log showed 144328it [44:19, 54.66it/s]
Is this normal?
@tirthajyoti you can find all the API documentation here: https://dai-lab.github.io/TGAN/api/tgan.model.html
In particular, the `sample` method is here: https://dai-lab.github.io/TGAN/api/tgan.model.html#tgan.model.TGANModel.sample
As you can see, there isn't much documentation about the sampling arguments because there is only one argument, which needs no documentation: the number of samples to generate.
Regarding Google Colab and the memory consumption, we cannot tell you whether it is normal or not because we do not have any insight into your data. Also, we have never executed TGAN on Google Colab ourselves.
However, yes, TGAN is memory intensive, just like any other GAN or data synthesis tool, and yes, it's normal that the memory consumption increases during sampling, as you are generating and trying to allocate new data that didn't exist before you started sampling.
One option that you have, if you have limited resources and can fit but not sample, is to delete the data variables and collect garbage after fitting, before sampling, to make sure that you have enough space to allocate the new data as it is generated.
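A minimal sketch of that suggestion, in case it helps. It assumes only that the model object exposes `fit(data)` and `sample(num_samples)`; the helper name `fit_then_sample` and the `load_data` callable are hypothetical and not part of TGAN. Note that this only frees memory if nothing else (including the model itself) still holds a reference to the training data.

```python
import gc


def fit_then_sample(model, load_data, num_samples):
    """Fit `model`, release the training data, then sample.

    `model` is assumed to expose fit(data) and sample(num_samples).
    `load_data` is a hypothetical zero-argument callable that returns
    the training data, so no other reference to it exists outside.
    """
    data = load_data()
    model.fit(data)

    # Drop our only reference to the training data, then ask the
    # garbage collector to reclaim unreachable objects right away.
    del data
    gc.collect()

    return model.sample(num_samples)
```

In a notebook, the same idea would be to run `del data` followed by `gc.collect()` in a cell between the fitting cell and the sampling cell.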
Hi @qwerkkk, I think that you are hitting a different issue. Please have a look at issue #41, which I just opened.
As you can see there, there is a problem that makes TGAN go into an infinite loop if the number of samples is lower than the batch_size, which defaults to 200.
So, in your case, the snippet of code that you pasted will never end. However, if you change SAMPLES to 200 or higher, the fit process will take a couple of minutes and then the sampling will be almost immediate.
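Until that is fixed, one possible workaround is to never request fewer samples than the batch size and to discard the extra rows afterwards. This is only a sketch: the helper name `safe_sample` is hypothetical, the default of 200 is the `batch_size` default mentioned above, and it assumes that the object returned by `sample` supports slicing (as a pandas DataFrame or a list does).

```python
def safe_sample(model, num_samples, batch_size=200):
    """Request at least `batch_size` samples, then keep `num_samples`.

    `model` is assumed to expose sample(num_samples). `batch_size`
    should match the value the model was created with (200 here is
    the default mentioned in this thread).
    """
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")

    # Asking for fewer samples than batch_size is what triggers the
    # endless loop, so round the request up to batch_size.
    requested = max(num_samples, batch_size)
    samples = model.sample(requested)

    # Keep only the number of rows that was originally asked for.
    return samples[:num_samples]
```

For example, `safe_sample(tgan, 50)` would ask a fitted model `tgan` for 200 rows and return the first 50.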