Issue with GetBatches function

Question

SnehaChou opened this issue 4 years ago · 4 comments

I am trying to replicate your code with the same dataset that you used for training. My code stops running once it enters the run_epoch function in syngcn.py.
The issue seems to be in self.getBatches(shuffle). I ran the make command and BatchGen.so was created, so I am not sure why my code stops running without any error.

Answer 1 · 2020-03-14T05:10:11.000Z

I think you can check batch_generator.cpp to see whether the path to data.txt in reset() is correct. If you change it, please run the make command again. @SnehaChou
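
One way to rule out the path problem from the Python side is to confirm that the file exists and is not empty before the library is used. This is only a rough sketch: data_file_ok is a hypothetical helper, not part of the repository, and the path you pass to it must be the same one that reset() in batch_generator.cpp opens.

```python
import os


def data_file_ok(path):
    """Return True if `path` is an existing, non-empty regular file.

    `path` is an assumption: pass whatever path reset() opens in
    batch_generator.cpp, resolved from the directory you run syngcn.py in.
    """
    return os.path.isfile(path) and os.path.getsize(path) > 0
```

If the path in the cpp file is relative, run the check from the same working directory you launch training from, since a relative path resolves against the current directory.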

Answer 2 · 2020-03-14T22:22:15.000Z

Thanks @cairouchong. I checked the path in the reset function in batch_generator.cpp and it was correct. However, when I commented out the line "self.lib.reset()" in syngcn.py, the code started running fine. I am not sure whether that is the right way to do it.

Also, even after 25 epochs my training loss is NaN. Is that normal, or is it caused by some other issue?

Answer 3 · 2020-03-15T15:08:59.000Z

Hi @SnehaChou,
Commenting out self.lib.reset() should not be the solution. Also, you should not get NaN as the loss; it seems like something is not working correctly. Please verify that all the dependencies are met and that the batches are being generated properly. You can try using pdb.set_trace() in the batch generation function to see whether you are getting anything from the cpp file.
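
For example, something along these lines might help narrow it down. It is a minimal sketch using only the Python standard library, not code from the repository: batch_problems and arm_hang_watchdog are hypothetical helpers, and the batch is assumed to be a flat sequence of numbers, which may not match the real layout returned by the batch generator.

```python
import faulthandler
import math
import sys


def batch_problems(batch):
    """Return a list of problems found in one batch (empty list if none).

    Assumes `batch` is a flat sequence of ints/floats (hypothetical layout).
    """
    problems = []
    if len(batch) == 0:
        problems.append("batch is empty")
    # Collect positions holding NaN or +/-inf.
    bad = [i for i, v in enumerate(batch) if not math.isfinite(v)]
    if bad:
        problems.append(f"non-finite values at positions {bad}")
    return problems


def arm_hang_watchdog(seconds=60):
    """Dump the Python stack to stderr if the process is still running
    after `seconds`, which shows the call it is stuck in."""
    faulthandler.dump_traceback_later(seconds, file=sys.stderr)
```

Calling arm_hang_watchdog() before the training loop and batch_problems() on the first few batches should show whether the run is stuck inside the call into the shared library or whether bad values are coming out of it. faulthandler.cancel_dump_traceback_later() disarms the watchdog once the batches look fine.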

Answer 4 · 2020-03-15T15:12:14.000Z

Okay @svjan5. I will look into it. Thanks