Unable to find the Discriminator Instance & the saved disc_path

Question

Closed this issue 5 years ago · 3 comments

ayush2051 commented 5 years ago

Hi Tom,
I am trying to use the policy gradient technique with the LM and QA saved models from the latest commit. However, I wasn't able to find the saved model for the Discriminator instance, i.e. the disc_emb files, etc., which are used in the calculation of RL_score. Can you help me with this?

Answer 1 · 2019-07-30T13:03:29.000Z

Sorry, I don't have a pre-trained discriminator model that I can provide. The discriminator reward didn't work very well anyway, so your best bet would be to hard-code those values to zero and comment out the code that creates the discriminator model.
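As a rough illustration of what hard-coding the discriminator reward to zero could look like, here is a minimal sketch. It is not the repository's actual code: the function name `rl_scores`, its arguments, and the plain sum of rewards are all assumptions made for the example.

```python
from typing import List, Optional, Sequence


def rl_scores(
    lm_scores: Sequence[float],
    qa_scores: Sequence[float],
    disc_scores: Optional[Sequence[float]] = None,
) -> List[float]:
    """Combine per-sample rewards into one score (hypothetical helper).

    Assumption: the overall reward is a plain sum of the LM, QA and
    discriminator rewards. When no discriminator is available, its
    reward is hard-coded to zero for every sample.
    """
    if disc_scores is None:
        # No discriminator model was built, so it contributes nothing.
        disc_scores = [0.0] * len(lm_scores)
    return [lm + qa + d for lm, qa, d in zip(lm_scores, qa_scores, disc_scores)]


# Without a discriminator, only the LM and QA rewards count.
scores = rl_scores([0.5, 1.0], [0.25, 0.5])
```

With this shape, the code that builds and loads the discriminator can be commented out, since nothing reads its output any more.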

Answer 2 · 2019-08-01T15:27:42.000Z

Tom, I tried to fine-tune it further using RL but ran into an out-of-memory runtime error, even after setting CUDA_VISIBLE_DEVICES. I guess it's a nested tqdm error or maybe a tqdm version compatibility issue. Am I fine-tuning with RL correctly, or are there specific changes I need to make to use policy gradient?

Secondly, in your experiments you didn't find any significant improvement even with RL, right? So should I perhaps stick to experimenting with the Seq2Seq model only?

Answer 3 · 2019-08-05T12:24:08.000Z

You will probably have to reduce the batch size when using RL. The tqdm is a red herring: it shows up because tqdm catches the exception inside the main loop and rethrows it.
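One way to find a workable batch size is to halve it until a training step fits in memory. The sketch below is framework-agnostic and purely illustrative: `largest_fitting_batch_size`, `run_step`, and `fake_step` are hypothetical names, and the exception type to catch is an assumption that depends on the framework in use (`MemoryError` is only a stand-in here).

```python
from typing import Callable, Tuple, Type


def largest_fitting_batch_size(
    run_step: Callable[[int], None],
    start: int,
    minimum: int = 1,
    oom_errors: Tuple[Type[BaseException], ...] = (MemoryError,),
) -> int:
    """Halve the batch size until one training step runs without an OOM error."""
    batch_size = start
    while batch_size >= minimum:
        try:
            run_step(batch_size)
            return batch_size
        except oom_errors:
            # The step did not fit in memory, so retry with half the batch.
            batch_size //= 2
    raise RuntimeError("no batch size >= minimum fits in memory")


def fake_step(batch_size: int) -> None:
    """Stand-in for a real RL training step: pretend anything above 16 runs out of memory."""
    if batch_size > 16:
        raise MemoryError("simulated out-of-memory")


best = largest_fitting_batch_size(fake_step, start=64)
```

In a real run, `run_step` would execute one RL training step at the given batch size, and `oom_errors` would be set to whatever out-of-memory exception the framework raises.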

Yes, as found in our paper, RL training didn't improve question quality with the rewards we tried.