memray/seq2seq-keyphrase-pytorch

Stop training after validation

Closed this issue · 6 comments

Hi Rui,

The training process stopped after validation. How can I change the settings so that training resumes after validation?
Thanks!

Can you provide some more information to help me locate the problem? I think it's due to a buggy setting, but I don't have any clue yet. Can you confirm whether the program has actually exited, or has it just stopped printing output?

Hi Rui,
Sorry, it was my mistake. I thought the training was suspended because the GPU utilization suddenly dropped from 90% to 3% for a long period of time. Now I realize that the validation process takes much longer than I expected. Is there a command-line option I can use to shorten the validation time? For example, could I use just a small part of the validation set?
Training with a GPU is very expensive, and I just want to save some money :)
Thank you in advance!

I found the code to reduce the validation set in train.py. Thanks!
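
For anyone landing here with the same question, the general idea can be sketched as below. This is not the actual code in train.py: `subsample_validation`, `max_examples`, and `seed` are hypothetical names, and the sketch assumes the validation set is an in-memory list of examples.

```python
import random

def subsample_validation(examples, max_examples, seed=0):
    """Return a fixed random subset of at most `max_examples` validation examples.

    Hypothetical helper, not part of this repo. A fixed seed keeps the subset
    identical across validation runs, so scores stay comparable between epochs.
    """
    if max_examples >= len(examples):
        return list(examples)
    rng = random.Random(seed)
    return rng.sample(examples, max_examples)

# Toy stand-in for a validation set of (source text, keyphrases) pairs.
valid_examples = [("source text %d" % i, ["keyphrase %d" % i]) for i in range(1000)]
small_valid = subsample_validation(valid_examples, max_examples=100)
print(len(small_valid))
```

Scores on a small subset are noisier than on the full set, so it is probably best used for monitoring progress rather than for final model selection.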

Sorry for my late reply. The validation is indeed very slow (it currently uses beam search; we could probably change it to a simple perplexity computation). Maybe you can skip the validation: just train for about 5 epochs on GPU (I guess that should be enough) and test on CPU.
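
To illustrate why the perplexity option mentioned above would be cheaper: perplexity only needs the log-probability the model assigns to each gold target token, which takes one teacher-forced forward pass per batch instead of a beam search. A minimal sketch, assuming those per-token natural-log probabilities have already been collected; the function name and input layout are hypothetical and not part of this repo.

```python
import math

def perplexity(token_log_probs):
    """Corpus-level perplexity: exp of the mean negative log-likelihood per token.

    `token_log_probs` is assumed to be a list of sequences, each a list of
    natural-log probabilities the model gave to the gold target tokens.
    """
    total_nll = 0.0
    total_tokens = 0
    for seq in token_log_probs:
        total_nll -= sum(seq)
        total_tokens += len(seq)
    if total_tokens == 0:
        raise ValueError("no target tokens to score")
    return math.exp(total_nll / total_tokens)

# Toy example: two target sequences with made-up token probabilities.
log_probs = [[math.log(0.5), math.log(0.25)], [math.log(0.5)]]
print(perplexity(log_probs))
```

Lower is better. Note that perplexity does not directly measure keyphrase extraction quality, so it is only a rough proxy for the beam-search metrics.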

@memray I am training the model on CPU. How many epochs do you recommend I run?

It's the same as using GPU, just slower.