Saving and loading model seems to be regressing to lower performance
aperiamegh opened this issue · 1 comment
Hi, it is a great experience working with your library so far. I wanted to ask what the best way is to save and load the model.
I save:
model = ViT(hyperparams)
train(model)
torch.save(model.state_dict(), save_loc + f"model_e{epoch+1}.pth")
I load:
model = ViT(hyperparams) # exact same
model.load_state_dict(torch.load(f"models/VITM/model_e13.pth"))
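For reference, here is a minimal, self-contained sketch of that save/load round trip (using a hypothetical stand-in model rather than the actual ViT, whose definition isn't shown). Saved parameters should reload bit-for-bit identical, so the architecture-plus-state_dict pattern above is correct in itself:

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical stand-in for ViT(hyperparams); the round-trip pattern is identical.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))

# Save only the learned parameters.
path = os.path.join(tempfile.mkdtemp(), "model.pth")
torch.save(model.state_dict(), path)

# Rebuild the exact same architecture, then load the weights back in.
reloaded = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
reloaded.load_state_dict(torch.load(path, map_location="cpu"))

# Every tensor should round-trip exactly.
for (name, p1), (_, p2) in zip(model.state_dict().items(),
                               reloaded.state_dict().items()):
    assert torch.equal(p1, p2), name
print("round-trip OK")
```

If the reload itself were lossy, `load_state_dict` (with its default `strict=True`) would raise on missing or mismatched keys, so silent weight corruption is not the likely culprit here.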
However, the loaded model gets lower train and test scores on the exact same dataset. Is there anything else I need to save that I am missing? What could be the reason for this?
I am using 0.1 dropouts, but I do not expect that to cause this discrepancy.
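That expectation is right: under `model.eval()` a `Dropout` layer becomes the identity, so dropout cannot cause a train/eval gap. A quick sketch confirming this (toy tensor, not the actual model):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.1)
x = torch.ones(1000)

# Train mode: kept units are rescaled by 1/(1-p), so the output differs from x.
drop.train()
assert not torch.equal(drop(x), x)

# Eval mode: dropout is a no-op, output equals the input exactly.
drop.eval()
assert torch.equal(drop(x), x)
print("dropout is identity in eval mode")
```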
My eval setup just re-ran the training part of the code with model.eval(). The loss.backward() and optimizer.step() calls weren't commented out because I assumed they would have no effect in eval mode. In fact they were still updating the weights, which corrupted the loaded checkpoint as it evaluated; when I evaluated only on the test set (with no backward/step), things worked as expected.
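That matches how PyTorch works: `model.eval()` only switches layer behavior (dropout off, batchnorm uses running stats); it does not disable autograd or optimizer updates. A minimal sketch (hypothetical toy model) demonstrating the pitfall and the safe evaluation pattern:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

before = model.weight.detach().clone()

model.eval()  # eval mode...
loss = model(torch.randn(8, 4)).pow(2).mean()
loss.backward()
opt.step()    # ...but the weights still get updated

assert not torch.equal(before, model.weight)  # checkpoint silently changed

# Safe pattern: no backward/step, and torch.no_grad() to skip autograd entirely.
model.eval()
with torch.no_grad():
    preds = model(torch.randn(8, 4))
print("weights changed under eval():", not torch.equal(before, model.weight))
```

Wrapping the eval pass in `torch.no_grad()` (or `torch.inference_mode()`) makes this class of bug impossible, since `backward()` would raise without a graph.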