lxtGH/SFSegNets

Unable to reproduce training

lucasjinreal opened this issue · 6 comments

The loss diverges in epoch 2, even after I lower the learning rate:

image

My PyTorch version is 1.7. @lxtGH Please don't list PyTorch >= 1.2 as the requirement if you haven't tested higher versions.

At least on my side, PyTorch 1.7 doesn't support a lot of the APIs used in your old code.

lxtGH commented

It may be a version problem. My environment is pt1.3 and pt1.4.

@lxtGH Thanks for your reply. Would you help test on PyTorch 1.7? It might be a version issue, but I don't know where it would be. The code runs, but it doesn't make sense that a PyTorch upgrade would break loss convergence.

lxtGH commented

Hi! I will try running the code with pt1.4 and pt1.6, since pt1.7 is not supported by our server. I will post an update once I have the results. @jinfagang

@lxtGH Thanks. Once you're done, please add more details, such as the model architecture used, lr, number of GPUs, etc. A loss curve would be even better.
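
For reference, here is a minimal sketch of the kind of report that would make runs comparable across PyTorch versions. It is only an illustration: `collect_env_report` and the keys passed in `extra` are hypothetical names, not part of this repo, and the example values are placeholders rather than the settings of any actual run.

```python
import importlib
import json
import platform


def collect_env_report(extra=None):
    """Gather version and hardware details worth attaching to a training report."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # PyTorch is not installed here; record that instead of failing.
        report["torch"] = None
    else:
        report["torch"] = torch.__version__
        report["cuda"] = torch.version.cuda  # None on CPU-only builds
        report["gpu_count"] = torch.cuda.device_count()
    # Hypothetical run settings (model arch, lr, ...) supplied by the caller.
    report.update(extra or {})
    return report


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    print(json.dumps(collect_env_report({"arch": "<model name>", "lr": 0.01}), indent=2))
```

Pasting this output next to the loss curve should make it easier to tell whether two runs differ only in the PyTorch version.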

lxtGH commented

For pt1.4 and pt1.6 I used the default settings in the config. The results are normal.