Training metrics are different for the same config and same seed for different runs
Westerby opened this issue · 1 comments
Is there an existing issue for this?
- I have searched the existing issues
Problem summary
Training metrics from consecutive training runs with the same parameters differ for our own configs.
Code for reproduction
Can provide our config code via e-mail or private message.
Actual outcome
We developed our configs against InnerEye commit 8495a2e.
When training on 8495a2e, the resulting metrics are identical across runs with the same seed.
When training on the latest commit, d902e02, the resulting metrics are slightly different across runs with the same seed.
We ran a separate test with the Lung.py config that ships with InnerEye, and its metrics are identical on both 8495a2e and d902e02.
Our config uses only the random module from the standard library, which should be seeded by PyTorch Lightning's seed_everything function. We spot-checked some of the randomly generated numbers in our module, and they were identical across runs.
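The check above can be sketched with the standard library alone (seed_everything ultimately calls random.seed for the stdlib RNG, among other seeds); the helper name below is hypothetical:

```python
import random

def sample_after_seed(seed: int, n: int = 5) -> list:
    """Re-seed the stdlib RNG and draw n values; hypothetical helper."""
    random.seed(seed)
    return [random.random() for _ in range(n)]

# Two "runs" with the same seed should produce identical draws, which is
# what we observed for our module's stdlib random numbers.
run_a = sample_after_seed(42)
run_b = sample_after_seed(42)
assert run_a == run_b
```

This is why we initially ruled out our own use of random: the stdlib RNG was reproducible, so the nondeterminism had to come from elsewhere in the training stack.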
Error messages
No response
Expected outcome
We want to have exactly the same results for all runs with the same config and seed value.
System info
System: Ubuntu 18.04.5 LTS
env.txt
Okay, a new parameter was added to LightningContainer, which we missed when upgrading the InnerEye version:
self.pl_deterministic = True
With this added, reproducibility works.
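A minimal sketch of where the flag goes in a container config (LightningContainer is stubbed here for illustration; the real base class lives in InnerEye, and the assumed default is taken from the behavior described in this issue):

```python
class LightningContainer:
    """Stand-in stub for InnerEye's real LightningContainer base class."""
    def __init__(self):
        # Assumed default in newer InnerEye versions; without the flag,
        # nondeterministic kernels may make metrics drift between runs.
        self.pl_deterministic = False

class MyConfig(LightningContainer):
    def __init__(self):
        super().__init__()
        # Opt in to deterministic training so repeated runs with the
        # same seed produce identical metrics.
        self.pl_deterministic = True
```

In PyTorch Lightning terms, this kind of flag corresponds to constructing the Trainer with deterministic=True, which forces deterministic algorithm choices at some cost in speed.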