Is it normal that I cannot exactly reproduce the results?
I am trying to reproduce the results of DeepFM on the criteo_x4_001 dataset. I have set up my environment as follows:
python 3.6.13
cuda 11.7
torch 1.10.2
fuxictr 1.0.2
h5py 3.1.0
numpy 1.19.5
pandas 1.1.5
scipy 1.5.4
It is not exactly the same as the environment in this repo, but at least I have installed fuxictr version 1.0.2 exactly.
Then I followed the config at
https://github.com/reczoo/BARS/tree/main/ranking/ctr/DeepFM/DeepFM_criteo_x4_001
The results were slightly different. I noticed that the AUC in the original experiment made a big jump from 0.809407 to 0.813303 between epochs 4 and 5. My results also showed such a jump, but it came later, between epochs 8 and 9, where the AUC jumped from 0.809898 to 0.813443.
Yes, this is quite normal. You cannot expect to get identical results in a different environment, and training on GPUs is generally non-deterministic.
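For anyone who wants to narrow the gap between runs on the same machine, here is a minimal sketch of pinning the usual random seeds. The function name `seed_everything` and the seed value 2019 are hypothetical choices for illustration, not taken from the benchmark config. NumPy and PyTorch are seeded only if they are installed. Even with all of this, results are not guaranteed to match across different hardware, CUDA, or library versions.

```python
import os
import random


def seed_everything(seed=2019):
    """Seed the random number generators available in this environment.

    Hypothetical helper for illustration; the default seed is arbitrary.
    """
    # Only affects Python processes started after this point.
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)

    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed, nothing to seed

    try:
        import torch
        torch.manual_seed(seed)
        # Documented as safe to call when CUDA is not available.
        torch.cuda.manual_seed_all(seed)
        # Ask cuDNN for deterministic kernels and disable autotuning,
        # which can otherwise pick different algorithms between runs.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass  # PyTorch not installed, nothing to seed


if __name__ == "__main__":
    seed_everything(2019)
    first = random.random()
    seed_everything(2019)
    second = random.random()
    print(first == second)
```

Call it once at the start of the training script, before the model and data loaders are created. Some GPU operations may still be non-deterministic after this, so small differences like the shifted AUC jump above can remain.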