There seems to be no reproducibility in FLSim
🐛 Bug
We added the code below to the function run(), before the call to main(), in cifar10_example.py and ran cifar10_example.py as described in the tutorial. However, three runs of the same code produced three different results.
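# Assumes random, numpy (as np), and torch are already imported.
SEED = 1234
random.seed(SEED)
np.random.seed(SEED)
torch.manual_seed(SEED)
torch.cuda.manual_seed(SEED)
if torch.cuda.is_available():
    # Ask cuDNN for deterministic kernels and disable autotuning.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False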
To Reproduce
See the description.
Expected behavior
We should get the same results when running the same code in the same environment.
Environment
- PyTorch Version (e.g., 1.0): 1.12.1+cu116
- OS (e.g., Linux): Ubuntu 22.04.1 LTS (x86_64)
- How you installed PyTorch (conda, pip, source): pip
- Build command you used (if compiling from source):
- Python version: 3.9.15
- CUDA/cuDNN version: 11.6/
- GPU models and configuration: GPU 0: NVIDIA GeForce MX450
- Any other relevant information:
Thank you. I think I have solved this problem.
The cause is self.rng = torch.Generator() (line 131) in class ActiveUserSelector in simple_user_selector.py, which makes get_user_indices() return different results on each run. (Note that this experiment still includes the seeding code from my previous comment.) To make the code reproducible, we can set self.rng = None and comment out lines 132-135. Alternatively, we can use the interface that FLSim already provides, i.e., add "user_selector_seed": 1234 to cifar10_config.json.
The results verified our conclusions above.
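The diagnosis above rests on a general point: a dedicated generator object keeps its own state, so seeding the global generators (random.seed, torch.manual_seed) does not reach it. Below is a minimal sketch of that effect using only the standard library, with random.Random standing in for torch.Generator. SketchUserSelector and select_rounds are hypothetical names made up for this illustration; this is not FLSim's actual implementation.

```python
import random


class SketchUserSelector:
    """Hypothetical stand-in for a user selector; not FLSim's actual class."""

    def __init__(self, seed=None):
        # A dedicated generator, playing the role of a separate torch.Generator.
        # With seed=None, random.Random() seeds itself from OS entropy,
        # so a global random.seed() call has no effect on it.
        self.rng = random.Random(seed)

    def get_user_indices(self, num_total_users, users_per_round):
        # Sample distinct user indices using only the selector's own generator.
        return self.rng.sample(range(num_total_users), users_per_round)


def select_rounds(selector_seed, global_seed=1234, rounds=3):
    # Global seeding, as in the snippet from the report.
    random.seed(global_seed)
    selector = SketchUserSelector(seed=selector_seed)
    return [selector.get_user_indices(10**6, 5) for _ in range(rounds)]


# Seeding the selector itself makes the selection repeatable.
seeded_run_a = select_rounds(selector_seed=1234)
seeded_run_b = select_rounds(selector_seed=1234)
print(seeded_run_a == seeded_run_b)

# Global seeding alone does not reach the selector's own generator,
# so these two runs will almost certainly differ.
unseeded_run_a = select_rounds(selector_seed=None)
unseeded_run_b = select_rounds(selector_seed=None)
print(unseeded_run_a == unseeded_run_b)
```

The same reasoning suggests why passing an explicit selector seed, rather than relying on the global seeds, is the more robust fix.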
Thanks, yes you need to set the selector seed. Sorry for taking so long to get back to you.
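For anyone who wants to script this, here is a rough sketch of adding "user_selector_seed" to a config loaded from JSON. The key name "active_user_selector" and the nesting in example_config are assumptions made for illustration, and set_selector_seed is a hypothetical helper; check the actual cifar10_config.json for where the selector section lives.

```python
import json


def set_selector_seed(node, seed, selector_key="active_user_selector"):
    """Set "user_selector_seed" in every dict stored under selector_key.

    selector_key is an assumed name; returns how many sections were updated.
    """
    updated = 0
    if isinstance(node, dict):
        for key, value in node.items():
            if key == selector_key and isinstance(value, dict):
                value["user_selector_seed"] = seed
                updated += 1
            else:
                updated += set_selector_seed(value, seed, selector_key)
    elif isinstance(node, list):
        for item in node:
            updated += set_selector_seed(item, seed, selector_key)
    return updated


# Hypothetical, heavily trimmed config; the real file has more fields
# and may nest the selector section differently.
example_config = json.loads("""
{
  "config": {
    "trainer": {
      "active_user_selector": {}
    }
  }
}
""")

sections_updated = set_selector_seed(example_config, 1234)
print(sections_updated)
print(json.dumps(example_config, indent=2))
```

If the helper reports zero updated sections, the selector section is probably named or placed differently than assumed here, and the key should be added by hand.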