facebookresearch/FLSim

There seems to be no reproducibility in FLSim

Closed this issue · 2 comments

๐Ÿ› Bug

We added the code below to the function run(), before main(), in cifar10_example.py and ran cifar10_example.py as described in the tutorial. However, running the same code three times gave three different results, as shown below.

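import random

import numpy as np
import torch

# Seed every global random number generator used by the example.
SEED = 1234
random.seed(SEED)
np.random.seed(SEED)
torch.manual_seed(SEED)
torch.cuda.manual_seed(SEED)
if torch.cuda.is_available():
    # Ask cuDNN for deterministic kernels and disable autotuning.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False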

image

To Reproduce

See the description.

Expected behavior

Running the same code in the same environment should produce the same results.
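A small self-check for this expectation, independent of FLSim, might look like the sketch below: seed the global generators, draw a few random numbers, and confirm that two seeded runs match. The helper names set_global_seed and sample_run are hypothetical and exist only for this illustration; the sketch runs on CPU and assumes torch and numpy are installed.

```python
import random

import numpy as np
import torch


def set_global_seed(seed: int) -> None:
    """Seed the global Python, NumPy, and PyTorch generators (hypothetical helper)."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)


def sample_run(seed: int) -> tuple:
    """Seed the global generators, then draw one sample from each of them."""
    set_global_seed(seed)
    return (
        random.random(),
        float(np.random.rand()),
        torch.rand(3).tolist(),
    )


first = sample_run(1234)
second = sample_run(1234)
print(first == second)  # True: both runs draw only from the seeded global generators
```

If a check like this passes while the full training run still varies, some component is probably drawing from a generator that the global seeds do not reach.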

Environment

  • PyTorch Version (e.g., 1.0): 1.12.1+cu116
  • OS (e.g., Linux): Ubuntu 22.04.1 LTS (x86_64)
  • How you installed PyTorch (conda, pip, source): pip
  • Build command you used (if compiling from source):
  • Python version: 3.9.15
  • CUDA/cuDNN version: 11.6/
  • GPU models and configuration: GPU 0: NVIDIA GeForce MX450
  • Any other relevant information:

Thank you. I think I have solved this problem.

The cause is self.rng = torch.Generator() (line 131) in class ActiveUserSelector in simple_user_selector.py, which makes get_user_indices() return different results on each run (note that the seeding code from my previous comment was already in place for this experiment). One fix is to set self.rng = None and comment out lines 132-135. Alternatively, we can use the interface provided by FLSim, i.e., add "user_selector_seed": 1234 to cifar10_config.json.
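The effect can be reproduced outside FLSim. The sketch below is not FLSim's code: ToySelector, its seed argument, and pick() are hypothetical stand-ins. It illustrates that a dedicated torch.Generator is not affected by torch.manual_seed, so a selector that reseeds its own generator nondeterministically picks different users on each run unless it is given an explicit seed.

```python
from typing import List, Optional

import torch


class ToySelector:
    """Hypothetical stand-in for a user selector that owns its own generator."""

    def __init__(self, seed: Optional[int] = None) -> None:
        self.rng = torch.Generator()
        if seed is not None:
            self.rng.manual_seed(seed)  # explicit seed: reproducible draws
        else:
            self.rng.seed()  # nondeterministic seed, unrelated to torch.manual_seed

    def pick(self, num_total_users: int, users_per_round: int) -> List[int]:
        # Sample distinct user indices using the selector's private generator.
        perm = torch.randperm(num_total_users, generator=self.rng)
        return perm[:users_per_round].tolist()


def run(selector_seed: Optional[int]) -> List[int]:
    torch.manual_seed(1234)  # global seed, as in the snippet from the report
    return ToySelector(selector_seed).pick(num_total_users=1000, users_per_round=5)


seeded_a, seeded_b = run(1234), run(1234)
print(seeded_a == seeded_b)  # True: the private generator was seeded explicitly

unseeded_a, unseeded_b = run(None), run(None)
print(unseeded_a == unseeded_b)  # almost certainly False despite the global seed
```

Passing a seed to the selector here plays the same role as setting "user_selector_seed" in the config: the global seed stays as it is, and only the selector's private generator becomes deterministic.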

image

The results confirm the conclusions above.

image

Thanks, yes you need to set the selector seed. Sorry for taking so long to get back to you.