facebookresearch/swav

A question about dataloader setting: `shuffle`?

Classmate-Huang opened this issue · 2 comments

Excellent work!

I see that your training script does not set `shuffle=True` when creating the data loader. I wonder whether this setting has any effect on performance?

Does using shuffle=True have a positive effect? Or negative effects?

Same question. Have you reached a conclusion? Thx

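An explanation:
`shuffle` defaults to `True` in `DistributedSampler`, so the data is already shuffled even though the `DataLoader` is created without `shuffle=True`. In fact, `DataLoader` raises a `ValueError` if you pass both a `sampler` and `shuffle=True`, because the two options are mutually exclusive. Note that `DistributedSampler` does not use a different random seed per process: every process builds the same permutation from `seed + epoch` and then takes its own disjoint slice of it. For that reason you should call `sampler.set_epoch(epoch)` at the start of each epoch, otherwise the same order is reused every epoch. In a distributed setup, shuffling should therefore be controlled only through the `DistributedSampler`.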
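To check this behaviour locally, here is a minimal sketch that does not need a real distributed launch. It assumes a toy 8-element dataset and passes `num_replicas` and `rank` explicitly to simulate two processes; the helper name `indices_for` is made up for illustration and is not part of this repo.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

# Toy dataset with 8 samples (assumption, for illustration only).
dataset = TensorDataset(torch.arange(8))


def indices_for(rank, epoch, world_size=2):
    """Return the indices a simulated process would see in one epoch."""
    # Passing num_replicas and rank explicitly avoids needing
    # torch.distributed.init_process_group for this demo.
    sampler = DistributedSampler(
        dataset, num_replicas=world_size, rank=rank, shuffle=True, seed=0
    )
    # All ranks seed the permutation with seed + epoch.
    sampler.set_epoch(epoch)
    return list(sampler)


rank0 = indices_for(rank=0, epoch=0)
rank1 = indices_for(rank=1, epoch=0)
print("epoch 0:", rank0, rank1)
print("epoch 1:", indices_for(0, 1), indices_for(1, 1))

# The two ranks get disjoint slices that together cover the dataset.
assert sorted(rank0 + rank1) == list(range(8))

# Correct usage: pass the sampler and leave shuffle unset.
sampler = DistributedSampler(dataset, num_replicas=2, rank=0)
loader = DataLoader(dataset, batch_size=2, sampler=sampler)

# Passing shuffle=True together with a sampler is rejected.
try:
    DataLoader(dataset, batch_size=2, sampler=sampler, shuffle=True)
    shuffle_conflict = False
except ValueError:
    shuffle_conflict = True
print("shuffle + sampler rejected:", shuffle_conflict)
```

In a real training loop you would build the sampler once and call `sampler.set_epoch(epoch)` at the top of every epoch.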