Is this package running in parallel or single worker?
spacegoing opened this issue · 8 comments
It seems like it can only use 1 worker even though I started 20.
Hi @spacegoing, I think this issue might be related to #4. I'll look into it soon.
Yes please:D
@spacegoing Do you have a minimal piece of code that I can test with?
@msamogh Sorry, I do not have a minimal one. I have a paper due tomorrow and am currently rushing to finish it. I will try to write one after that. Many thanks :D
Hi, I have been playing around with nonechucks a bit. I observed that if I use SafeDataset together with the standard DataLoader (using the default sequential sampler), my CPUs are fully loaded. However, when I use the DataLoader with SafeSampler, I usually see only one process running while the others are sleeping (probably waiting for synchronization). Could it be that the threads need to be synchronized in the SafeSampler __next__() method because of its while loop? There is a really HUGE difference in performance between using and not using SafeSampler...
However, I understand that if I use DataLoader without SafeSampler, then the sampled examples can be returned several times, which is not usable in my case.
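One way to picture the hypothesis above is a plain-Python sketch with no PyTorch or nonechucks involved. In PyTorch's DataLoader the sampler is iterated in the main process and the workers only receive indices, so any sample loading done inside a sampler's `__next__()` would run serially. The `ToyDataset` and `ValidatingSampler` classes below are hypothetical stand-ins. They assume that a "safe" sampler validates an index by loading the sample, which may or may not match what SafeSampler actually does.

```python
class ToyDataset:
    """Hypothetical dataset: every third sample is 'corrupt' and loads as None."""

    def __init__(self, n):
        self.n = n
        self.loads = 0  # counts how many times a sample was actually loaded

    def __len__(self):
        return self.n

    def __getitem__(self, i):
        self.loads += 1
        return None if i % 3 == 0 else i * i


class ValidatingSampler:
    """Hypothetical stand-in for a 'safe' sampler (not the nonechucks class).

    Assumption: an index is validated by loading its sample.
    """

    def __init__(self, dataset):
        self.dataset = dataset

    def __iter__(self):
        self.pos = 0
        return self

    def __next__(self):
        # The while loop loads samples one at a time until a valid one is found.
        # Whoever iterates the sampler pays for this loading, serially.
        while self.pos < len(self.dataset):
            i = self.pos
            self.pos += 1
            if self.dataset[i] is not None:
                return i
        raise StopIteration


dataset = ToyDataset(9)
valid_indices = list(ValidatingSampler(dataset))
print(valid_indices)  # [1, 2, 4, 5, 7, 8]
print(dataset.loads)  # 9: every sample was loaded once by the sampler itself
```

If the real sampler behaves like this sketch, every sample is loaded once in the process that iterates the sampler before any worker sees its index, which would be consistent with one busy process and the rest idle.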
@brejchajan Thanks for the detailed description. Could you open this as a new issue?
@spacegoing I'll assume this issue to be solved and close it. If your issue still isn't resolved, please let me know.