question on async sampler
jnhwkim opened this issue · 1 comments
jnhwkim commented
Using the `coco2` branch, I extended the code to another dataset, Visual QA. With respect to multithreading, I didn't gain any speedup with the following setup:
- iowait: 9%
- read/s: 20 MB/s
- batch size: 100
- image feature (binary): 400 KB
- nThread: 2 or 4
I don't know why read/s is so low; it is similar to the synchronous case (to be sure, I did call `async()` on `dp.RandomSampler`, and it works fine).
When I check the queue sizes in real time, they stay at 4 for `self._send_batches` and 1 for `self._recv_batches` (nThread=4).
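As a rough sanity check on the figures above, here is a minimal sketch of the implied loading time per batch. It assumes that each sample reads exactly one 400 KB feature file, that the reported read rate is in megabytes per second, and that 1 MB = 1000 KB. The function name `batch_load_seconds` is made up for illustration and is not part of dp.

```python
def batch_load_seconds(batch_size, feature_kb, read_mb_per_s):
    """Seconds needed to read one batch at a given sustained read rate."""
    # Assumption: 1 MB = 1000 KB, one feature file per sample.
    batch_mb = batch_size * feature_kb / 1000.0
    return batch_mb / read_mb_per_s


# Figures reported in the question: batch size 100, 400 KB features, 20 MB/s.
seconds = batch_load_seconds(batch_size=100, feature_kb=400, read_mb_per_s=20)
samples_per_second = 100 / seconds
print(seconds, samples_per_second)
```

Under these assumptions a batch is about 40 MB, so at 20 MB/s it would take roughly 2 seconds to load. If the read rate does not rise when threads are added, the per-batch time would be expected to stay about the same.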
nicholas-leonard commented
@jnhwkim Not sure what is going wrong. I have always found multithreading difficult to optimize. The queue size is hardcoded in the threads package (always equal to the number of threads). I do remember seeing a small speedup myself when using `datasource:multithread()` + `sampler:async()`. But it is always disappointing.