nicholas-leonard/dp

question on async sampler

jnhwkim opened this issue · 1 comment

Using the coco2 branch, I extended the code to another dataset, Visual QA. With multithreading enabled, I did not see any speedup. The current status is:

  • iowait: 9%
  • read/s: 20 MB/s
  • batch size: 100
  • image feature (binary): 400 KB
  • nThread: 2 or 4

I don't know why the read rate is so low; it is about the same as with the synchronous sampler. (To be sure, I did call async() on the dp.RandomSampler, and it runs fine.)

When I check the queue sizes at run time, self._send_batches stays at 4 and self._recv_batches stays at 1 (nThread=4).
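As a rough sanity check on the figures in the list above, the time needed to read one batch at the observed rate can be estimated as follows. This is only a back-of-envelope sketch in Python, not dp code. It assumes that each example reads exactly one 400 KB feature file, that the units are decimal kilobytes and megabytes, and that nothing is served from cache. None of this is stated in the issue, and the function names are made up for illustration.

```python
# Back-of-envelope estimate from the numbers quoted in the issue.
# Assumptions (not confirmed by the issue): one 400 KB feature file per
# example, decimal units (1 MB = 1000 KB), and no caching.

FEATURE_KB = 400      # size of one binary image feature
BATCH_SIZE = 100      # examples per batch
READ_MB_PER_S = 20    # observed disk read rate


def batch_megabytes(feature_kb, batch_size):
    """Data read for one batch, in MB (hypothetical helper)."""
    return feature_kb * batch_size / 1000.0


def seconds_per_batch(feature_kb, batch_size, read_mb_per_s):
    """Time to read one batch at the given read rate (hypothetical helper)."""
    return batch_megabytes(feature_kb, batch_size) / read_mb_per_s


if __name__ == "__main__":
    mb = batch_megabytes(FEATURE_KB, BATCH_SIZE)
    secs = seconds_per_batch(FEATURE_KB, BATCH_SIZE, READ_MB_PER_S)
    print(f"{mb} MB per batch, {secs} s per batch at {READ_MB_PER_S} MB/s")
```

Under these assumptions a batch is 40 MB, which takes about 2 seconds to read at 20 MB/s. Comparing this with the measured time per batch may help show whether the read rate or something else is the limiting factor.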

@jnhwkim Not sure what is going wrong. I have always found multithreading difficult to optimize. The queue size is hardcoded in the threads package (it is always equal to the number of threads). I do remember seeing a small speedup myself when using datasource:multithread() + sampler:async(), but the gain has always been disappointing.
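To illustrate what a queue whose capacity equals the thread count means in practice, here is a small Python analogy using the standard library's bounded queue. It is not the threads package's implementation. The function name and the placeholder job tuples are hypothetical, and the only thing taken from the thread is that the capacity equals nThread.

```python
import queue

N_THREADS = 4  # matches nThread=4 in the question


def fill_until_full(capacity):
    """Enqueue placeholder jobs until a bounded queue rejects one.

    Returns how many jobs were accepted. Hypothetical helper, used only
    to show that a bounded queue caps the number of pending requests.
    """
    q = queue.Queue(maxsize=capacity)
    accepted = 0
    while True:
        try:
            q.put_nowait(("batch-request", accepted))
        except queue.Full:
            # The queue is at capacity; a blocking put would wait here.
            return accepted
        accepted += 1


if __name__ == "__main__":
    print(fill_until_full(N_THREADS))
```

In this analogy, a queue that sits at its capacity shows only that requests are submitted as fast as slots free up. It does not show by itself where the time is being spent.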