rapidsai/distributed-join

Improve overlapping efficiency


We should profile and improve the computation-communication overlap efficiency on:

  • a single DGX node with NVLink
  • multiple DGX nodes connected with InfiniBand (IB)
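
To make "overlap efficiency" measurable on both setups, one possible definition (an assumption for discussion, not a metric this repository already defines) is the fraction of hideable time that is actually hidden: time the compute-only, communication-only, and overlapped runs separately, then compare. The function name and the three timing inputs below are hypothetical; a minimal sketch:

```python
def overlap_efficiency(t_compute, t_comm, t_overlapped):
    """Return the fraction of hideable time that was hidden, clamped to [0, 1].

    t_compute:    wall time of the compute-only run (seconds)
    t_comm:       wall time of the communication-only run (seconds)
    t_overlapped: wall time of the run with both overlapped (seconds)

    Assumed definition: at best, the shorter of the two phases is fully
    hidden behind the longer one, so min(t_compute, t_comm) is the most
    time that overlap can save.
    """
    hideable = min(t_compute, t_comm)
    if hideable <= 0:
        raise ValueError("both phases must take positive time to overlap")
    # Time saved relative to running the two phases back to back.
    hidden = t_compute + t_comm - t_overlapped
    # Clamp to absorb timing noise on either side.
    return max(0.0, min(1.0, hidden / hideable))
```

With this definition, 1.0 means the shorter phase is completely hidden and 0.0 means the two phases effectively run back to back. Reporting the same number for the NVLink and IB configurations would give a common baseline before and after any tuning.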