This is fantastic, great work! Just to be clear...

Question

slerman12 opened this issue 3 years ago · 1 comment

Just making sure: this lazy wrapper somehow divvies up the computations per GPU budget, right? It doesn't just... sub-sample a smaller batch and ignore the remainder, right?

Answer 1 · 2021-12-09T08:04:09.000Z

Haha, no, it doesn't sub-sample. It divides the original batch into smaller batches and accumulates gradients accordingly, so it should effectively be the same as computing the original batch in one go. There seems to be a bug (#16), however, and I'm a little too busy this week to really look into what happened.
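To illustrate the idea (this is only a rough sketch in plain Python, not the wrapper's actual code), here is a toy one-parameter linear model with a mean-squared-error loss. The names `grad_mse`, `full_batch_grad`, and `accumulated_grad` are made up for the example. It splits a batch of 5 samples into micro-batches of 2, 2, and 1, sums the per-chunk gradients, and compares the result with the full-batch gradient. Note that each chunk is scaled by the full batch size, which is what keeps an uneven last chunk from being over-weighted.

```python
def grad_mse(w, xs, ys, scale):
    """Gradient w.r.t. scalar w of scale * sum((w*x - y)^2) over the given samples."""
    return scale * sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys))


def full_batch_grad(w, xs, ys):
    """Gradient of the mean squared error over the whole batch at once."""
    return grad_mse(w, xs, ys, 1.0 / len(xs))


def accumulated_grad(w, xs, ys, micro_batch_size):
    """Same gradient, computed chunk by chunk and summed (hypothetical helper)."""
    n = len(xs)
    total = 0.0
    for start in range(0, n, micro_batch_size):
        chunk_x = xs[start:start + micro_batch_size]
        chunk_y = ys[start:start + micro_batch_size]
        # Scale by the FULL batch size so every sample carries equal weight,
        # including the samples in a smaller final chunk.
        total += grad_mse(w, chunk_x, chunk_y, 1.0 / n)
    return total


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.9]
w = 0.5

g_full = full_batch_grad(w, xs, ys)
g_acc = accumulated_grad(w, xs, ys, micro_batch_size=2)  # chunks of 2, 2, 1
print(abs(g_full - g_acc) < 1e-9)
```

No sample is dropped: the remainder simply becomes a final, smaller chunk. The two results can differ by floating-point rounding because the summation order changes, which is why the comparison uses a tolerance rather than exact equality.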

As for devices, it currently works only on a single GPU (as a proof of concept); I'm still figuring out a way to make it work across multiple GPUs.