xrsrke/pipegoose

Dataloader and Sampler for 3D Parallelism

xrsrke opened this issue · 0 comments

When we train a model with pipeline parallelism, different stages need different data, and some stages do not load any data at all. So each stage should fetch only the data it needs, without loading the full dataset.
We also want to turn a regular PyTorch dataloader into a distributed dataloader.
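
A rough sketch of the idea, in plain Python so it runs without a GPU setup. This is not pipegoose's API: `shard_indices` and all of its parameters are hypothetical names. It assumes only the first pipeline stage (inputs) and the last pipeline stage (labels) need data, and that samples are split across data-parallel ranks by striding, roughly the way `torch.utils.data.DistributedSampler` does with `num_replicas` and `rank` when shuffling is off.

```python
import math
from typing import List


def shard_indices(
    dataset_len: int,
    dp_rank: int,
    dp_size: int,
    pp_stage: int,
    pp_size: int,
) -> List[int]:
    """Return the sample indices one process should load (hypothetical helper).

    Assumption: only the first and last pipeline stages read data;
    intermediate stages receive activations from their neighbours instead.
    """
    needs_data = pp_stage == 0 or pp_stage == pp_size - 1
    if not needs_data or dataset_len == 0:
        return []

    # Pad by wrapping around so every data-parallel rank gets the same count.
    per_rank = math.ceil(dataset_len / dp_size)
    total = per_rank * dp_size
    padded = [i % dataset_len for i in range(total)]

    # Strided split: rank r takes positions r, r + dp_size, r + 2 * dp_size, ...
    return padded[dp_rank:total:dp_size]


if __name__ == "__main__":
    # 10 samples, 4 data-parallel ranks, 3 pipeline stages.
    for stage in range(3):
        for rank in range(4):
            print(stage, rank, shard_indices(10, rank, 4, stage, 3))
```

The returned indices could then back a custom `torch.utils.data.Sampler` passed to a regular `DataLoader`, so a stage that gets an empty list never touches the dataset.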

Reading