pytorch/PiPPy

CPU offloading?

Opened this issue · 2 comments

It seems like pipelining could greatly simplify the implementation of a feature like fairscale's OffloadModel: https://fairscale.readthedocs.io/en/latest/deep_dive/offload.html

Is this feasible?

Can you elaborate more? Do you mean that one may offload the entire stage after its forward pass? (And meanwhile bring back another stage from CPU to do the next forward?)

Yes, that is what I was thinking.
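
A minimal sketch of the idea being discussed, using plain PyTorch rather than any PiPPy API (the `stages` list and `offloaded_forward` helper are hypothetical, for illustration only): keep every pipeline stage on CPU, move only the active stage to the accelerator for its forward pass, and offload it back afterwards, which is essentially what fairscale's OffloadModel does.

```python
import torch
import torch.nn as nn

# Fall back to CPU so the sketch also runs on machines without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Toy "pipeline": a list of stages, all resident on CPU initially.
stages = [nn.Linear(16, 16) for _ in range(4)]

def offloaded_forward(stages, x):
    """Run stages sequentially, keeping only one on the device at a time."""
    x = x.to(device)
    for stage in stages:
        stage.to(device)       # bring the stage onto the accelerator
        with torch.no_grad():
            x = stage(x)       # run its forward pass
        stage.to("cpu")        # offload the stage back to CPU
    return x.cpu()

out = offloaded_forward(stages, torch.randn(2, 16))
print(out.shape)  # torch.Size([2, 16])
```

In a real implementation the host-device transfer of the next stage would be overlapped with the current stage's compute (e.g. via pinned memory, `non_blocking=True` copies, and a separate CUDA stream); the sketch above does the moves synchronously for clarity.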