AetherCortex/Llama-X

Why use offload_param in CPU?

xesdiny opened this issue · 1 comments

I think the model shards for a 6.7B-parameter model can already fit in GPU memory, so why offload the parameters to CPU?

        "offload_param": {
            "device": "cpu",
            "pin_memory": true
        },

We want to train with a larger batch size.
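For context, a rough back-of-the-envelope estimate shows why this helps. This is a sketch, assuming mixed-precision Adam with the usual accounting of roughly 16 bytes per parameter (fp16 weights, fp16 gradients, and fp32 master weights plus Adam momentum and variance); actual numbers depend on the framework and config:

```python
# Rough training-memory estimate for a 6.7B-parameter model with
# mixed-precision Adam. Illustrative sketch only, not a measurement.
params = 6.7e9

bytes_weights_fp16 = 2 * params   # fp16 model weights
bytes_grads_fp16 = 2 * params     # fp16 gradients
bytes_optim_fp32 = 12 * params    # fp32 master weights + Adam momentum + variance

total_gb = (bytes_weights_fp16 + bytes_grads_fp16 + bytes_optim_fp32) / 1e9
print(f"model + optimizer state: ~{total_gb:.1f} GB")
```

Under these assumptions, model and optimizer state alone come to roughly 100 GB, far more than a single GPU's memory. Offloading parameters (and typically optimizer state) to CPU moves that footprint off the GPU, leaving GPU memory for activations, which grow with batch size.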