AetherCortex/Llama-X

Why use offload_param in CPU?

xesdiny opened this issue · 1 comments

I think the model shards for a 6.7B-parameter model can already fit in GPU memory, so why offload the parameters to CPU?

        "offload_param": {
            "device": "cpu",
            "pin_memory": true
        },

We want to train with a larger batch size.
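For context, a rough back-of-the-envelope estimate shows why this helps. This is a sketch, assuming mixed-precision Adam with the usual accounting of roughly 16 bytes per parameter (fp16 weights, fp16 gradients, and fp32 master weights plus Adam momentum and variance); actual numbers depend on the framework and config:

```python
# Rough training-memory estimate for a 6.7B-parameter model with
# mixed-precision Adam. Illustrative sketch only, not a measurement.
params = 6.7e9

bytes_weights_fp16 = 2 * params   # fp16 model weights
bytes_grads_fp16 = 2 * params     # fp16 gradients
bytes_optim_fp32 = 12 * params    # fp32 master weights + Adam momentum + variance

total_gb = (bytes_weights_fp16 + bytes_grads_fp16 + bytes_optim_fp32) / 1e9
print(f"model + optimizer state: ~{total_gb:.1f} GB")
```

Under these assumptions, model and optimizer state alone come to roughly 100 GB, far more than a single GPU's memory. Offloading parameters (and typically optimizer state) to CPU moves that footprint off the GPU, leaving GPU memory for activations, which grow with batch size.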