Lightning-AI/litgpt

Add support for memory-efficient and faster optimizers

rasbt opened this issue · 1 comment

Maybe GaLore (#1192) should switch from GaloreArgs to a more general OptimizerArgs after all. That would also make it easier to consider other variants such as BAdam (BAdam: A Memory Efficient Full Parameter Training Method for Large Language Models, https://arxiv.org/abs/2404.02827).

The experiments from here look very compelling, and it only adds one hyperparameter:

[Screenshot, 2024-04-27 at 8:36:56 AM]
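To make the GaloreArgs to OptimizerArgs idea concrete, here is a rough sketch of what an optimizer-agnostic argument class could look like. Everything in it is an assumption for illustration: the fields of `OptimizerArgs`, the `ALLOWED_EXTRA` table, the `resolve_optimizer_kwargs` helper, and the option names (`rank`, `update_interval`, `block_switch_interval`) are hypothetical placeholders, not litgpt's current API and not the real GaLore or BAdam parameter names.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class OptimizerArgs:
    """Hypothetical optimizer-agnostic settings (not litgpt's actual API)."""

    name: str = "adamw"  # which optimizer to build, e.g. "galore" or "badam"
    learning_rate: float = 3e-4
    weight_decay: float = 0.0
    # Optimizer-specific options live here instead of in a dedicated class.
    extra: Dict[str, Any] = field(default_factory=dict)


# Hypothetical allow-list of extra options per optimizer. The option names are
# placeholders for illustration, not the real GaLore or BAdam parameters.
ALLOWED_EXTRA = {
    "adamw": set(),
    "galore": {"rank", "update_interval"},
    "badam": {"block_switch_interval"},
}


def resolve_optimizer_kwargs(args: OptimizerArgs) -> Dict[str, Any]:
    """Validate args and flatten them into keyword arguments for a constructor."""
    if args.name not in ALLOWED_EXTRA:
        raise ValueError(f"Unknown optimizer: {args.name!r}")
    unknown = set(args.extra) - ALLOWED_EXTRA[args.name]
    if unknown:
        raise ValueError(f"Unsupported options for {args.name!r}: {sorted(unknown)}")
    return {"lr": args.learning_rate, "weight_decay": args.weight_decay, **args.extra}
```

With a shape like this, adding another optimizer would mostly mean adding one entry to the allow-list rather than a new `*Args` class, which is the appeal of the more general name.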

Agreed