OpenMOSS/CoLLiE

Is bf16 supported?


Is bf16 supported?

Yes, it is supported. You can enable it like this:

config.ds_config = {
    "bf16": {"enabled": True},
}
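
As a minimal sketch (the helper name `enable_bf16` is hypothetical and not part of CoLLiE or DeepSpeed), the same setting can be merged into an existing config dict without overwriting its other keys. It assumes the config is a plain Python dict, as in the snippet above, and it also switches off any `fp16` section, on the assumption that the two precision modes should not be enabled together:

```python
import copy

def enable_bf16(ds_config):
    """Return a copy of ds_config with bf16 enabled (hypothetical helper)."""
    merged = copy.deepcopy(ds_config)  # leave the caller's dict untouched
    merged.setdefault("bf16", {})["enabled"] = True
    # Assumption: fp16 and bf16 should not both be on, so disable fp16 if present.
    if "fp16" in merged:
        merged["fp16"]["enabled"] = False
    return merged

base = {"zero_optimization": {"stage": 3}, "fp16": {"enabled": True}}
new_config = enable_bf16(base)
print(new_config)
```

The result could then be assigned to `config.ds_config` in place of a hand-written dict.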

@KaiLv69
My ds_config is:

config.ds_config = {
    "bf16": {
        "enabled": True
    },
    "zero_allow_untested_optimizer": True,
    "zero_force_ds_cpu_optimizer": False,
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {
            "device": "cpu",
            "pin_memory": False
        }
    }
}

It fails with this error:
RuntimeError: output tensor must have the same type as input tensor

Hi. Are you on PyTorch 2.0? If so, you could try downgrading PyTorch. This looks like a DeepSpeed issue; see this reply: microsoft/DeepSpeed#3654 (comment)
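
To check which side of the 2.0 boundary an installation is on, a small sketch like the following may help. The function name is hypothetical, and it only parses a version string (in a real environment that string would come from `torch.__version__`); it says nothing about whether a particular DeepSpeed build is affected:

```python
import re

def is_torch_2_or_newer(version):
    """True if a version string such as '2.0.1+cu117' has major version >= 2."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)) >= 2

# In a real environment, pass torch.__version__ instead of these literals.
for v in ("1.13.1", "2.0.1+cu117"):
    print(v, is_torch_2_or_newer(v))
```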

@KaiLv69
It looks like a problem with the LOMO optimizer; torch.optim.AdamW does support bf16.

You can try the code in this PR: #74 (it has already been merged into the dev branch).