[Bug] MMDeepSpeedEngineWrapper bf16 bug

I have searched Issues and Discussions but cannot get the expected help.
The bug has not been fixed in the latest version (https://github.com/open-mmlab/mmengine).

master branch

Line 195 in e43bbb5: `new_inputs.append(v.half())`

DeepSpeed with bf16 enabled.

RuntimeError: Input type (torch.cuda.HalfTensor) and weight type (CUDABFloat16Type) should be the same

When DeepSpeed bf16 is enabled, `v.half()` should be changed to `v.to(torch.bfloat16)` so that the inputs have the same dtype as the bf16 weights.
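
A minimal sketch of what I mean, not the actual mmengine code: pick the cast dtype from the DeepSpeed config instead of always calling `.half()`. The helper names `resolve_dtype_name` and `cast_inputs` are hypothetical, and I am assuming the config is a plain dict with the usual `bf16` / `fp16` sections that each carry an `enabled` flag.

```python
from typing import Any, Mapping


def resolve_dtype_name(ds_config: Mapping[str, Any]) -> str:
    """Return the torch dtype name implied by a DeepSpeed-style config dict.

    Hypothetical helper: assumes `bf16.enabled` / `fp16.enabled` keys.
    """
    if ds_config.get("bf16", {}).get("enabled", False):
        return "bfloat16"
    if ds_config.get("fp16", {}).get("enabled", False):
        return "float16"
    return "float32"


def cast_inputs(inputs: Any, ds_config: Mapping[str, Any]) -> Any:
    """Recursively cast floating-point tensors to the configured dtype.

    Hypothetical helper, shown only to illustrate the idea.
    """
    import torch  # imported here so resolve_dtype_name works without torch

    dtype = getattr(torch, resolve_dtype_name(ds_config))
    if isinstance(inputs, torch.Tensor):
        # Leave integer/bool tensors (e.g. labels, masks) untouched.
        return inputs.to(dtype) if inputs.is_floating_point() else inputs
    if isinstance(inputs, (list, tuple)):
        return type(inputs)(cast_inputs(v, ds_config) for v in inputs)
    if isinstance(inputs, dict):
        return {k: cast_inputs(v, ds_config) for k, v in inputs.items()}
    return inputs
```

With `{"bf16": {"enabled": True}}` the float inputs would be cast to `torch.bfloat16`, which matches the weight dtype in the error above; with only `fp16` enabled the behaviour stays the same as the current `.half()` call.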

Thanks very much! We have fixed it in #1400.