TKassis opened this issue 2 years ago · 1 comment
Just to confirm, these optimizers don't support 16-bit precision training yet, correct?
They don't have native float16 support. I've used them within fairseq (which provides float16 support by wrapping the optimizer) in some of my experiments.
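For anyone curious what "wrapping the optimizer" means in practice: the core trick is dynamic loss scaling, where the loss is multiplied by a large factor before backprop so small float16 gradients don't underflow, then gradients are unscaled before the float32 parameter update. Below is a minimal, framework-free sketch of that idea; the class and method names (`FP16Wrapper`, `ToyOptimizer`) are illustrative, not fairseq's actual API.

```python
class ToyOptimizer:
    """Plain SGD on float32 'master' parameters (stand-in for a real optimizer)."""
    def __init__(self, params, lr=0.1):
        self.params = params
        self.lr = lr

    def step(self, grads):
        for i, g in enumerate(grads):
            self.params[i] -= self.lr * g


class FP16Wrapper:
    """Hypothetical sketch of a float16 wrapper with dynamic loss scaling.

    Scale the loss up before backprop so small fp16 gradients survive,
    then unscale before handing float32 gradients to the inner optimizer.
    """
    def __init__(self, optimizer, init_scale=2.0 ** 15):
        self.opt = optimizer
        self.scale = init_scale

    def scale_loss(self, loss):
        # Caller backprops through the scaled loss.
        return loss * self.scale

    def step(self, scaled_grads):
        # If any gradient overflowed to inf/nan, skip the update
        # and back off the scale; otherwise unscale and update.
        if any(g != g or abs(g) == float("inf") for g in scaled_grads):
            self.scale /= 2
            return False
        self.opt.step([g / self.scale for g in scaled_grads])
        return True


# One "training step" with a fake gradient of 0.5 (pre-scaled, as
# it would come out of fp16 backprop through the scaled loss).
opt = ToyOptimizer(params=[1.0], lr=0.1)
wrapper = FP16Wrapper(opt)
ok = wrapper.step([0.5 * wrapper.scale])
print(ok, opt.params[0])  # True 0.95
```

Real wrappers (fairseq's, or `torch.cuda.amp.GradScaler` in modern PyTorch) add details like fp32 master copies of the weights and gradually growing the scale back after a run of overflow-free steps, but the control flow is essentially the above.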