Is it worth it to use the --use_16bit flag? Doesn't it hurt the model's performance?
martinenkoEduard opened this issue · 3 comments
Theoretically, the --use_16bit flag lowers the computational precision from 32 bits to 16 bits, which implies a potential loss in accuracy. In practice, I haven't seen a noticeable accuracy impact when training in 16 bits, but it may depend on the task you are dealing with. From a computational-resources point of view, it's another story: training in float16 roughly halves RAM consumption, allowing you to fit models larger than what you could store in 32 bits. Also, on recent GPUs, 16-bit calculations will very likely speed up training (the improvement can reach a 2x factor). Be aware that this speedup does not apply to every GPU. I would advise you to check on TechPowerUp whether your GPU benefits from 16-bit calculations or not (if not, a large drop in throughput is to be expected).
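To make the precision trade-off concrete, here is a small NumPy sketch (not from this repo, just an illustration) of why float16 can lose accuracy: it has only ~3 decimal digits of precision, so a running sum of small values stalls once the increment falls below the rounding step.

```python
import numpy as np

# Machine epsilon: float16 has far coarser resolution than float32.
print(np.finfo(np.float16).eps)  # ~9.8e-4
print(np.finfo(np.float32).eps)  # ~1.2e-7

# Accumulate 0.01 ten thousand times. In float16 the sum stalls once the
# spacing between representable numbers exceeds the increment, so it ends
# well below the true value of 100. float32 gets essentially the right answer.
total16 = np.float16(0.0)
for _ in range(10_000):
    total16 = np.float16(total16 + np.float16(0.01))

total32 = np.float32(0.01) * np.float32(10_000)

print(float(total16))  # stalls far below 100
print(float(total32))  # ~100.0
```

This is why mixed-precision training frameworks typically keep a float32 master copy of the weights and use loss scaling, which is also why pure accuracy loss is often small in practice.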
Will it be the same for both training and inference?
So if I trained a model using 16-bit precision, will I be able to do inference using 32-bit precision?
I think it is doable; see for instance: https://stackoverflow.com/questions/73454134/change-dtype-of-weights-for-pytorch-pretrained-model
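As a minimal sketch of the idea from that answer (using plain NumPy arrays standing in for a saved state dict, so the names here are hypothetical, not from this repo): upcasting float16 weights to float32 is just an elementwise cast, which is lossless since every float16 value is exactly representable in float32. In PyTorch the equivalent one-liner would be `model.float()`.

```python
import numpy as np

# Hypothetical state dict as it might look after fp16 training:
# layer name -> weight array stored in float16.
state_dict_fp16 = {
    "linear.weight": np.random.randn(4, 4).astype(np.float16),
    "linear.bias": np.zeros(4, dtype=np.float16),
}

# Upcast every tensor to float32 before running fp32 inference.
# The cast is exact: float32 can represent all float16 values.
state_dict_fp32 = {k: v.astype(np.float32) for k, v in state_dict_fp16.items()}

print({k: v.dtype for k, v in state_dict_fp32.items()})
```

The reverse direction (loading fp32 weights and casting down to fp16 for inference) also works, but that cast is lossy and can occasionally affect outputs.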