LiyuanLucasLiu/RAdam

Any concern for using `math.sqrt` instead of `torch.sqrt`?

wenmin-wu opened this issue · 2 comments

I noticed that you use `math.sqrt` a lot in your implementation. Is there a reason for not using `torch.sqrt` instead? I would expect `math.sqrt` to be slower than `torch.sqrt`, since it runs on the CPU.

I believe that since those computations take plain float numbers as inputs (instead of tensors), it should not be a problem.
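To illustrate the point, here is a minimal sketch of the kind of scalar computation involved. It assumes an Adam-style bias correction; the helper name `bias_corrected_step_size` and the hyperparameter values are hypothetical and are not taken from the repository.

```python
import math


def bias_corrected_step_size(lr, beta1, beta2, step):
    """Hypothetical helper: Adam-style scalar step size for a given step count."""
    # Every input is a plain Python number, so math.sqrt operates on a float.
    # No tensor is involved at this point, on CPU or GPU.
    bias_correction1 = 1.0 - beta1 ** step
    bias_correction2 = 1.0 - beta2 ** step
    return lr * math.sqrt(bias_correction2) / bias_correction1


# Assumed example hyperparameters, for illustration only.
step_size = bias_corrected_step_size(lr=1e-3, beta1=0.9, beta2=0.999, step=1)
print(type(step_size).__name__, step_size)
```

A value like this is a Python float computed from hyperparameters and the step count, so `math.sqrt` is the natural choice for it. `torch.sqrt` expects a tensor, and wrapping a single scalar in one would likely add overhead rather than remove it.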

Got it, thanks