kssteven418/I-BERT

How is the scaling factor S implemented with integer arithmetic?


deJQK commented

Thanks for sharing the code. This is a really great paper with clear and impressive ideas. I'm sorry if I have misunderstood something, but I am a little confused about some details. The paper claims that the model is implemented with integer-only arithmetic, yet in Algorithms 1-4 as well as in Equations 1 and 2, the scaling factor S is not an integer but appears to be a floating-point number. In other words, the computation seems to require floating-point scaling, which is also what happens on this line. Have I misunderstood this, and if not, how is the scaling implemented with integer arithmetic? Thanks a lot!
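To make the question concrete: one common way to apply a non-integer scale using only integer operations is to approximate it ahead of time by a dyadic number `m / 2**e` (integer `m`, integer `e`), so that scaling at inference time becomes an integer multiply followed by a bit shift. The sketch below is only an illustration of that general technique under my own assumptions. The function names `to_dyadic` and `requantize` are hypothetical and are not taken from this repository, and I don't know whether this is what the code is meant to do.

```python
import math


def to_dyadic(scale: float, bits: int = 31) -> tuple[int, int]:
    """Approximate a positive float scale as m / 2**e with integers m and e.

    Hypothetical helper: this step uses floating point, but it would run
    once, offline, before inference.
    """
    # frexp gives scale = mantissa * 2**exponent with 0.5 <= mantissa < 1.
    mantissa, exponent = math.frexp(scale)
    m = round(mantissa * (1 << bits))  # integer multiplier
    e = bits - exponent                # right-shift amount
    return m, e


def requantize(q: int, m: int, e: int) -> int:
    """Compute round(q * m / 2**e) with integer multiply, add, and shift only."""
    # Adding half of the divisor before shifting rounds to nearest.
    return (q * m + (1 << (e - 1))) >> e


if __name__ == "__main__":
    scale = 0.0123           # example value, chosen arbitrarily
    m, e = to_dyadic(scale)  # done once, ahead of time
    print(requantize(1000, m, e), 1000 * scale)
```

In this sketch the only floating-point work is in `to_dyadic`; `requantize` touches integers only. If the scaling factors in the paper are handled along these lines, it would be helpful to know where that conversion happens in the code.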