Add a description of how to use fused_scale_softmax
loopinf opened this issue · 1 comment
loopinf commented
Describe a TODO feature
- It is hard to tell how to use the fused scale mask softmax.
- It is not documented what the scale value is or how it is used in the attention layer.
- There is no test case that checks the result for a scale value other than 1.0.
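
To make the request concrete, here is a minimal unfused reference of what I would expect such a kernel to compute. This is only a sketch under assumptions: I am assuming the scale multiplies the raw attention scores before the softmax (for example `scale = 1 / sqrt(head_dim)`) and that masked positions are filled with a large negative value, as in Megatron-style kernels. The function name `scale_mask_softmax_reference` and the fill value `-10000.0` are hypothetical and not taken from this repository.

```python
import math


def scale_mask_softmax_reference(scores, mask, scale=1.0):
    """Unfused reference (assumed semantics): softmax(scale * scores) per row.

    scores: list of rows of raw attention scores (e.g. Q @ K^T).
    mask:   same shape, True where a position must be ignored.
    scale:  multiplier applied to the scores before the softmax.
    """
    out = []
    for row, mask_row in zip(scores, mask):
        # Scale unmasked scores; replace masked ones with a large negative value.
        scaled = [-10000.0 if m else scale * s for s, m in zip(row, mask_row)]
        # Subtract the row maximum for numerical stability.
        row_max = max(scaled)
        exps = [math.exp(v - row_max) for v in scaled]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


head_dim = 4
scale = 1.0 / math.sqrt(head_dim)  # 0.5, the usual attention scaling
scores = [[2.0, 4.0, 6.0], [1.0, 1.0, 1.0]]
mask = [[False, False, True], [False, False, False]]

probs_scaled = scale_mask_softmax_reference(scores, mask, scale=scale)
probs_unscaled = scale_mask_softmax_reference(scores, mask, scale=1.0)
```

A test for `scale != 1.0` could compare the fused output against a reference like this one. With the inputs above, the ratio between the two unmasked probabilities in the first row is `exp(scale * 2)`, so it changes with the scale, which is exactly the behavior the current tests do not cover.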
hyunwoongko commented
How about holding a meeting about this?
Please see Discord.