cryscan/web-rwkv

LayerNorm improvements

Closed · 1 comment

https://fleetwood.dev/posts/layernorm-as-fast-as-possible

Your LayerNorm implementation is subject to precision errors! It's also missing the eps term.

Thanks! Earlier in development, a friend pointed out that I should use Welford's algorithm, but I was too busy at the time. Now that I have time, I will put in a quick fix.
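
For anyone reading along, here is a rough sketch of what the two points above amount to: computing mean and variance in a single pass with Welford's algorithm, and adding eps to the variance before the square root. It is plain Python for illustration only, not the web-rwkv code; the function name `layer_norm_welford`, its parameters, and the default `eps=1e-5` are assumptions made for this example.

```python
import math


def layer_norm_welford(xs, weight=None, bias=None, eps=1e-5):
    """Normalize one vector using Welford's single-pass mean/variance.

    Hypothetical helper for illustration; names and the eps default
    are assumptions, not taken from the repository.
    """
    if not xs:
        raise ValueError("xs must not be empty")

    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for count, x in enumerate(xs, start=1):
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)  # uses the updated mean

    var = m2 / len(xs)  # biased (population) variance
    inv_std = 1.0 / math.sqrt(var + eps)  # eps guards against var == 0

    out = [(x - mean) * inv_std for x in xs]
    if weight is not None:
        out = [o * w for o, w in zip(out, weight)]
    if bias is not None:
        out = [o + b for o, b in zip(out, bias)]
    return out
```

The update never forms a sum of squares, so it avoids the cancellation that the `E[x^2] - E[x]^2` formula suffers from when values are large relative to their spread. With eps present, a constant input such as `layer_norm_welford([5.0] * 4)` returns zeros instead of dividing by zero.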