bkitano/llama-from-scratch
Llama from scratch, or How to implement a paper without crying
Jupyter Notebook
Issues
- 1
next level
#3 opened by UmarIgan - 3
Incorrect RMSNorm
#4 opened by arunmallya - 1
get_rotary_matrix
#8 opened by nkkbr - 0
- 0
RoPEMaskedAttentionHead
#6 opened by nkkbr - 1
no need to softmax before cross_entropy
#5 opened by nkkbr - 1
Just to thank you!
#2 opened by Andreh1982 - 1
TypeError: MultiheadAttention.forward() got an unexpected keyword argument 'is_causal'
#1 opened by bjpcjp