A simple, tested PyTorch implementation of Llama 3 without the fairscale dependency.
It is not only simpler but also roughly 25% faster than the original implementation from Meta: https://github.com/meta-llama/llama3
If you want to understand the transformer architecture, I recommend reading my vanilla transformer implementation first, since I reuse some of that code here.
I will soon add an explanation of RoFormer (rotary position embeddings).
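In the meantime, here is a minimal sketch of the rotary position embeddings (RoPE) from the RoFormer paper, which Llama 3 applies to the query and key vectors; the function names mirror common conventions and are illustrative, not taken from this repo:

```python
# Minimal RoPE sketch: rotate each adjacent pair of head dimensions by a
# position-dependent angle, implemented via complex multiplication.
import torch

def precompute_freqs_cis(head_dim: int, seq_len: int, theta: float = 10000.0) -> torch.Tensor:
    # One rotation frequency per pair of dimensions.
    freqs = 1.0 / (theta ** (torch.arange(0, head_dim, 2).float() / head_dim))
    t = torch.arange(seq_len).float()
    freqs = torch.outer(t, freqs)                      # (seq_len, head_dim // 2)
    return torch.polar(torch.ones_like(freqs), freqs)  # unit complex numbers e^{i*m*theta_j}

def apply_rotary_emb(x: torch.Tensor, freqs_cis: torch.Tensor) -> torch.Tensor:
    # x: (batch, seq_len, n_heads, head_dim). View pairs of dims as complex numbers,
    # multiply by the precomputed unit phasors, and convert back.
    x_complex = torch.view_as_complex(x.float().reshape(*x.shape[:-1], -1, 2))
    freqs_cis = freqs_cis[None, :, None, :]            # broadcast over batch and heads
    return torch.view_as_real(x_complex * freqs_cis).flatten(-2).type_as(x)

q = torch.randn(1, 8, 4, 16)                           # (batch, seq, heads, head_dim)
freqs_cis = precompute_freqs_cis(head_dim=16, seq_len=8)
q_rot = apply_rotary_emb(q, freqs_cis)
print(q_rot.shape)  # torch.Size([1, 8, 4, 16])
```

Because each pair of dimensions is multiplied by a unit-magnitude complex number, the rotation changes directions but preserves vector norms, which is why RoPE encodes position without distorting attention score magnitudes.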