Inference Llama 2 in one file of pure C#

llama2.net

This is a pure C# port of Alfonso² Peterssen's Java port of Andrej Karpathy's awesome llama2.c, a very simple implementation for running inference on models with a Llama 2-like transformer-based LLM architecture.

Build

Building requires only the .NET 7 SDK.
At run time, the program expects tokenizer.bin in the current directory.
The sample stories15M.bin model can be found here

To build and run:

dotnet run -c Release stories15M.bin
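
Since a run depends on tokenizer.bin being in the current directory and on a model file being passed as an argument, a small pre-flight check can catch a missing file before the build starts. The following is only a sketch: the function name check_llama2_inputs is hypothetical and not part of llama2.net, and it assumes a POSIX shell.

```shell
# Hypothetical helper, not part of llama2.net: verify that the files
# needed for a run exist before invoking dotnet.
check_llama2_inputs() {
    # Model checkpoint path; defaults to the sample model name.
    model="${1:-stories15M.bin}"
    missing=0
    # tokenizer.bin is looked up in the current directory.
    for f in tokenizer.bin "$model"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    # 0 when both files are present, 1 otherwise.
    return "$missing"
}
```

With the function defined, `check_llama2_inputs stories15M.bin && dotnet run -c Release stories15M.bin` starts the build only when both files are present.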