This project was derived from https://github.com/karpathy/llama2.c to run multi-threaded inference. Running inference with this Rust port is 3+ times faster than with the original llama2.c.
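Most of the speedup in this kind of port comes from parallelizing the matrix-vector products that dominate transformer inference. Below is a minimal sketch of how such a matmul can be spread across threads with the `rayon` crate; the function name `matmul` and the layout (row-major weights `w` of shape `n x d` applied to an activation vector `x` of length `d`) mirror llama2.c's convention, but this is an illustrative assumption, not necessarily this repository's exact implementation.

```rust
// Assumes a Cargo dependency: rayon = "1".
// Illustrative sketch only; not necessarily the code used in this repo.
use rayon::prelude::*;

/// Computes xout = W @ x, where W is n rows x d cols, stored row-major.
/// Each output element is an independent dot product, so the rows can
/// be computed on separate threads without any synchronization.
fn matmul(xout: &mut [f32], x: &[f32], w: &[f32], n: usize, d: usize) {
    debug_assert_eq!(xout.len(), n);
    debug_assert_eq!(x.len(), d);
    xout.par_iter_mut().enumerate().for_each(|(i, out)| {
        let row = &w[i * d..(i + 1) * d];
        *out = row.iter().zip(x.iter()).map(|(&wi, &xi)| wi * xi).sum();
    });
}

fn main() {
    // Tiny smoke test: a 2x3 matrix times a length-3 vector.
    let w = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
    let x = [1.0, 0.5, 2.0];
    let mut xout = [0.0f32; 2];
    matmul(&mut xout, &x, &w, 2, 3);
    println!("{:?}", xout); // [8.0, 18.5]
}
```

Because each row's dot product is independent, this parallelization changes no results, only wall-clock time: rayon's work-stealing thread pool keeps all cores busy across the many matmuls performed per generated token.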