racingllama/benchmark

Racing Llama Benchmark (rlb)

Python · MIT

rlb

The Racing Llama Benchmark (rlb) is designed to provide consistent LLM benchmarking across Linux, macOS, and Windows.

NOTE: This project is in an alpha state and is currently being developed for macOS. Linux and Windows support will be added next.

TODO

Support running llama.cpp against multiple parameter sizes and quantization types
Graph output
Comparing thread counts
Improved benchmark prompts
Memory benchmarking
CPU benchmarking
Possible HPL-MxP support
Possible jitter measurements
JSON output
Versions in output, such as llama.cpp and llama-cpp-python
Linux support
Windows support