The Racing Llama Benchmark (rlb) is designed to provide consistent LLM benchmarking across Linux, macOS, and Windows.
NOTE: This project is in an alpha state and is currently being developed for macOS. Linux and Windows support will be added next.
- Support running llama.cpp against multiple parameter sizes and quantization types
- Graph output
- Comparing thread counts
- Improved benchmark prompts
- Memory benchmarking
- CPU benchmarking
- Possible HPL-MxP support
- Possible jitter measurements
- JSON output
- Tool versions in output (e.g. llama.cpp, llama-cpp-python)
- Linux support
- Windows support
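To illustrate the first item above, a run across multiple quantization types and thread counts could be driven by llama.cpp's `llama-bench` tool. The sketch below builds one invocation per (quantization, thread count) pair; the quantization names, thread counts, and model filename pattern are hypothetical placeholders, not part of rlb.

```python
from itertools import product

# Hypothetical sweep: quantization variants and thread counts to compare.
QUANTS = ["Q4_K_M", "Q8_0"]
THREADS = [4, 8]

def build_commands(model_stem: str) -> list[str]:
    """Build one llama-bench command per (quantization, thread count) pair.

    llama-bench is llama.cpp's built-in benchmarking tool; -o json asks it
    to emit machine-readable results. The <stem>.<quant>.gguf filename
    pattern here is an assumption for illustration.
    """
    cmds = []
    for quant, threads in product(QUANTS, THREADS):
        cmds.append(f"llama-bench -m {model_stem}.{quant}.gguf -t {threads} -o json")
    return cmds

for cmd in build_commands("llama-2-7b"):
    print(cmd)
```

A real runner would execute each command, parse the JSON results, and feed them into the graph and comparison features listed above.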