title | emoji | python_version | app_file | sdk | sdk_version | pinned | tags | colorFrom | colorTo
---|---|---|---|---|---|---|---|---|---
ML.ENERGY Leaderboard | ⚡ | 3.9 | app.py | gradio | 3.35.2 | true | | black | black
How much energy do LLMs consume?
This README focuses on explaining how to run the benchmark yourself. The actual leaderboard is here: https://ml.energy/leaderboard.
- For models that are directly accessible in Hugging Face Hub, you don't need to do anything.
- For other models, convert them to Hugging Face format and put them in /data/leaderboard/weights/lmsys/vicuna-13B, for example. The last two path components (e.g., lmsys/vicuna-13B) are taken as the name of the model.
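The naming rule above can be sketched in a few lines of Python. This is only an illustration of "last two path components", not the benchmark's actual code; the function name `model_name_from_path` is hypothetical.

```python
from pathlib import PurePosixPath


def model_name_from_path(model_path: str) -> str:
    """Return the last two path components joined by '/'.

    Hypothetical helper: it mirrors the rule stated in this README,
    not necessarily how scripts/benchmark.py derives the name.
    """
    # Drop the root "/" so that only real directory names are counted.
    parts = [p for p in PurePosixPath(model_path).parts if p != "/"]
    if len(parts) < 2:
        raise ValueError(f"need at least two path components: {model_path!r}")
    return "/".join(parts[-2:])


print(model_name_from_path("/data/leaderboard/weights/lmsys/vicuna-13B"))
```

A Hugging Face Hub ID such as databricks/dolly-v2-12b already has exactly two components, so the same rule leaves it unchanged.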
Our pre-built Docker image is published with the tag mlenergy/leaderboard:latest (Dockerfile).
$ docker run -it \
--name leaderboard0 \
--gpus '"device=0"' \
-v /path/to/your/data/dir:/data/leaderboard \
-v $(pwd):/workspace/leaderboard \
mlenergy/leaderboard:latest bash
The container internally expects weights to be inside /data/leaderboard/weights (e.g., /data/leaderboard/weights/lmsys/vicuna-7B), and sets the Hugging Face cache directory to /data/leaderboard/hfcache.
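Before starting a container, it can help to check that the host data directory follows the layout described above. The sketch below lists every org/model directory under a weights directory; the helper name `list_local_models` is hypothetical and the check is an assumption about what is useful, not part of the repository.

```python
from pathlib import Path


def list_local_models(weights_dir: str) -> list:
    """Return sorted 'org/model' names found two levels below weights_dir.

    Hypothetical helper: it only inspects directory names and does not
    verify that the weights are in Hugging Face format.
    """
    names = []
    for org in Path(weights_dir).iterdir():
        if not org.is_dir():
            continue
        for model in org.iterdir():
            if model.is_dir():
                names.append(f"{org.name}/{model.name}")
    return sorted(names)
```

On the host, the directory to pass would be the weights subdirectory of whatever is mounted at /data/leaderboard, for example `list_local_models("/path/to/your/data/dir/weights")`.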
If needed, mount the repository to /workspace/leaderboard to override the copy of the repository inside the container.
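When starting one container per GPU, the docker run invocation above can be generated rather than typed by hand. The following is a minimal sketch: the function `docker_run_args` is hypothetical, and naming containers `leaderboard<gpu>` is an assumption extrapolated from the `leaderboard0` / `device=0` example. Only flags that appear in the command above are used.

```python
import shlex


def docker_run_args(gpu, data_dir, repo_dir=None,
                    image="mlenergy/leaderboard:latest"):
    """Build the argument list for the docker run command shown above."""
    args = [
        "docker", "run", "-it",
        "--name", f"leaderboard{gpu}",    # assumed naming convention
        # The double quotes are part of the value, as in the shell
        # form --gpus '"device=0"'.
        "--gpus", f'"device={gpu}"',
        "-v", f"{data_dir}:/data/leaderboard",
    ]
    if repo_dir is not None:
        # Optional mount that overrides the repository copy in the image.
        args += ["-v", f"{repo_dir}:/workspace/leaderboard"]
    args += [image, "bash"]
    return args


# Print a shell-ready command line for GPU 0 without running it.
print(shlex.join(docker_run_args(0, "/path/to/your/data/dir")))
```

Returning a list rather than a string keeps the result usable with `subprocess.run` directly, while `shlex.join` gives a copy-pastable form for inspection.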
We run benchmarks across multiple nodes and GPUs with Pegasus. Take a look at pegasus/ for details.
You can still run benchmarks without Pegasus like this:
$ docker exec leaderboard0 python scripts/benchmark.py --model-path /data/leaderboard/weights/lmsys/vicuna-13B --input-file sharegpt/sg_90k_part1_html_cleaned_lang_first_sampled.json
$ docker exec leaderboard0 python scripts/benchmark.py --model-path databricks/dolly-v2-12b --input-file sharegpt/sg_90k_part1_html_cleaned_lang_first_sampled.json
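To benchmark several models in sequence without Pegasus, the two commands above can be generated in a loop. This is a rough sketch, not the project's tooling: `benchmark_command` and `run_all` are hypothetical helpers, and only the `--model-path` and `--input-file` flags shown above are used.

```python
import subprocess

# Input file taken from the example commands above.
DEFAULT_INPUT = "sharegpt/sg_90k_part1_html_cleaned_lang_first_sampled.json"


def benchmark_command(container, model_path, input_file=DEFAULT_INPUT):
    """Build the docker exec argument list for one benchmark run."""
    return [
        "docker", "exec", container,
        "python", "scripts/benchmark.py",
        "--model-path", model_path,
        "--input-file", input_file,
    ]


def run_all(container, model_paths, dry_run=True):
    """Run the benchmark for each model, one after another.

    With dry_run=True the commands are only printed, which is a safe
    way to inspect them before committing GPU time.
    """
    for model_path in model_paths:
        cmd = benchmark_command(container, model_path)
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the loop at the first failing benchmark.
            subprocess.run(cmd, check=True)
```

For example, `run_all("leaderboard0", ["/data/leaderboard/weights/lmsys/vicuna-13B", "databricks/dolly-v2-12b"])` prints the two commands shown above, one with a local weights path and one with a Hugging Face Hub ID.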