nuprl/MultiPL-E

How do I run a model across multiple GPUs for inference testing?

Closed this issue · 2 comments

Llama 3 70B

python3 automodel.py --name /home/models/Meta-Llama-3-70B/ --root-dataset humaneval --lang $lang --temperature 0.2 --batch-size 40 --completion-limit 1 --output-dir-prefix $output

thanks

The following command line solved the running problem for me.

Is there any difference in results between automodel.py and automodel_vllm.py?

python3 automodel_vllm.py --name /home/models/Meta-Llama-3-70B/ --revision main --num-gpus 2 --root-dataset humaneval --lang $lang --temperature 0.2 --batch-size 40 --completion-limit 1 --output-dir-prefix $output

automodel_vllm.py uses vLLM, whereas automodel.py uses transformers. I suggest using the vLLM script if it works for you.
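For completeness, here is a sketch of how the vLLM command above could be looped over several target languages. The language list (py, js, java) and the ./results output prefix are illustrative placeholders, not values from this thread; the flags themselves are taken from the command above. The `run` helper echoes each command instead of executing it, since the model path is specific to the original poster's machine:

```shell
#!/bin/sh
# Dry-run helper: print the command instead of executing it,
# because /home/models/Meta-Llama-3-70B/ only exists on the
# original poster's machine.
run() { echo "$@"; }

# Placeholder output prefix (assumption, not from the thread).
output=./results

# Illustrative language list; MultiPL-E supports many more.
for lang in py js java; do
  run python3 automodel_vllm.py \
    --name /home/models/Meta-Llama-3-70B/ \
    --revision main \
    --num-gpus 2 \
    --root-dataset humaneval \
    --lang "$lang" \
    --temperature 0.2 \
    --batch-size 40 \
    --completion-limit 1 \
    --output-dir-prefix "$output"
done
```

Dropping the `run` wrapper executes the commands for real, one language at a time.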

Alternatively, use MultiPL-E from here: https://github.com/bigcode-project/bigcode-evaluation-harness