TristanBilot/mlx-benchmark

Revised run command for CUDA testing


Under the "Run on other devices" section, you have the command as:

python run_benchmark.py --include_mps=False --include_mlx_gpu=False --include_mlx_cpu=False --include_cuda=True --include_cpu=True

but it also needs to disable the --include_mlx_gpu_compile option, so the correct command would be:

python run_benchmark.py --include_mps=False --include_mlx_gpu=False --include_mlx_gpu_compile=False --include_mlx_cpu=False --include_cuda=True --include_cpu=True

Also, in run_benchmark.py around line 126, you need to replace:

if backend in ["mps", "cuda"]:
    torch.mps.empty_cache()
    torch.cuda.empty_cache()

with

if backend == "cuda":
    torch.cuda.empty_cache()

if backend == "mps":
    torch.mps.empty_cache()

On a CUDA machine, the current cache-clearing code fails, since torch.mps.empty_cache() isn't available in the CUDA build of PyTorch.
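For reference, a slightly more defensive variant could guard the MPS call behind a hasattr check, since depending on the PyTorch version the torch.mps module may not exist at all on non-macOS builds. This is just a sketch, and the empty_backend_cache helper name is mine, not something from the repo:

import torch

def empty_backend_cache(backend: str) -> None:
    # Clear the allocator cache only for the backend actually in use.
    if backend == "cuda":
        torch.cuda.empty_cache()
    elif backend == "mps" and hasattr(torch, "mps"):
        # torch.mps is only present on builds with MPS support,
        # so the hasattr guard keeps this safe on CUDA-only builds.
        torch.mps.empty_cache()

Keeping the backends in separate branches also means supporting another device later only requires adding a new branch, rather than untangling a combined condition like the original in ["mps", "cuda"] check.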

Thanks @DaveSprague for reporting this issue; it was indeed an important error!