pika-org/pika

Out-of-bounds access on CPU mask on Grace CPU with hwloc 2.11

msimberg opened this issue · 1 comment

On Grace CPUs with hwloc 2.11, pika appears to detect 8 NUMA nodes, each with 72 cores, which is wrong (there are only 4). This may be related to the following entry in the hwloc release notes (https://raw.githubusercontent.com/open-mpi/hwloc/v2.11/NEWS):

Don't hide the GPU NUMA node on NVIDIA Grace Hopper.

though I'm not 100% sure yet.

A workaround for anyone hitting this issue and not interested in upgrading to pika 0.28.0 is to export HWLOC_KEEP_NVIDIA_GPU_NUMA_NODES=0. This will tell hwloc to ignore the GPU NUMA nodes that it is able to detect from 2.11 onwards.
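
As a rough sketch, the workaround could be applied like this in the shell or job script that launches the application. The `lstopo-no-graphics` check is optional and assumes the hwloc command-line tools are installed; `./my_pika_app` is a hypothetical binary name, not something from this issue.

```shell
# Tell hwloc (2.11 and later) not to expose the GPU NUMA nodes.
export HWLOC_KEEP_NVIDIA_GPU_NUMA_NODES=0

# Optional sanity check, only if hwloc's command-line tools are available:
# print just the NUMA nodes that hwloc reports with the variable set.
if command -v lstopo-no-graphics >/dev/null 2>&1; then
    lstopo-no-graphics --only numanode || true
fi

# Launch the application from the same shell so it inherits the variable.
# ./my_pika_app   # hypothetical binary name; replace with your own executable
```

The variable has to be set in the environment of the process that loads hwloc, so in a batch job it belongs in the job script before the launch line.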