Out-of-bounds access on CPU mask on Grace CPU with hwloc 2.11
msimberg opened this issue · 1 comment
msimberg commented
On Grace CPUs with hwloc 2.11, pika seems to think there are 8 NUMA nodes with 72 cores each, which is wrong (there are only 4). This may be related to the following entry in the hwloc release notes (https://raw.githubusercontent.com/open-mpi/hwloc/v2.11/NEWS):
Don't hide the GPU NUMA node on NVIDIA Grace Hopper.
though I'm not 100% sure yet.
msimberg commented
A workaround for anyone hitting this issue who does not want to upgrade to pika 0.28.0 is to export HWLOC_KEEP_NVIDIA_GPU_NUMA_NODES=0. This tells hwloc to ignore the GPU NUMA nodes that it is able to detect from 2.11 onwards.
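
For reference, a minimal sketch of applying the workaround in a shell session or job script. The variable name is the one given above; the sanity check with hwloc-calc is an assumption on my part (it only runs if the hwloc command-line tools are installed, and I have not verified its output on Grace), and ./my_pika_app is a hypothetical placeholder for your own binary.

```shell
# Ask hwloc (2.11 and later) to ignore the GPU NUMA nodes.
# This must be set in the environment of the process that loads hwloc.
export HWLOC_KEEP_NVIDIA_GPU_NUMA_NODES=0

# Optional sanity check (assumption: hwloc CLI tools are on PATH).
# Prints the number of NUMA nodes hwloc reports with the variable set.
if command -v hwloc-calc >/dev/null 2>&1; then
    hwloc-calc --number-of numanode all
fi

# Then launch the application from the same environment, for example:
# ./my_pika_app    # hypothetical binary name
```

If the application is launched through a job scheduler, the variable has to reach the environment of the launched processes, not just the submitting shell.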