KarypisLab/ParMETIS

MPI fails when graph size is larger than INT_MAX

JingchengYu94 opened this issue · 1 comments

We are using DGL + ParMETIS to run experiments on very large datasets, with 400 million nodes and 3.7 billion edges. However, MPI communication fails on this line

The reason is that the global graph size is larger than INT_MAX, so rdispls contains elements that are larger than INT_MAX. There is a direct cast here from rdispls, which holds 64-bit idx_t values, to lrdispls, which holds 32-bit int values, and this makes lrdispls overflow.
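
To make the narrowing problem concrete, here is a minimal C sketch. It assumes idx_t is a 64-bit signed integer and int is 32 bits wide. The helper name narrow_displs is hypothetical and is not part of ParMETIS or MPI. It only shows how a checked conversion could detect the overflow rather than silently truncating.

```c
#include <limits.h>
#include <stdint.h>

/* Assumption: a build where idx_t is 64 bits wide. */
typedef int64_t idx_t;

/*
 * Hypothetical helper (not a ParMETIS or MPI function).
 * Copies n 64-bit displacements from src into the int array dst.
 * Returns 0 if every value fits in an int, or -1 as soon as one
 * value is out of range; in that case dst is only partly filled.
 */
static int narrow_displs(const idx_t *src, int *dst, int n)
{
    for (int i = 0; i < n; i++) {
        if (src[i] < INT_MIN || src[i] > INT_MAX) {
            return -1;  /* an unchecked (int) cast would not preserve this value */
        }
        dst[i] = (int)src[i];
    }
    return 0;
}
```

With 3.7 billion edges, a displacement such as 3700000000 is larger than a 32-bit INT_MAX (2147483647), so a check like this would return -1 where an unchecked cast would produce a wrong displacement.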

Is there any easy solution to handle this?

This is an issue with MPI prior to 4.0. ParMetis has been updated to use MPI 4.0's APIs that resolve this problem.