MPI-PR fails without good error messages when /dev/shm is full
Closed · 1 comment
jeffhammond commented
NWChem crashed a few times and ended up filling /dev/shm. That's not ideal, but I'm less worried about that. What made this hard to debug is that the error message is useless:
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 0 PID 28219 RUNNING AT klondike
= KILLED BY SIGNAL: 7 (Bus error)
===================================================================================
I don't know where ComEx MPI-PR allocates these files, but there needs to be error checking there.
All the files in /dev/shm had cmx in the name, which I assume will lead me to the code where they are allocated:
cmx00000010000000025680000001
...
sem.cmx00000010000000025680000000
...
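For illustration, the kind of error checking asked for above could look roughly like the sketch below. It is not ComEx code and not necessarily what the eventual fix does; the helper name checked_shm_alloc is hypothetical, and the sketch assumes the segments are POSIX shared-memory objects created with shm_open and mapped with mmap. On Linux, sizing such an object with ftruncate alone can succeed even when /dev/shm is full, and the failure only surfaces as a bus error on the first write to the mapping. Reserving the pages with posix_fallocate and checking its return value is one way to turn that into a readable message.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of ComEx): create a POSIX shared-memory
 * segment, reserve its backing pages up front, and map it.
 * Returns the mapped address, or NULL after printing a diagnostic. */
void *checked_shm_alloc(const char *name, size_t size)
{
    int fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd < 0) {
        fprintf(stderr, "shm_open(%s) failed: %s\n", name, strerror(errno));
        return NULL;
    }

    /* posix_fallocate returns the error number directly instead of
     * setting errno. On a full /dev/shm this is where ENOSPC would be
     * reported, rather than a later SIGBUS on first touch. */
    int rc = posix_fallocate(fd, 0, (off_t)size);
    if (rc != 0) {
        fprintf(stderr,
                "cannot reserve %zu bytes for %s: %s (is /dev/shm full?)\n",
                size, name, strerror(rc));
        close(fd);
        shm_unlink(name);   /* do not leave a stale file behind */
        return NULL;
    }

    void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    int saved_errno = errno; /* keep mmap's errno across close() */
    close(fd);               /* the mapping stays valid after close */
    if (ptr == MAP_FAILED) {
        fprintf(stderr, "mmap(%s) failed: %s\n", name, strerror(saved_errno));
        shm_unlink(name);
        return NULL;
    }
    return ptr;
}
```

A caller would check for NULL and abort with its own context (rank, requested size), then munmap and shm_unlink the segment during cleanup. On older glibc versions, linking may need -lrt for shm_open.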
ajaypanyala commented
fixed via #254