NOAA-GFDL/FMS

test_data_override sporadic failures in CI testing


Describe the bug
Occasionally the data_override monotonically increasing/decreasing tests fail in CI but pass on subsequent runs:

expecting success of test_data_override2_mono.1 'test_data_override with monotonically increasing and decreasing data sets (r4)': 
    mpirun -n 6 ../test_data_override_ongrid_${KIND}
    
NOTE from PE     0: MPP_DOMAINS_SET_STACK_SIZE: stack size set to    32768.
NOTE from PE     0: MPP_DOMAINS_SET_STACK_SIZE: stack size set to 17280000.
 test_data_override_emc domain decomposition
whalo =    2, ehalo =    2, shalo =    2, nhalo =    2
  X-AXIS =  180 180

FATAL from PE     4: NetCDF: Unknown file format: netcdf_file_open:INPUT/grid_spec.nc

#0  0x7795ff7006dd in ???
#1  0x7795ffb4d34f in ???
#2  0x7795ffbace4f in ???
#3  0x7795ffbab916 in ???
#4  0x7795ffb5197f in ???
#5  0x7795ffc4cec9 in ???
#6  0x7795ffc4d212 in ???
#7  0x7795ffb35c24 in ???
#8  0x7795ffb33713 in ???
#9  0x7795ffdbd9c6 in ???
#10  0x7795ffdc33ad in ???
#11  0x4034be in ???
#12  0x4094fa in ???
#13  0x7795fdfd5eaf in ???
#14  0x7795fdfd5f5f in ???
#15  0x402324 in ???
#16  0xffffffffffffffff in ???
Abort(1) on node 4 (rank 4 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 1) - process 4
error: last command exited with $?=1
  Y-AXIS =   60  60  60
not ok 1 - test_data_override with monotonically increasing and decreasing data sets (r4)
FAIL: test_data_override2_mono.sh 1 - test_data_override with monotonically increasing and decreasing data sets (r4)
#	
#	    mpirun -n 6 ../test_data_override_ongrid_${KIND}
#	    
FAIL: test_data_override2_mono.sh 1 - test_data_override with monotonically increasing and decreasing data sets (r4)

To Reproduce
TBD. I have only seen this show up intermittently in CI testing, and it always disappears on a rerun. It is most likely caused by some kind of file system slowness on the GitHub-hosted runner; I've seen similar errors while running on the cloud.
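
If the file system hypothesis holds, one possible cause of `NetCDF: Unknown file format` is that a rank opens `INPUT/grid_spec.nc` while the file is still empty or only partly written. The sketch below is not part of FMS or its test suite. It is a hypothetical diagnostic (the helper names `looks_like_netcdf` and `wait_for_netcdf` are made up for illustration) that checks whether a file starts with one of the standard netCDF signatures and polls for a short time until it does. It only inspects the first bytes at offset 0, so it can catch an empty or truncated header but cannot prove the file is complete.

```python
import time

# Leading bytes of the on-disk formats the netCDF library recognizes:
# classic (CDF-1), 64-bit offset (CDF-2), CDF-5, and HDF5-based netCDF-4.
NETCDF_MAGICS = (b"CDF\x01", b"CDF\x02", b"CDF\x05", b"\x89HDF\r\n\x1a\n")


def looks_like_netcdf(path):
    """Return True if the file starts with a known netCDF/HDF5 signature."""
    try:
        with open(path, "rb") as f:
            header = f.read(8)
    except OSError:
        # Missing or unreadable file: treat as "not ready yet".
        return False
    return header.startswith(NETCDF_MAGICS)


def wait_for_netcdf(path, timeout=5.0, interval=0.1):
    """Poll until the file has a valid signature or the timeout expires.

    The timeout and interval defaults are arbitrary illustrative values.
    """
    deadline = time.monotonic() + timeout
    while True:
        if looks_like_netcdf(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Running a check like `wait_for_netcdf("INPUT/grid_spec.nc")` just before the failing `mpirun` step in a debugging branch could help tell a slow or incomplete write apart from a genuinely corrupt file.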

Expected behavior
The test should pass consistently on every CI run.

System Environment
CI image (gcc 12+ mpich)