NOAA-GFDL/GFDL_atmos_cubed_sphere

uninitialized or incorrectly initialized memory in fv_regional_bc.F90

Describe the bug

The ps_reg variable in fv_regional_bc.F90 is initialized to -9999999 Pa, and that value propagates throughout the model. All other boundary arrays are initialized to signalling NaN. Because ps_reg was not, many other bugs were masked.

These are some of the bugs it masked inside the GFDL_atmos_cubed_sphere repository; there are others outside it:

  1. Remapping winds takes a two-point average of the current point and -9999999.
  2. Remapping winds uses -9999999 as pressure on velocity boundary points beyond the scalar boundary points.
  3. The -9999999 is copied to Atm%ps throughout the domain, beyond the boundary, on all processors that touch the domain boundary.
  4. Due to item 3, the RRFS hasn't been able to run in DEBUG=ON mode since this bug was introduced.
  5. The pressure data is not read in on velocity points outside the scalar boundary. (It can't be, since the boundary condition files lack that data entirely.)
  6. The blending code blends data beyond boundary regions.
  7. The code accesses i-1 and j-1 indices, but processes the south boundary before the east and west boundaries. As a result, two south boundary processes read uninitialized regions instead of east and west boundary data.
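
To illustrate item 1, here is a minimal Python sketch (not the Fortran in fv_regional_bc.F90) of why a finite fill value hides a bad read while a NaN exposes it. The function name two_point_average and the 100000 Pa value are made up for illustration. Python's NaN is quiet and does not trap, so it only approximates real_snan.

```python
import math

SENTINEL_PA = -9999999.0  # the fill value named in this issue
NAN_FILL = float("nan")   # stand-in for real_snan; a quiet NaN, it never traps


def two_point_average(p_here, p_neighbor):
    """Average the pressure at a point with its neighbor (hypothetical helper)."""
    return 0.5 * (p_here + p_neighbor)


valid_pa = 100000.0  # illustrative surface pressure in Pa, not model data

# The neighbor was never filled from the boundary files in either case.
masked = two_point_average(valid_pa, SENTINEL_PA)
exposed = two_point_average(valid_pa, NAN_FILL)

print(masked)                 # -4949999.5: finite, so no check flags it
print(math.isfinite(masked))  # True
print(math.isnan(exposed))    # True: the bad read is visible downstream
```

With the finite sentinel, the result is a wrong but ordinary-looking number. With a NaN fill, every value derived from the unset point is marked.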

There are other issues, outside the dynamical core, which were masked by this bug. I'll link to those issues once I create or find them.

Many users have also found dubious values in what the model receives from the boundary conditions. These bugs may be the cause, but we'll have to run parallels of multiple configurations to confirm that.

To Reproduce

  1. Find the ps_reg=-9999999 initialization in fv_regional_bc.F90.
  2. Replace that with ps_reg=real_snan
  3. Compile in DEBUG=ON mode.
  4. Run it and see the bugs.
  5. Fix the bugs one by one and see more appear.

Expected behavior

  1. Boundary data should be correct.
  2. The boundary code should not write to non-boundary regions within the domain.
  3. Blending code should not blend data beyond the domain boundary.
  4. The best practice of initializing to real_snan should be followed.
  5. The code should pass in DEBUG=ON mode.
  6. Changing the threads, layout, or task count should not change the results.
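
Items 2 and 6 can be expressed as checks. The following is a toy 1-D Python sketch, not the model's blending code: blend_field, the linear weights, the chunking that stands in for tasks, and all values are assumptions made for illustration. It blends boundary data into the first NROWS_BLEND points only, then confirms that the interior is untouched and that the result does not depend on how the points are split up.

```python
NROWS_BLEND = 3  # hypothetical width of the blending zone, in points


def blend_field(field, bc, nrows_blend, chunk):
    """Blend bc into the first nrows_blend points, working chunk by chunk.

    Each chunk stands in for one task. Indexing is global, so the
    decomposition cannot change which points are blended.
    """
    out = []
    for start in range(0, len(field), chunk):
        stop = min(start + chunk, len(field))
        for i in range(start, stop):
            if i < nrows_blend:
                w = 1.0 - i / nrows_blend  # 1 at the edge, decaying inward
                out.append(w * bc[i] + (1.0 - w) * field[i])
            else:
                out.append(field[i])  # interior: never written
    return out


field = [100000.0 + 10.0 * i for i in range(12)]  # illustrative pressures, Pa
bc = [99000.0, 99100.0, 99200.0]                  # boundary rows only

one_task = blend_field(field, bc, NROWS_BLEND, chunk=12)
four_tasks = blend_field(field, bc, NROWS_BLEND, chunk=3)
uneven_tasks = blend_field(field, bc, NROWS_BLEND, chunk=5)

# Item 6: the layout must not change the answer.
print(one_task == four_tasks == uneven_tasks)
# Item 2: nothing outside the boundary zone is modified.
print(one_task[NROWS_BLEND:] == field[NROWS_BLEND:])
```

In the real model the equivalent check would compare output files from runs with different layouts and thread counts.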

System Environment
The ufs-weather-model build system targets hera.intel and hera.gnu, which use:

  • OS: CentOS 7.9.2009
  • Compiler(s): Intel 2022.1.2 and GCC 9.2.0
  • MPI type, and version: Intel MPI using the SLURM srun launcher
  • netCDF Version: 4.7.4
  • Configure options: cmake option "-DDEBUG=ON"

Additional context
None.

Closing this as PR #219 addressed this issue.