NOAA-GFDL/GFDL_atmos_cubed_sphere

Building 202204, gfortran-10 compiler error

Closed this issue · 3 comments

Running in an Ubuntu arm64/M1 Docker container, I am building the most recent releases with gfortran-10 (the build has generally failed for me with gcc-9, but gcc-10 has worked so far):

ARG SHiELD_VERSION="FV3-202204-public"
ARG FV3_VERSION="FV3-202204-public"

I updated the mkmf setup from:

#RUN list_paths -o ${BUILD_ROOT}/pathnames_gfs \
#    /atmos_cubed_sphere/driver/SHiELD/gfdl_cloud_microphys.F90 \
#    /atmos_cubed_sphere/driver/SHiELD/cloud_diagnosis.F90 \
#    /fv3_gfsphysics/gsmphys/  \
#    /fv3_gfsphysics/GFS_layer/ \
#    /fv3_gfsphysics/IPD_layer

to:

# Create the file lists for the build
RUN list_paths -o ${BUILD_ROOT}/pathnames_gfs \
    /atmos_cubed_sphere/model/cld_eff_rad.F90 \
    /atmos_cubed_sphere/model/gfdl_cld_mp.F90 \
    /fv3_gfsphysics/gsmphys/  \
    /fv3_gfsphysics/GFS_layer/ \
    /fv3_gfsphysics/IPD_layer

I build the SHiELD physics first, and it seems to compile with no problems:

# mk_make
RUN cd /exec  \
    && make -j ${MAKEJOBS} OPENMP=Y AVX=${avx} -f Makefile_gfs \
    && mkdir -p /opt/gfs && mkdir -p ${GFS_LIB} && mv libgfs.a ${GFS_LIB}

The error comes when I build the SHiELD executable:

RUN cd /exec  \
    && make -j ${MAKEJOBS} OPENMP=Y AVX=${avx} NETCDF=3 32BIT=Y NCEP_LIBS="${FMS_LIB}/libFMS.a ${GFS_LIB}/libgfs.a ${NCEP_LIBS}" -f Makefile_fv3 \
    && mv /exec/test.x ${SHiELD_BIN}/SHiELD_${type}.${comp}.${bit}.x

However, I get an error that may not have been caught by your test compilers. It appears to originate in GFDL_atmos_cubed_sphere:

mpif90 -Duse_libMPI -Duse_netCDF -DHAVE_SCHED_GETAFFINITY -DSPMD -DUSE_LOG_DIAG_FIELD_INFO -Duse_LARGEFILE -DUSE_GFSL63 -DGFS_PHYS -DNO_CMIP_DIAG -DINTERNAL_FILE_NML -DOVERLOAD_R8 -DMOIST_CAPPA -DUSE_COND -Duse_netCDF -DHAVE_SCHED_GETAFFINITY -Duse_LARGEFILE  -I/opt/netcdf/include -I/opt/netcdf/include -fallow-argument-mismatch -fcray-pointer -fdefault-double-8 -fdefault-real-8 -Waliasing -ffree-line-length-none -fno-range-check -O3 -fopenmp -I/opt/fms/include -L/opt/fms/lib -L/opt/gfs/lib  -c	/atmos_cubed_sphere/tools/fv_io.F90
#45 19.32 /atmos_cubed_sphere/tools/fv_diagnostics.h:101:0:
#45 19.32 
#45 19.32   101 | #endif _FV_DIAG__
#45 19.32       | 
#45 19.32 Warning: extra tokens at end of #endif directive
#45 19.48 /atmos_cubed_sphere/tools/fv_diagnostics.F90:1355:126:
#45 19.48 
#45 19.48  1355 |     call fv_diag_column_init(Atm(n), yr_init, mo_init, dy_init, hr_init, do_diag_debug, do_diag_sonde, sound_freq, m_calendar)
#45 19.48       |                                                                                                                              1
#45 19.48 Error: More actual than formal arguments in procedure call at (1)
#45 19.48 make: *** [Makefile_fv3:51: fv_diagnostics.o] Error 1
#45 19.48 make: *** Waiting for unfinished jobs....
------
executor failed running [/bin/sh -c cd /exec      && make -j ${MAKEJOBS} OPENMP=Y AVX=${avx} NETCDF=3 32BIT=Y NCEP_LIBS="${FMS_LIB}/libFMS.a ${GFS_LIB}/libgfs.a ${NCEP_LIBS}" -f Makefile_fv3     && mv /exec/test.x ${SHiELD_BIN}/SHiELD_${type}.${comp}.${bit}.x]: exit code: 2

I looked at the two files:
https://github.com/NOAA-GFDL/GFDL_atmos_cubed_sphere/blob/main/tools/fv_diag_column.F90
https://github.com/NOAA-GFDL/GFDL_atmos_cubed_sphere/blob/main/tools/fv_diagnostics.F90

But I do not see where these conflict.
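The two links above point at the main branch, while the build uses the FV3-202204-public tag, so it may be worth comparing the call against the definition in the tree that is actually compiled. The following is a minimal sketch of that check, not a confirmed diagnosis. The helper names find_subroutine and count_args are hypothetical, and the argument count is naive (single-line statements only, and commas inside nested parentheses would be over-counted).

```shell
#!/bin/sh
# Hypothetical helper: list every definition of a subroutine under a source
# tree so its dummy-argument list can be compared with the failing call.
find_subroutine() {
    src_root="$1"   # tree being compiled, e.g. /atmos_cubed_sphere
    name="$2"       # subroutine name, e.g. fv_diag_column_init
    # /dev/null forces grep to print file names even for a single match file.
    find "$src_root" -type f \( -name '*.F90' -o -name '*.f90' \) \
        -exec grep -niE "^[[:space:]]*subroutine[[:space:]]+${name}[[:space:]]*\(" /dev/null {} +
}

# Hypothetical helper: count comma-separated arguments in a one-line call or
# definition. Naive: does not follow '&' continuations, and commas inside
# nested parentheses (e.g. a(1,2)) are counted as separators.
count_args() {
    printf '%s\n' "$1" \
        | sed -e 's/^[^(]*(//' -e 's/)[^)]*$//' \
        | awk -F',' '{ print NF }'
}
```

For example, `find_subroutine /atmos_cubed_sphere fv_diag_column_init` run inside the container prints the file and line of each definition. Passing that definition line and the call from line 1355 of fv_diagnostics.F90 to `count_args` gives two numbers to compare. If the definition reports fewer arguments than the call, the two files likely come from different revisions.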

I tried adding the "-std=legacy" option to the mkmf file creation as FFLAGS:

ARG FV3_FCFLAGS='-fallow-argument-mismatch -std=legacy'

but this didn't seem to have an impact.
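The compile line in the log above lists -fallow-argument-mismatch but not -std=legacy, although that log may predate the change. A quick way to confirm whether a flag reached the compiler is to search a captured build log (or the generated makefile and any template it includes) for it. This is a small sketch; has_flag is a hypothetical helper, and the file it inspects is whatever the build output was saved to.

```shell
#!/bin/sh
# Hypothetical helper: report whether a compiler flag appears anywhere in a
# file, such as a saved build log or a generated makefile.
has_flag() {
    file="$1"   # path to the log or makefile to inspect
    flag="$2"   # flag to look for, e.g. -std=legacy
    # -F matches the flag literally; -e keeps a leading '-' from being
    # parsed as a grep option.
    if grep -qF -e "$flag" "$file"; then
        echo "present: $flag"
    else
        echo "missing: $flag"
    fi
}
```

For instance, after saving the output of the make step to a file, `has_flag build.log -std=legacy` shows whether the option was picked up (build.log is an assumed file name).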

I found some discussion of this error here. It points to potential issues with the types of the arguments, but I'm not sure it applies in this case because, as far as I can tell, the types look OK:
https://community.intel.com/t5/Intel-Fortran-Compiler/FFTW3-real-to-complex-and-inverse-transforms/m-p/1156491/highlight/true?profile.language=ja

@StevePny - Along with the dycore and SHiELD_physics release, there was a corresponding SHiELD_build release, which was used for downloading the appropriate versions and building executables. The build system and RTS/CI solo model test scripts were tested with various versions of Intel as well as gcc 9.2 (SLES 15) and gcc 10.2 (CentOS 7). Please try that system and modify the site/environment.gnu.sh as needed.

Hi @bensonr, I've now set up my Dockerfile with the SHiELD_build release. However, I ended up defining the environment variables 'manually' in the Dockerfile, outside of site/environment.gnu.sh. I opened some issues there regarding portability. It would be helpful if there were a generic template that could take these, perhaps as input arguments, so I can keep the same structure as your build steps without too many sed-based modifications.
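To illustrate the kind of template meant here, the sketch below applies defaults only when a variable is not already set, so a Dockerfile (or any caller) can override values before sourcing it instead of patching the file with sed. The function name load_site_env and the variable names are assumptions for illustration and are not taken from SHiELD_build. The default paths mirror the ones in the compile line above.

```shell
#!/bin/sh
# Hypothetical site environment template. Every value can be overridden from
# the caller's environment (for example a Dockerfile ENV) before sourcing.
# Variable names and defaults are illustrative, not taken from SHiELD_build.
load_site_env() {
    # ':=' assigns the default only if the variable is unset or empty.
    : "${FC:=mpif90}"
    : "${NETCDF_ROOT:=/opt/netcdf}"
    : "${FMS_ROOT:=/opt/fms}"
    export FC NETCDF_ROOT FMS_ROOT
}

load_site_env
```

With this pattern, setting `ENV NETCDF_ROOT=/usr/local` in the Dockerfile before the build step would take precedence over the default, and unset variables fall back to the values in the template.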

@StevePny - Thanks for alerting me to the issues in SHiELD_build. It was a new repo and I hadn't yet modified my options to watch it. If it's okay, I'll close out this issue and take up the conversation over there.