ufs-community/ccpp-physics

Physics cap compile error with empty suite group (duplicate 'cdata')

Opened this issue · 14 comments

Description

An empty suite group results in cdata being passed and declared twice in the init_cap function.

Steps to Reproduce

I am updating a capability to run an idealized periodic case. In order to turn off radiation, I removed the scheme calls from the radiation group (suite_FV3_mp_nssl_ideal.xml, attached as a text file):

  <group name="radiation">
    <subcycle loop="1">
    </subcycle>
  </group>

That used to work fine, but now cmake/prebuild is adding an extra 'cdata' in the call arguments [and defining it twice, the second time as intent(in)], which results in a compile error (ccpp_FV3_mp_nssl_ideal_cap.F90):

   function FV3_mp_nssl_ideal_init_cap(cdata,one,GFS_Control,levh2o,h2o_coeff,con_t0c,cdata,con_g,con_rd,con_cp, &
                  con_rv,con_cliq,con_csol,con_eps,GFS_Data) result(ierr)

      use ccpp_types, only: ccpp_t
      use GFS_typedefs, only: GFS_control_type
      use machine, only: kind_phys
      use GFS_typedefs, only: GFS_data_type

      implicit none

      integer :: ierr
      type(ccpp_t), intent(inout) :: cdata
      integer, intent(in) :: one
      type(gfs_control_type), intent(in) :: GFS_Control
      integer, intent(in) :: levh2o
      integer, intent(in) :: h2o_coeff
      real(kind_phys), intent(in) :: con_t0c
      type(ccpp_t), intent(in) :: cdata

It will compile OK if I comment the group out completely, but then I seem to get a runtime error:

An error occurred in ccpp_physics_run for group radiation, block    1 and thread    1 (ntX=   1):
Group radiation not found

Adding code to GFS_typedefs.F90 to set nsswr=-1 and nslwr=-1 works to turn off radiation as a work-around for the time being.

It also compiles and runs if I manually fix the extra cdata in the function and the call. I didn't have any problems with ccpp-framework as of commit 1b6352fb24f05 (Jan 2023), if that is any help.

Additional Context

  • Machine: Mac (intel CPU)
  • Compiler: GNU 11.3
  • Suite Definition File or Scheme: Custom

Output

Please include any relevant log files, screenshots or other output here.

suite_FV3_mp_nssl_ideal.xml.txt

@grantfirl Any thoughts on this? The prebuild problem forces the suite to have a valid radiation scheme call, so I made a work-around by adding a flag 'do_radiation' to the fv_core_nml namelist that can set the radiation interval such that radiation is never called. That works but deviates from the spirit of a suite, i.e., that a suite should contain exactly what will be used.

I think that this is probably my fault from NCAR/ccpp-framework#503. I'll see if I can reproduce and fix it.

Although, if the group is empty, you should just be able to delete the entire group, and I think it should be OK. This could be a temp fix for you. Have you tried that?

Yes, I tried deleting the group, and that compiles, but then there is a runtime error, "Group radiation not found". Thus the need for the workaround to prevent radiation from being called.

Yes, I tried deleting the group, and that compiles, but then there is a runtime error, "Group radiation not found". Thus the need for the workaround to prevent radiation from being called.

Ah yes, I didn't read the description carefully enough, sorry.

OK, the runtime error shows up because in the UFS, the calls for each CCPP group happen individually as opposed to the entire suite at once, as is the case with the SCM. That is, even though you remove the group from the SDF, the host model (atmos_model.F90) is still trying to call the radiation group, hence the runtime error message. You can get around this error by either editing FV3/ccpp/driver/CCPP_driver.F90 to just return when step = "radiation" or to comment out where the CCPP_step() subroutine is called with step="radiation" in FV3/atmos_model.F90.

This should solve your immediate problem, but, regardless, if there is an empty group in the SDF, the CCPP framework should be able to handle this by creating empty caps. That is perhaps an easier way for users to handle this situation than going into the host code and modifying calls.

Maybe another tact is to just ignore the missing group error (or not generate it?). I'll look at that. It would be more satisfying and intuitive to have no radiation group at all than to have an empty one.

Maybe a do_radiation flag is the way to go, after all. There are flags if FV3/atmos_model.F90 for calling the stochastic physics, so that the stochastics group can be absent from a suite, so why not radiation, as well? And mkstatic.py can be updated to set a specific error code (86 -- IYKYK) if a group is not found. And then I can use that error code to set the do_radiation flag if the user didn't set it in the namelist. And that way no fix is needed for the prebuild. (It could also only not be in the namelist and only set to false if that error code comes up?)

I think that there is a simple bugfix for making sure that cdata does not appear more than once in that argument list: NCAR/ccpp-framework#517

This should get us back to the way it worked before (still needing an empty radiation group in the SDF, but compiling and running fine). Whether or not we also need to modify the host CCPP interface to not force us to have specific groups is another question. I can bring it up with others and see what they think about your suggestions to provide more flexibility with SDFs.

Nice if it's a simple fix! Since we already have other changes to fv3 for our update to support an idealized domain, it's easy enough to include the code that detects if there is no radiation suite. (I guess I have been assuming that the SCM won't care -- correct?) That way it can work with either an empty group or no group at all.

Nice if it's a simple fix! Since we already have other changes to fv3 for our update to support an idealized domain, it's easy enough to include the code that detects if there is no radiation suite. (I guess I have been assuming that the SCM won't care -- correct?) That way it can work with either an empty group or no group at all.

Ya, I'd support your suggested changes to fv3 for the reasons you've mentioned. I'm going to go ahead and submit NCAR/ccpp-framework#517 as a real PR so that it can be merged in and host models can point to it, since it is a bugfix. Do you think that you really need, for example, https://github.com/NCAR/ccpp-framework/blob/0eca5c2c8885f3acadfed1f4945d19d1e97bb25f/scripts/mkstatic.py#L380C17-L380C17 changed to have a special error code? The reason why I'm reticent to do include this now is that there is momentum to switch to the next version of the CCPP framework that may have more complete error-checking and I don't want to necessarily change too much in the first version (ccpp_prebuild.py) except for bugfixes. If it is really necessary to introduce an error code different than 1 to complete the functionality in FV3 that you intend, we can bring it up with some of the other CCPP Framework folks and see if they're OK with it.

Ok, thanks for pointing that out. I suppose I don't really need to change mkstatic.py ierr if I can parse the errmsg in CCPP_driver.F90 and set that level's ierr to a unique value. I'll look into that.

@MicroTed I started a discussion here: NCAR/ccpp-framework#518

If the better error handling isn't coming any time soon, perhaps we can at least add a meaningful error code besides 1 that you can use for this purpose. It makes perfect sense, I just don't want to ignore or derail other efforts that may be ongoing that could also be useful for your (and others!) purpose. We'll see if the discussion goes anywhere.

Having to put code in to parse error message strings seems pretty tedious compared to setting integer error codes.

A conditional on errmsg works just fine in the FV3 CCPP_driver.F90, so I can use that for the time being and not mess with CCPP framework.