NOAA-GFDL/MDTF-diagnostics

Framework unable to take pressure level slices of "ta" (air temperature) data under CMIP convention

Closed this issue · 4 comments

Bug Severity

  • 1 = Minor problem that does not affect total framework functionality (e.g., computation error in a POD, problem with logging output, or an issue on a single system)

Describe the bug
My POD in development requires air temperature only on two pressure levels, 50 and 100 hPa (I will refer to these as t50 and t100). However, I am unable to get MDTF to extract these individual levels under the CMIP naming convention because tas and ta both share the same standard_name of air_temperature, resulting in a failure to translate the request as specified in my settings.jsonc file.
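
The failure mode can be sketched as follows. This is a minimal illustration, assuming the translation step looks up a variable name from the standard_name alone; it is not the framework's actual code, and `FIELDLIST` and `name_from_standard_name` are hypothetical names.

```python
# Hypothetical, simplified fieldlist: two CMIP names share one standard_name.
FIELDLIST = {
    "ta": {"standard_name": "air_temperature", "ndim": 4},
    "tas": {"standard_name": "air_temperature", "ndim": 3},
    "va": {"standard_name": "northward_wind", "ndim": 4},
}


def name_from_standard_name(standard_name, convention="CMIP"):
    """Return the single variable name matching standard_name, or raise."""
    matches = [
        name
        for name, entry in FIELDLIST.items()
        if entry["standard_name"] == standard_name
    ]
    if len(matches) != 1:
        raise ValueError(
            f"Variable name in convention '{convention}' not uniquely "
            f"determined by standard name '{standard_name}'."
        )
    return matches[0]


# northward_wind maps to exactly one entry, so the lookup succeeds.
print(name_from_standard_name("northward_wind"))

# air_temperature matches both ta and tas, so the lookup raises.
try:
    name_from_standard_name("air_temperature")
except ValueError as exc:
    print(exc)
```

Under this assumption, v100 translates cleanly because only va carries northward_wind, while t50 and t100 cannot be translated because both ta and tas carry air_temperature.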

To be clear:

  • My in-development POD works in its entirety if I do not ask it to take pressure level slices, and instead use the full 4-dimensional (time, plev, lat, lon) ta fields.
  • My in-development POD works in its entirety if I specifically write out individual files for t50 and t100 (i.e., if it's unnecessary for MDTF to take pressure level slices).
  • My settings.jsonc file does list the full 4-dimensional (time, plev, lat, lon) ta field as the alternate for the t50 and t100 entries, but the framework does not seem to understand that it is supposed to take slices: MDTF writes a rather large file (~2 GB when working with CMIP6 monthly mean AMIP/historical files) containing all pressure levels. This contrasts with what happens for other fields such as ua and va when I ask for a pressure level slice.

Steps To Reproduce
I first edited data/fieldlist_CMIP.jsonc so that the ta entry has a scalar_coord_templates attribute, since I assumed it was necessary:

"ta": {
      "standard_name": "air_temperature",
      "units": "K",
      "ndim": 4,
      "scalar_coord_templates": {"plev": "ta{value}"}
    },

The varlist portion of my settings.jsonc file looks like:

"varlist": {
    "v100": {
      "standard_name": "northward_wind",
      "units": "m s-1",
      "frequency": "mon",
      "dimensions": ["time", "lat", "lon"],
      "scalar_coordinates": {"lev": 100},
      "alternates" : ["va"]
    },
    "t100": {
      "standard_name": "air_temperature",
      "units": "K",
      "frequency": "mon",
      "dimensions": ["time", "lat", "lon"],
      "scalar_coordinates": {"lev": 100},
      "alternates" : ["ta"]
    },
    "t50": {
      "standard_name": "air_temperature",
      "units": "K",
      "frequency": "mon",
      "dimensions": ["time", "lat", "lon"],
      "scalar_coordinates": {"lev": 50},
      "alternates" : ["ta"]
    },
    "zg": {
      "standard_name" : "geopotential_height",
      "units" : "m",
      "frequency": "mon",
      "dimensions": ["time", "lev", "lat", "lon"]
    },
    "va": {
      "standard_name" : "northward_wind",
      "units" : "m s-1",
      "frequency": "mon",
      "dimensions": ["time", "lev", "lat", "lon"],
      "requirement": "alternate"
    },
    "ta": {
      "standard_name" : "air_temperature",
      "units" : "K",
      "frequency": "mon",
      "dimensions": ["time", "lev", "lat", "lon"],
      "requirement": "alternate"
    }

Log information and/or terminal output
When I run my POD with the above changes/options I get messages like the following:

WARNING: Deactivated <#JCv9:stc_eddy_heat_fluxes.t100> due to PodConfigError("Couldn't parse the settings.jsonc file: Caught exception while translating name of <#JCv9:stc_eddy_heat_fluxes.t100>: ValueError("Variable name in convention 'CMIP' not uniquely determined by standard name 'air_temperature'.").").
Request for <#JCv9:stc_eddy_heat_fluxes.t100> (='(not translated)' @ 1mo, 100 hPa) failed; looking for alternate data.
Selected alternate set #1: [<#JCIM:stc_eddy_heat_fluxes.ta>].
WARNING: Deactivated <#JCzN:stc_eddy_heat_fluxes.t50> due to PodConfigError("Couldn't parse the settings.jsonc file: Caught exception while translating name of <#JCzN:stc_eddy_heat_fluxes.t50>: ValueError("Variable name in convention 'CMIP' not uniquely determined by standard name 'air_temperature'.").").
Request for <#JCzN:stc_eddy_heat_fluxes.t50> (='(not translated)' @ 1mo, 50 hPa) failed; looking for alternate data.
Selected alternate set #1: [<#JCIM:stc_eddy_heat_fluxes.ta>].

which results in the following fetching messages shortly thereafter:

Fetching <#JDue:stc_eddy_heat_fluxes.v100> (='va' @ 1mo, 100 hPa).
Fetching <#JCIM:stc_eddy_heat_fluxes.ta> (='ta' @ 1mo).
Fetching <#JCNb:stc_eddy_heat_fluxes.zg> (='zg' @ 1mo).

Notice in the above that the framework correctly recognizes it needs to take a pressure level slice of the va data, as is evidenced by:

Preprocessing <#JDue:stc_eddy_heat_fluxes.v100> (='va' @ 1mo, 100 hPa).
Cropped date range of <#JDue:stc_eddy_heat_fluxes.v100> from '1850-01-15 -- 2014-12-15' to '1979-01-15 -- 2014-12-15'.
Converted units on <#JDue:stc_eddy_heat_fluxes.v100>.
Extracted 100 hPa level from Z axis ('plev') of <#JDue:stc_eddy_heat_fluxes.v100>.
Writing 91 mb to $WK_DIR/stc_eddy_heat_fluxes/mon/CESM2-WACCM_CMIP_historical_r1i1p1f1.v100.mon.nc

but it does not do this with the temperature data:

Preprocessing <#JCIM:stc_eddy_heat_fluxes.ta> (='ta' @ 1mo).
Cropped date range of <#JCIM:stc_eddy_heat_fluxes.ta> from '1850-01-15 -- 2014-12-15' to '1979-01-15 -- 2014-12-15'.
Converted units on <#JCIM:stc_eddy_heat_fluxes.ta>.
Writing 1731 mb to $WK_DIR/stc_eddy_heat_fluxes/mon/CESM2-WACCM_CMIP_historical_r1i1p1f1.ta.mon.nc
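
As a rough sanity check on the two file sizes logged above, their ratio can be compared with the number of pressure levels. This is a sketch that assumes both variables share the same grid, time range, and data type, and that neither file is compressed; none of that is stated in the logs.

```python
# Sizes taken from the "Writing ... mb" log lines above.
full_ta_mb = 1731  # unsliced ta file (all pressure levels)
v100_mb = 91       # v100 file (single 100 hPa level)

# If the only difference is the vertical axis, the ratio approximates
# the number of levels in the unsliced file.
ratio = full_ta_mb / v100_mb
print(f"{ratio:.1f}")
```

The ratio is close to 19, which is consistent with the 19 standard pressure levels used for CMIP6 monthly atmospheric fields, so the ta file appears to hold every level rather than one.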

Additional note
As a quick test to see if I was understanding the issue correctly, I tried editing data/fieldlist_CMIP.jsonc so that the tas entry has a different standard name:

"tas" : {
      "standard_name": "surface_air_temperature",
      "units": "K",
      "ndim": 3,
      "modifier": "atmos_height"
    },

After this change, the framework correctly takes slices from the pressure level air temperature file(s): I get console messages like the following, and the POD completes successfully.

Fetching <#Qrch:stc_eddy_heat_fluxes.v100> (='va' @ 1mo, 100 hPa).
Fetching <#Qrgw:stc_eddy_heat_fluxes.t100> (='ta' @ 1mo, 100 hPa).
Fetching <#QrkC:stc_eddy_heat_fluxes.t50> (='ta' @ 1mo, 50 hPa).
Fetching <#Qqp6:stc_eddy_heat_fluxes.zg> (='zg' @ 1mo).

Of course, this is probably not an appropriate fix since tas and ta do actually share the same standard_name in the CMIP conventions.

@zdlawrence I see the conundrum. As with issue #328, could you share your POD branch and test data (or let me know if this is the same POD) so I can replicate the problem and try to fix it?

Also, I appreciate your detailed information--it is exactly what I need to point me in the right direction instead of trying to guess what is causing the problem.

@wrongkindofdoctor - This all centers around the same POD, so the branch/ftp info from my other comment on issue #328 also applies here.

As it pertains to this issue: using the default CNRM-CM-6-1 data (with "days since" encoding) will allow the POD to complete, but the terminal output and logs should show the framework failing to take pressure level slices of the temperature data.

@zdlawrence I fixed the logic so that the framework can correctly differentiate between 3-D fields with and without a scalar_coords attribute. I tested the patch with the POD obs data you shared and with some synthetic model data (the model data in the POD tarball was in one large netCDF file rather than the required one variable per file). Please merge the updates from the develop branch into your branch, and let me know if you are still having issues.
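
One way to picture that distinction is the sketch below. It is a hypothetical illustration, not the actual patch on the develop branch, and `FIELDLIST` and `translate` are made-up names. It assumes a 3-D request that carries a scalar coordinate is treated as a slice of a field with one more dimension.

```python
# Hypothetical, simplified fieldlist with the shared standard_name.
FIELDLIST = {
    "ta": {"standard_name": "air_temperature", "ndim": 4},
    "tas": {"standard_name": "air_temperature", "ndim": 3},
}


def translate(standard_name, dimensions, scalar_coordinates=None):
    """Pick the entry whose ndim matches the request's source field."""
    # Each scalar coordinate stands for one axis sliced out of the source.
    source_ndim = len(dimensions) + len(scalar_coordinates or {})
    matches = [
        name
        for name, entry in FIELDLIST.items()
        if entry["standard_name"] == standard_name
        and entry["ndim"] == source_ndim
    ]
    if len(matches) != 1:
        raise ValueError(
            f"No unique match for '{standard_name}' with {source_ndim} dims."
        )
    return matches[0]


dims = ["time", "lat", "lon"]
# 3-D request with a scalar coordinate: resolved to the 4-D field.
print(translate("air_temperature", dims, {"lev": 100}))
# 3-D request without a scalar coordinate: resolved to the 3-D field.
print(translate("air_temperature", dims))
```

With this rule, a t100-style request resolves to ta and a plain 3-D request resolves to tas, without changing either standard_name.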

@wrongkindofdoctor Sorry for the delay on this -- your fix seems to have done the trick. My POD now correctly takes slices of the pressure level temperatures. Thanks for your help! I will close this issue.