E3SM-Project/e3sm_diags

[Refactor]: CDAT Migration: `cdms2.open` slice flag adds additional time coordinate which affects downstream time series operations and diffs with Xarray code

Closed this issue · 6 comments

Is your feature request related to a problem?

In the CDAT version of the Dataset class, cdms2.open is being called with a slice flag (either "co"/"ccb"). This can cause result in major differences between the refactored Xarray code and the CDAT code

Related lines of code on main:

In #754, "co" is being set because time coordinates start at the beginning of the month. This adds an additional time coordinate point to the end of the time series file, which affects the subsequent climatology and spatial averaging calculations. I found omitting the extra time coordinate point before calculating the climatology and spatial averaging results in identical results to the xCDAT version.

Describe the solution you'd like

Should we consider a feature on the slice flag? I found almost no documentation on how to use this slice flag.

TODO

  • 1. Find out what "co" and "ccb" do when opening datasets
  • 2. Implement a function that mimics these two cases

Describe alternatives you've considered

No response

Additional context

  • Documentation: https://cdms.readthedocs.io/en/latest/manual/cdms_2.html#axis-methods-additional-to-coordinateaxis-cont-d
    • [left, right, indicator, cycle] -> e.g,. "co", "ccb"
    • "c" - interval is closed
    • "o" - interval is open
    • 'b' -select axis elements for which the corresponding cell boundary intersects the interval
    • Default is "ccn"
      "
  • The first 2 letters represents the bounds of the retrievd segment, they can be "c" or "o"
  • The third letter represents the search method, it can be "b" ,"n", "e", or "s"
    • The cell will be considered valid if the bounds ("b") or ("n") node are within the interval defined

Nice finding...I tried a few times to find docs on those slice options, and only found them through a xcdat discussion by @jypeter. The link to the documentation is here:https://cdms.readthedocs.io/en/latest/manual/cdms_2.html#axis-methods-additional-to-coordinateaxis-cont-d

I thought there was discussion in xcdat to consider this slicing feature, but not sure if there was a consensus.

Thanks @chengzhuzhang! I should have documented this when I ran into it while implementing the new Xarray climatology function in April. It came back to haunt me.

The good news is that I confirmed the slice flag is what is producing the large differences for the specific variables in #658 (specifically NET_FLUX_SRF and RESTOM spatial averages). Here is my validation notebook with those findings: https://github.com/E3SM-Project/e3sm_diags/blob/cdat-migration-fy24/auxiliary_tools/cdat_regression_testing/671-lat-lon/12-4-23-qa-no-cdms-slice.ipynb.

I re-ran main without the slice flag and found results are basically identical! So there is no bug that I'm aware of in the cdat-migration-fy24 branch so far.

I thought there was discussion in xcdat to consider this slicing feature, but not sure if there was a consensus.

Yeah we had a discussion but it fell off the radar. If we want to produce nearly 100% identical results with the CDAT version of e3sm_diags, we should consider revisiting this topic.

@chengzhuzhang the document you probably want about the slices, is page 9 of Unlocking_CDATs_Best_Kept_Secrets.pdf, available through xCDAT/xcdat#170. The option is probably also described in cdms5.pdf, but I think the figure in the Secrets document is better

At our meeting on 1/16, we thought of three options:

  1. Implement this feature in e3sm_diags
    • Optionally, port over this feature in xCDAT in the future (if there is user desire, demand)
  2. Implement this feature in xCDAT
    • Seems like a low priority feature request and xCDAT follows its own development cycle
  3. Update code on main to remove slice flag operation
    • I did this temporarily in #754 and confirmed removing the slice flag makes results nearly identical.
    • However, it is not recommended because it will affect all past production results.

For this GitHub issue, we decided to go with 1. as a near-term solution.

Closed by #750 and #815