cerfacs-globc/icclim

BUG: Unit handling for temperature difference indices

DamienIrving opened this issue · 4 comments

  • icclim version: 6.1.3
  • Python version: 3.10.8

Description

It appears that for indices that involve calculating the difference between two temperatures (e.g. DTR, ETR, VDTR), unit conversion is performed after the difference is taken. This is problematic if the input units are Kelvin, because the difference between two temperatures in Kelvin is equivalent to the difference in Celsius and hence no unit conversion is required.

Minimal reproducible example

Example 2: Index ETR

files_tasmax = [
    "tasmax_day_CNRM-CM5_historical_r1i1p1_19300101-19341231.nc",
    "tasmax_day_CNRM-CM5_historical_r1i1p1_19350101-19391231.nc",
]
files_tasmin = [
    "tasmin_day_CNRM-CM5_historical_r1i1p1_19300101-19341231.nc",
    "tasmin_day_CNRM-CM5_historical_r1i1p1_19350101-19391231.nc",
]

out_f = "ETR_year_CNRM-CM5_historical_r1i1p1_1930-1939.nc"  # OUTPUT FILE: annual values of ETR

icclim.index(
    index_name="ETR",
    in_files=[files_tasmax, files_tasmin],
    var_name=["tasmax", "tasmin"],
    slice_mode="year",
    out_file=out_f,
)

Output received

The output (in out_f) is a data array (with units = C) whose values range from -270 to -182 because unit conversion (i.e. subtracting 273.15 to move from K to C) appears to have been performed after the difference in temperatures (in Kelvin) was calculated.

import xarray as xr

ds = xr.open_dataset('ETR_year_CNRM-CM5_histrorical_r1i1p1_1930-1939.nc')

data_min = ds['ETR'].values.min()
data_max = ds['ETR'].values.max()
data_units = ds['ETR'].attrs['units']

print('Data minimum:', data_min)
print('Data maximum:', data_max)
print('Data units:', data_units)
Data minimum: -270.46213
Data maximum: -182.72186
Data units: C
bzah commented

Thanks @DamienIrving for the well documented issue.

Can we generalize this to any index involving a difference ?
This would be the following generic indices:

  • MeanOfDifference (DTR)
  • DifferenceOfExtremes (ETR)
  • MeanOfAbsoluteOneTimeStepDifference (vDTR)
  • DifferenceOfMeans (anomaly)

As a fix, we could convert to degC before doing the computation so that we would have the expected ECAD output unit (degC). However, with farennheit it means the output might be unexpected:

x = 10 degF # ~= -12 degC
y = 50 degF # == 10degC
y - x == "40 degF"

x_c =  -12 degC
y_c = 10 degC
y_c - x_c = 32degC

not actual code

Otherwise we can choose to not follow the ECAD recommended unit for these indices and avoid unit conversion completely.
edit: It means the output would have the same unit as one of the input.

A related CF issue: cf-convention/discuss#101

Yes, this issue can be generalised to any index involving a difference.

That CF issue is interesting and captures the root of the problem - we're dealing with relative as opposed to absolute temperatures with these indices involving a difference.

I think I'd lean towards converting to degC before doing the calculation. I appreciate that creates a slight interpretation issue with respect to Fahrenheit temperatures, but I think it's the best option.

bzah commented

I finally have some time to work on a fix, I will try to publish the PR soon.