BUG: Unit handling for temperature difference indices
DamienIrving opened this issue · 4 comments
- icclim version: 6.1.3
- Python version: 3.10.8
Description
It appears that for indices that involve calculating the difference between two temperatures (e.g. DTR, ETR, VDTR), unit conversion is performed after the difference is taken. This is problematic if the input units are Kelvin, because the difference between two temperatures in Kelvin is equivalent to the difference in Celsius and hence no unit conversion is required.
Minimal reproducible example
files_tasmax = [
"tasmax_day_CNRM-CM5_historical_r1i1p1_19300101-19341231.nc",
"tasmax_day_CNRM-CM5_historical_r1i1p1_19350101-19391231.nc",
]
files_tasmin = [
"tasmin_day_CNRM-CM5_historical_r1i1p1_19300101-19341231.nc",
"tasmin_day_CNRM-CM5_historical_r1i1p1_19350101-19391231.nc",
]
out_f = "ETR_year_CNRM-CM5_historical_r1i1p1_1930-1939.nc" # OUTPUT FILE: annual values of ETR
icclim.index(
index_name="ETR",
in_files=[files_tasmax, files_tasmin],
var_name=["tasmax", "tasmin"],
slice_mode="year",
out_file=out_f,
)
Output received
The output (in out_f
) is a data array (with units = C) whose values range from -270 to -182 because unit conversion (i.e. subtracting 273.15 to move from K to C) appears to have been performed after the difference in temperatures (in Kelvin) was calculated.
import xarray as xr
ds = xr.open_dataset('ETR_year_CNRM-CM5_histrorical_r1i1p1_1930-1939.nc')
data_min = ds['ETR'].values.min()
data_max = ds['ETR'].values.max()
data_units = ds['ETR'].attrs['units']
print('Data minimum:', data_min)
print('Data maximum:', data_max)
print('Data units:', data_units)
Data minimum: -270.46213
Data maximum: -182.72186
Data units: C
Thanks @DamienIrving for the well documented issue.
Can we generalize this to any index involving a difference ?
This would be the following generic indices:
- MeanOfDifference (DTR)
- DifferenceOfExtremes (ETR)
- MeanOfAbsoluteOneTimeStepDifference (vDTR)
- DifferenceOfMeans (anomaly)
As a fix, we could convert to degC before doing the computation so that we would have the expected ECAD output unit (degC). However, with farennheit it means the output might be unexpected:
x = 10 degF # ~= -12 degC
y = 50 degF # == 10degC
y - x == "40 degF"
x_c = -12 degC
y_c = 10 degC
y_c - x_c = 32degC
not actual code
Otherwise we can choose to not follow the ECAD recommended unit for these indices and avoid unit conversion completely.
edit: It means the output would have the same unit as one of the input.
A related CF issue: cf-convention/discuss#101
Yes, this issue can be generalised to any index involving a difference.
That CF issue is interesting and captures the root of the problem - we're dealing with relative as opposed to absolute temperatures with these indices involving a difference.
I think I'd lean towards converting to degC before doing the calculation. I appreciate that creates a slight interpretation issue with respect to Fahrenheit temperatures, but I think it's the best option.
I finally have some time to work on a fix, I will try to publish the PR soon.