xarray-contrib/cf-xarray

"Unit expression cannot have a scaling factor."

VeckoTheGecko opened this issue · 4 comments

I've been working with a dataset with units of "bar" and "amp hours". The CF conventions on units say that they use the same units that UDUNITS supports (via its unit databases). Since bar and amp hours aren't listed in these databases, I opted for 100 kPa and 3600 C respectively, which seem to be valid in the UDUNITS world.

When using cf_xarray with this, I get the pint error:

invalid units for variable 'variable': 100 kPa (attribute) (reason: Unit expression cannot have a scaling factor.)

My questions:

  • Is there support in cf_xarray for scaled/shifted units? (Or is this some of the missing functionality hinted at in #225?)
  • Is this support planned in future?
  • What would be a good workaround? Doing a manual conversion from <my_units> to SI and then quantifying? Registering extra quantities? (I am very new to the world of pint, and would greatly appreciate any links to helpful resources or a quick code snippet.)

Issue tracker searches I've tried:

  • units
  • Unit expression cannot have a scaling factor
  • UDUNITS

Might be a separate issue, but I also just realised that cf_xarray interprets C as Celsius, whereas UDUNITS uses C for coulombs and °C for Celsius (UDUNITS Derived SI). Does cf_xarray work best with the symbols specified, or with singular names?

Hi @VeckoTheGecko, I believe the 100 kPa problem indeed comes from pint. Looking at the UDUNITS syntax doc, I believe pint is "missing" support for the "scaled", "offset", "logarithmic" and "grouped" string types. Indeed, this is a shortfall, as discussed in #225.

However, cf-xarray is also not bound to UDUNITS; it seems that "bar" and "amp hour" are supported. Maybe the workaround is to ingest the non-CF-compliant units first? If you want the output to be CF-compliant, you might not be able to use 100 kPa or 3600 C, but you can use kPa and coulomb.
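One way to get from 100 kPa to plain kPa is to fold the scaling factor into the data before quantifying. Here is a minimal, standard-library-only sketch. It assumes the factor is a plain leading number separated from the unit by whitespace. `split_scaled_unit` and `rescale` are hypothetical helpers, not part of cf_xarray or pint, and they do not handle the UDUNITS "offset", "logarithmic" or "grouped" forms.

```python
import re

# Hypothetical helpers (not part of cf_xarray or pint).
# Matches a leading number (e.g. "100", "1e3", "0.5") followed by a unit string.
_SCALED_UNIT = re.compile(
    r"^\s*([+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)\s+(.+?)\s*$"
)


def split_scaled_unit(units: str) -> tuple[float, str]:
    """Return (factor, unit); the factor is 1.0 when there is no leading number."""
    match = _SCALED_UNIT.match(units)
    if match is None:
        return 1.0, units.strip()
    return float(match.group(1)), match.group(2)


def rescale(values: list[float], units: str) -> tuple[list[float], str]:
    """Multiply the data by the scaling factor so the unit string is factor-free."""
    factor, unit = split_scaled_unit(units)
    return [v * factor for v in values], unit


values, unit = rescale([1.0, 2.5], "100 kPa")
print(values, unit)
```

The rescaled values and the bare unit string could then be written back to the variable and its units attribute before quantifying.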

As for the "C" issue, indeed the unit registry of cf-xarray was imported from xclim, where we chose to use it as an alternative to "°C". The reason was that, in our experience, this mistake was quite common. Or at least, it was more common to see this typo than it was to see coulomb used in climate data...

I believe pint is "missing" support for the "scaled", "offset", "logarithmic" and "grouped" string types.

A PR adding this info to https://cf-xarray.readthedocs.io/en/latest/units.html would be very welcome!

As for the "C" issue, indeed the unit registry of cf-xarray was imported from xclim, where we chose to use it as an alternative to "°C". The reason was that, in our experience, this mistake was quite common. Or at least, it was more common to see this typo than it was to see coulomb used in climate data...

I think overriding coulomb does make sense for the climate world. I haven't actually seen "C" or "℃" in the wild. "degC" is the most common one I've seen.

Thanks for the advice and quick responses, @aulemahal and @dcherian! I misunderstood the package, thinking it had parity with UDUNITS. I've now solved my issue by switching to bar and ampere_hour, which are supported.

For anyone else finding this, I found the following code useful to list all units in cf_xarray:

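from pint import application_registry as ureg
import cf_xarray.units  # imported for its side effect on pint's application registry

# Print the name of every unit known to the registry
for unit in ureg:
    print(unit)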