r-quantities/units

Simplification of units fails for `hectares`

bart1 opened this issue · 3 comments

bart1 commented

I just encountered the following case. I would have expected it to behave like the second case, where the units become [1]. A simple workaround is to convert the units from ha to m^2 beforehand, but it would be nice if this worked out of the box:

require(units)
#> Loading required package: units
#> udunits database from /usr/share/xml/udunits/udunits2.xml
# expected units [1] but becomes [ha/km^2]
set_units(1,"ha")*set_units(4,'1/km^2')
#> 4 [ha/km^2]
# works as expected
set_units(1,"m^2")*set_units(4,'1/km^2')
#> 4e-06 [1]
# also does correct conversion of units with hectares
set_units(1,"ha")+set_units(4,'km^2')
#> 401 [ha]
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.1 (2023-06-16)
#>  os       Ubuntu 22.04.3 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Europe/Amsterdam
#>  date     2023-10-25
#>  pandoc   2.19.2 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  cli           3.6.1   2023-03-23 [1] CRAN (R 4.3.1)
#>  digest        0.6.33  2023-07-07 [1] CRAN (R 4.3.1)
#>  evaluate      0.21    2023-05-05 [1] CRAN (R 4.3.1)
#>  fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.3.1)
#>  fs            1.6.3   2023-07-20 [1] CRAN (R 4.3.1)
#>  glue          1.6.2   2022-02-24 [1] CRAN (R 4.3.1)
#>  htmltools     0.5.5   2023-03-23 [1] CRAN (R 4.3.1)
#>  knitr         1.43    2023-05-25 [1] CRAN (R 4.3.1)
#>  lifecycle     1.0.3   2022-10-07 [1] CRAN (R 4.3.1)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.3.1)
#>  purrr         1.0.1   2023-01-10 [1] CRAN (R 4.3.1)
#>  R.cache       0.16.0  2022-07-21 [1] CRAN (R 4.3.1)
#>  R.methodsS3   1.8.2   2022-06-13 [1] CRAN (R 4.3.1)
#>  R.oo          1.25.0  2022-06-12 [1] CRAN (R 4.3.1)
#>  R.utils       2.12.2  2022-11-11 [1] CRAN (R 4.3.1)
#>  Rcpp          1.0.11  2023-07-06 [1] CRAN (R 4.3.1)
#>  reprex        2.0.2   2022-08-17 [1] CRAN (R 4.3.1)
#>  rlang         1.1.1   2023-04-28 [1] CRAN (R 4.3.1)
#>  rmarkdown     2.23    2023-07-01 [1] CRAN (R 4.3.1)
#>  rstudioapi    0.15.0  2023-07-07 [1] CRAN (R 4.3.1)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.1)
#>  styler        1.10.1  2023-06-05 [1] CRAN (R 4.3.1)
#>  units       * 0.8-4   2023-09-13 [1] CRAN (R 4.3.1)
#>  vctrs         0.6.3   2023-06-14 [1] CRAN (R 4.3.1)
#>  withr         2.5.0   2022-03-03 [1] CRAN (R 4.3.1)
#>  xfun          0.39    2023-04-20 [1] CRAN (R 4.3.1)
#>  yaml          2.3.7   2023-01-23 [1] CRAN (R 4.3.1)
#> 
#>  [1] /home/bart/R/x86_64-pc-linux-gnu-library/4.3
#>  [2] /usr/local/lib/R/site-library
#>  [3] /usr/lib/R/site-library
#>  [4] /usr/lib/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

Created on 2023-10-25 with reprex v2.0.2

This is expected behavior. The main limitation is that we cannot rely on UDUNITS2 for simplification, because it only simplifies to SI units. Therefore, we have our own method (if you are curious, it's here) that extracts all the symbols in the numerator and denominator and simply cancels out the ones that are the same or at least convertible.

Here, you have ha in the numerator, and two km in the denominator. km cannot be converted to ha, and therefore no simplification is applied. So, actually, it fails for compositions of units, like km^2. This is something that we cannot address (reasonably), because solving this for the general case would require either (1) reimplementing our own UDUNITS library with this capability, or (2) testing all possible compositions of units, which is a combinatorial problem that escalates very quickly.
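To make the symbol-level comparison concrete, here is a small sketch using `ud_are_convertible()` from the units package. It only illustrates the convertibility checks described above, not the package's actual simplification code, and the variable names are made up for the example.

```r
library(units)

# Symbol by symbol: hectare is an area, kilometre is a length
ha_vs_km <- ud_are_convertible("ha", "km")

# Metre and kilometre are both lengths, so the m^2 case can cancel
m_vs_km <- ud_are_convertible("m", "km")

# Taken as a whole, hectare and km^2 are both areas
ha_vs_km2 <- ud_are_convertible("ha", "km^2")

c(ha_vs_km = ha_vs_km, m_vs_km = m_vs_km, ha_vs_km2 = ha_vs_km2)
```

Only the first check should come back `FALSE`, which matches the behaviour in the report: `ha` has no single symbol in the denominator to pair with, even though it is convertible to the composed unit `km^2`.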

bart1 commented

Sounds reasonable. I was a bit surprised that it worked for the summation and not the division, which is why I thought I would raise it. In my case, changing the input units of a script suddenly changed the outcome by several orders of magnitude.

Maybe some obvious cases like this one could be caught with a call to ud_are_convertible on the full numerator and denominator in the simplify function?

But then you multiply by just another unit and it stops working, leading to more confusion. I don't think that adding more complexity is worth it. As you said, if the conversion is explicitly requested, then it works, so no big deal. Automatic simplification is just a convenience feature.
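For anyone landing here with the same problem, a minimal sketch of the explicit conversion, assuming the same inputs as in the report (the names `area`, `density` and `count` are just for illustration):

```r
library(units)

area    <- set_units(1, "ha")
density <- set_units(4, "1/km^2")

# Convert the area to km^2 first so that the km symbols can cancel
count <- set_units(area, "km^2") * density
count
```

Because both factors are then expressed in km, the automatic simplification has matching symbols to cancel, the same way it does in the m^2 example from the original report.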