ropensci/tidync

Unintended change in types of axes in hyper_tibble output after ver. 0.4.0 update?


With the update to version 0.4.0, the types of the axis columns returned by hyper_tibble have changed. Specifically, for the sample meteorological data (#114), lat, lon, and time are now character instead of numeric. Is this change intentional?

The output of hyper_transforms remains numeric both before and after the update, and filtering with numeric comparisons still works. In version 0.4.0, it now seems necessary to convert the types manually using something like tidync("gistemp250_GHCNv4.nc") |> hyper_tibble() |> dplyr::mutate(lat = as.numeric(lat), ...), which adds a bit of extra effort.

Many thanks!

# get sample meteorological data; https://github.com/ropensci/tidync/issues/114
url <- "https://data.giss.nasa.gov/pub/gistemp/gistemp250_GHCNv4.nc.gz"
curl::curl_download(url, basename(url))
system(sprintf("gunzip %s", basename(url)))
library(tidync)
tidync("gistemp250_GHCNv4.nc") |> hyper_tibble()

# ver. 0.4.0
# 
# # A tibble: 9,034,094 × 4
#   tempanomaly lon   lat   time
#         <dbl> <chr> <chr> <chr>
# 1       0.180 -179  -47   1880-01-15
# 2       0.180 -177  -47   1880-01-15
# 3       0.180 -175  -47   1880-01-15
# ...

# ver. 0.3.0
# 
# # A tibble: 9,034,094 × 4
#   tempanomaly   lon   lat  time
#         <dbl> <dbl> <dbl> <dbl>
# 1       0.180  -179   -47 29233
# 2       0.180  -177   -47 29233
# 3       0.180  -175   -47 29233
# ...


tidync("gistemp250_GHCNv4.nc") |> hyper_transforms()

# ver. 0.3.0 and ver. 0.4.0
# 
# $lon
# # A tibble: 180 × 6
#     lon index    id name  coord_dim selected
#   <dbl> <int> <int> <chr> <lgl>     <lgl>
# 1  -179     1     1 lon   TRUE      TRUE    
# 2  -177     2     1 lon   TRUE      TRUE    
# 3  -175     3     1 lon   TRUE      TRUE    
# 4  -173     4     1 lon   TRUE      TRUE
# ...
#
# $lat 
# ...
#
# $time
# ...


# filtering by numeric values works with ver. 0.4.0
tidync("gistemp250_GHCNv4.nc") |>
  hyper_tibble(lat = lat > 80, lon = lon > 100)

# # A tibble: 1,560 × 4
#   tempanomaly lon   lat   time
#         <dbl> <chr> <chr> <chr>
# 1        7.07 101   81    1953-11-15
# 2        7.07 103   81    1953-11-15
# 3        7.07 105   81    1953-11-15
# ...
Session Info
Session info ──────────────────────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.3.1 (2023-06-16)
 os       macOS Sonoma 14.6.1
 system   aarch64, darwin20
 ui       RStudio
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       Asia/Tokyo
 date     2024-10-09
 rstudio  2023.06.2+561 Mountain Hydrangea (desktop)
 pandoc   NA

Packages ──────────────────────────────────────────────────────────────────────────────────────────────────
 package      * version    date (UTC) lib source
 cachem         1.1.0      2024-05-16 [1] CRAN (R 4.3.3)
 cellranger     1.1.0      2016-07-27 [1] CRAN (R 4.3.0)
 cli            3.6.3      2024-06-21 [1] CRAN (R 4.3.3)
 devtools       2.4.5      2022-10-11 [1] CRAN (R 4.3.0)
 digest         0.6.37     2024-08-19 [1] CRAN (R 4.3.3)
 dplyr          1.1.4      2023-11-17 [1] CRAN (R 4.3.1)
 ellipsis       0.3.2      2021-04-29 [1] CRAN (R 4.3.0)
 fansi          1.0.6      2023-12-08 [1] CRAN (R 4.3.1)
 fastmap        1.2.0      2024-05-15 [1] CRAN (R 4.3.3)
 fs             1.6.4      2024-04-25 [1] CRAN (R 4.3.1)
 funkea         0.0.2.0001 2024-09-09 [1] Github (KeachMurakami/funkea@175917b)
 generics       0.1.3      2022-07-05 [1] CRAN (R 4.3.0)
 glue           1.8.0      2024-09-30 [1] CRAN (R 4.3.3)
 htmltools      0.5.8.1    2024-04-04 [1] CRAN (R 4.3.1)
 htmlwidgets    1.6.4      2023-12-06 [1] CRAN (R 4.3.1)
 httpuv         1.6.15     2024-03-26 [1] CRAN (R 4.3.1)
 later          1.3.2      2023-12-06 [1] CRAN (R 4.3.1)
 lifecycle      1.0.4      2023-11-07 [1] CRAN (R 4.3.1)
 lubridate      1.9.3      2023-09-27 [1] CRAN (R 4.3.1)
 magrittr       2.0.3      2022-03-30 [1] CRAN (R 4.3.0)
 memoise        2.0.1      2021-11-26 [1] CRAN (R 4.3.0)
 mime           0.12       2021-09-28 [1] CRAN (R 4.3.0)
 miniUI         0.1.1.1    2018-05-18 [1] CRAN (R 4.3.0)
 pillar         1.9.0      2023-03-22 [1] CRAN (R 4.3.0)
 pkgbuild       1.4.4      2024-03-17 [1] CRAN (R 4.3.1)
 pkgconfig      2.0.3      2019-09-22 [1] CRAN (R 4.3.0)
 pkgload        1.4.0      2024-06-28 [1] CRAN (R 4.3.3)
 plantecophys   1.4-6      2021-03-31 [1] CRAN (R 4.3.0)
 profvis        0.4.0      2024-09-20 [1] CRAN (R 4.3.3)
 promises       1.3.0      2024-04-05 [1] CRAN (R 4.3.1)
 purrr          1.0.2      2023-08-10 [1] CRAN (R 4.3.0)
 R6             2.5.1      2021-08-19 [1] CRAN (R 4.3.0)
 Rcpp           1.0.13     2024-07-17 [1] CRAN (R 4.3.3)
 RcppRoll       0.3.1      2024-07-07 [1] CRAN (R 4.3.3)
 readxl         1.4.3      2023-07-06 [1] CRAN (R 4.3.0)
 remotes        2.5.0      2024-03-17 [1] CRAN (R 4.3.1)
 rlang          1.1.4      2024-06-04 [1] CRAN (R 4.3.3)
 rstudioapi     0.16.0     2024-03-24 [1] CRAN (R 4.3.1)
 sessioninfo    1.2.2      2021-12-06 [1] CRAN (R 4.3.0)
 shiny          1.9.1      2024-08-01 [1] CRAN (R 4.3.3)
 stringi        1.8.4      2024-05-06 [1] CRAN (R 4.3.1)
 stringr        1.5.1      2023-11-14 [1] CRAN (R 4.3.1)
 tibble         3.2.1      2023-03-20 [1] CRAN (R 4.3.0)
 tidyr          1.3.1      2024-01-24 [1] CRAN (R 4.3.1)
 tidyselect     1.2.1      2024-03-11 [1] CRAN (R 4.3.1)
 timechange     0.3.0      2024-01-18 [1] CRAN (R 4.3.1)
 urlchecker     1.0.1      2021-11-30 [1] CRAN (R 4.3.0)
 usethis        3.0.0      2024-07-29 [1] CRAN (R 4.3.3)
 utf8           1.2.4      2023-10-22 [1] CRAN (R 4.3.1)
 vctrs          0.6.5      2023-12-01 [1] CRAN (R 4.3.1)
 winter         0.0.0.9002 2023-11-27 [1] Github (KeachMurakami/winter@6b135f6)
 xtable         1.8-4      2019-04-21 [1] CRAN (R 4.3.0)

Definitely not intended, I'll explore. Thank you!

OK, this was introduced as part of the CF timestamp change, @pvanlaake. I haven't isolated it yet but will try to do so in the coming days. I should have had a test for that, whoops.

This is caused by an oversight: in the code the tibble was constructed from the dimnames() of the tidync object and these are indeed of character type. The code has been updated and a PR is waiting to be merged into the main branch.
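To illustrate the mechanism in isolation, here is a minimal base R sketch. It is not tidync's actual code: the toy array and the names a, from_names, and from_coords are made up for this example. It only shows that dimnames() are always stored as character, so a table built from them has character axes, while a table built from the coordinate vectors keeps them numeric.

```r
# Toy 2 x 3 array standing in for a NetCDF variable (hypothetical data).
lat <- c(-47, -45)
lon <- c(-179, -177, -175)
a <- array(seq_len(6), dim = c(2, 3), dimnames = list(lat = lat, lon = lon))

# dimnames() stores names as character, even though numeric vectors were supplied.
class(dimnames(a)$lon)

# A long table built from the dimnames therefore has character axis columns.
from_names <- as.data.frame(as.table(a), stringsAsFactors = FALSE,
                            responseName = "value")
class(from_names$lon)

# A long table built from the coordinate vectors keeps the axes numeric.
# expand.grid() varies the first factor fastest, matching R's column-major order.
from_coords <- expand.grid(lat = lat, lon = lon)
from_coords$value <- as.vector(a)
class(from_coords$lon)
```

Both tables hold the same values in the same row order; only the column types of the axes differ.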

Note that for the "time" axis, the timestamps in the tibble are, and should be, of character type. Under the CF Metadata Conventions there are 9 different calendars and only 3 are compatible with POSIXt. Character strings can accommodate all of them. In this particular data set the "calendar" attribute is not given, meaning that it is assumed to be a "standard", POSIXt-compatible calendar, but for consistency all timestamps are given as a character string. You can convert to Date by adding a date column, using the as.Date() function.
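As a rough sketch of that conversion, assuming the timestamps come from a standard, POSIXt-compatible calendar as in this data set: the data frame out below is a hand-made stand-in for a few rows of hyper_tibble() output, not a real result, and the column name date is an arbitrary choice.

```r
# Hand-made stand-in for a few rows of hyper_tibble() output (values taken from the issue).
out <- data.frame(
  tempanomaly = c(0.180, 7.07),
  time = c("1880-01-15", "1953-11-15"),
  stringsAsFactors = FALSE
)

# Add a Date column next to the character timestamps.
# This assumes a standard calendar; timestamps from other CF calendars
# (for example a 360-day calendar) may not be representable as Date.
out$date <- as.Date(out$time)
class(out$date)
```

Keeping the original time column alongside the new one preserves the CF-formatted string in case the conversion is not appropriate for a given file.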

Thanks @pvanlaake! I agree about the timestamps.