ropensci/geojsonio

Digits with `geojson_list()`

josiekre opened this issue · 16 comments

There does not seem to be control over the number of digits output when passing a data frame to geojson_list(). No matter how many significant digits there are in the lon and lat columns, the resulting list has at most five digits to the right of the decimal, e.g. -77.11739, 38.94512.

I see from #96 that digits were thought about at some point, but I'm not seeing how this translates into geojson_list(). What am I missing?

Thanks.

Session Info
> devtools::session_info()
Session info -----------------------------------------------------------------------------------
 setting  value                       
 version  R version 3.4.2 (2017-09-28)
 system   x86_64, darwin15.6.0        
 ui       RStudio (1.1.383)           
 language (EN)                        
 collate  en_US.UTF-8                 
 tz       Europe/Copenhagen           
 date     2018-08-20                  

Packages ---------------------------------------------------------------------------------------
 package      * version    date       source                             
 assertthat     0.2.0      2017-04-11 CRAN (R 3.4.0)                     
 base         * 3.4.2      2017-10-04 local                              
 bindr          0.1.1      2018-03-13 cran (@0.1.1)                      
 bindrcpp     * 0.2.2      2018-03-29 cran (@0.2.2)                      
 citycastgtn  * 0.0.0.9000 <NA>       local                              
 cli            1.0.0      2017-11-05 CRAN (R 3.4.2)                     
 commonmark     1.4        2017-09-01 CRAN (R 3.4.1)                     
 compiler       3.4.2      2017-10-04 local                              
 crayon         1.3.4      2017-09-16 CRAN (R 3.4.1)                     
 curl           3.0        2017-10-06 CRAN (R 3.4.2)                     
 datasets     * 3.4.2      2017-10-04 local                              
 devtools       1.13.4     2017-11-09 CRAN (R 3.4.2)                     
 digest         0.6.15     2018-01-28 cran (@0.6.15)                     
 dplyr          0.7.6      2018-06-29 cran (@0.7.6)                      
 foreign        0.8-69     2017-06-22 CRAN (R 3.4.2)                     
 geojson        0.2.0      2017-11-08 CRAN (R 3.4.2)                     
 geojsonio      0.5.0.9100 2018-03-22 Github (ropensci/geojsonio@5c39fe7)
 glue           1.3.0      2018-07-17 cran (@1.3.0)                      
 graphics     * 3.4.2      2017-10-04 local                              
 grDevices    * 3.4.2      2017-10-04 local                              
 grid           3.4.2      2017-10-04 local                              
 hms            0.3        2016-11-22 CRAN (R 3.4.0)                     
 httr           1.3.1      2017-08-20 CRAN (R 3.4.1)                     
 jqr            1.0.0      2017-09-28 CRAN (R 3.4.2)                     
 jsonlite       1.5        2017-06-01 CRAN (R 3.4.0)                     
 jsonvalidate   1.0.0      2016-06-13 CRAN (R 3.4.0)                     
 lattice        0.20-35    2017-03-25 CRAN (R 3.4.2)                     
 lazyeval       0.2.1      2017-10-29 CRAN (R 3.4.2)                     
 lubridate      1.7.4      2018-04-11 CRAN (R 3.4.4)                     
 magrittr     * 1.5        2014-11-22 CRAN (R 3.4.0)                     
 maptools       0.9-2      2017-03-25 cran (@0.9-2)                      
 memoise        1.1.0      2017-04-21 CRAN (R 3.4.0)                     
 methods      * 3.4.2      2017-10-04 local                              
 pillar         1.1.0      2018-01-14 cran (@1.1.0)                      
 pkgconfig      2.0.1      2017-03-21 CRAN (R 3.4.0)                     
 purrr          0.2.4      2017-10-18 CRAN (R 3.4.2)                     
 R6             2.2.2      2017-06-17 CRAN (R 3.4.0)                     
 Rcpp           0.12.18    2018-07-23 cran (@0.12.18)                    
 readr          1.1.1      2017-05-16 CRAN (R 3.4.0)                     
 rgdal          1.2-18     2018-03-17 cran (@1.2-18)                     
 rgeos          0.3-26     2017-10-31 cran (@0.3-26)                     
 RJSONIO        1.3-0      2014-07-28 CRAN (R 3.4.0)                     
 rlang          0.2.1.9000 2018-07-30 Github (r-lib/rlang@d97e73d)       
 roxygen2       6.0.1      2017-02-06 CRAN (R 3.4.0)                     
 rstudioapi     0.7        2017-09-07 CRAN (R 3.4.1)                     
 sp             1.3-1      2018-06-05 cran (@1.3-1)                      
 stats        * 3.4.2      2017-10-04 local                              
 stringi        1.2.2      2018-05-02 cran (@1.2.2)                      
 stringr        1.3.1      2018-05-10 cran (@1.3.1)                      
 tibble         1.4.2      2018-01-22 CRAN (R 3.4.3)                     
 tidyr          0.8.0      2018-01-29 cran (@0.8.0)                      
 tidyselect     0.2.4      2018-02-26 cran (@0.2.4)                      
 tools          3.4.2      2017-10-04 local                              
 utf8           1.1.3      2018-01-03 cran (@1.1.3)                      
 utils        * 3.4.2      2017-10-04 local                              
 V8             1.5        2017-04-25 CRAN (R 3.4.0)                     
 withr          2.1.2      2018-05-02 Github (jimhester/withr@79d7b0d)   
 xml2           1.2.0      2018-01-24 cran (@1.2.0)                      
 yaml           2.1.19     2018-05-01 cran (@2.1.19)  

Here is a simple example:

> (s <- dplyr::data_frame(
      id = c("A", "B"),
      lat = c(38.949019, 39.008222),
      lon = c(-77.080369, -76.780363)
  ))
# A tibble: 2 x 3
  id      lat   lon
  <chr> <dbl> <dbl>
1 A      38.9 -77.1
2 B      39.0 -76.8

Let's print out the lon column from the data frame to make sure all the digits we put in are still there. We are looking for 6 digits after the decimal and 6 come out:

> format(s$lon, digits = 10)
[1] "-77.080369" "-76.780363"

Now we'll convert it.

> (g <- geojsonio::geojson_list(s, lat = "lat", lon = "lon"))
$type
[1] "FeatureCollection"

$features
$features[[1]]
$features[[1]]$type
[1] "Feature"

$features[[1]]$geometry
$features[[1]]$geometry$type
[1] "Point"

$features[[1]]$geometry$coordinates
[1] -77.08037  38.94902


$features[[1]]$properties
$features[[1]]$properties$id
[1] "A"



$features[[2]]
$features[[2]]$type
[1] "Feature"

$features[[2]]$geometry
$features[[2]]$geometry$type
[1] "Point"

$features[[2]]$geometry$coordinates
[1] -76.78036  39.00822


$features[[2]]$properties
$features[[2]]$properties$id
[1] "B"




attr(,"class")
[1] "geo_list"
attr(,"from")
[1] "data.frame"

Now when we look again, we have lost a digit (5 instead of 6):

> is(g$features[[1]]$geometry$coordinates)
[1] "numeric" "vector" 
> format(g$features[[1]]$geometry$coordinates, digits = 10)
[1] "-77.08037" " 38.94902"
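As a side check (a minimal base R sketch, not part of the original report), display rounding can be told apart from a value that has actually lost precision. The names `lon_full` and `lon_lost` are illustrative only; the numbers are taken from the example above.

```r
# Illustrative values: the longitude that went in, and the one that came out
lon_full <- -77.080369
lon_lost <- -77.08037

# print() honours getOption("digits"), so it can hide digits that are still stored
print(lon_full)

# Asking format() for more significant digits reveals what is really stored
full_txt <- format(lon_full, digits = 10)
lost_txt <- format(lon_lost, digits = 10)
full_txt
lost_txt

# The two numbers are genuinely different, so this is not just a printing effect
lon_full == lon_lost
```

If the sixth decimal were only hidden by printing, `format(..., digits = 10)` would bring it back, as it does for the original data frame column.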

thanks @josiekre will have a look

one thing is that you're on an older dev version, we're currently on 0.6.0.9100, though I don't think that affects the issue at hand.

have you played with the digits option in R? you can get it by getOption('digits') and set it by options(digits = 8)

options(digits = 8)
s <- dplyr::data_frame(
  id = c("A", "B"),
  lat = c(38.949019, 39.008222),
  lon = c(-77.080369, -76.780363)
)
(g <- geojsonio::geojson_list(input = s, lat = "lat", lon = "lon"))
lapply(g$features, "[[", c("geometry", "coordinates"))
#> [[1]]
#> [1] -77.080369  38.949019
#> 
#> [[2]]
#> [1] -76.780363  39.008222

let me know what you think

I thought I played with that. In the file output (as opposed to interactive output), I don't think it made a difference. But I will rerun and report back in a couple days. Thx @sckott.

A call to options(digits = 8) does impact the writing out, but I cannot figure out the relationship between that and digits in the call to geojson_list(). Can you elaborate on that?

Sorry for the delay @josiekre - was on vacation.

The number of digits written to the console depends on the R option digits, so you can set that outside of the geojsonio package yourself and it affects what geojson_list() returns. Does that make sense?

any thoughts @josiekre ?

I personally think it would be nicer to control the output digits within the function call. It would be better to create a controlled, repeatable function call. I think of options() as things that impact my current R session interactively.

It could be smart to default like this:

geojson_list(..., digits = options('digits')$digits)

thanks @josiekre for the suggestion. i'll think about that.

@josiekre I've experimented with this. There's no way I can see to allow the user to change digits in the output AND not affect the global digits option.

You can format numbers directly with e.g., format, sprintf, etc. but we can't feasibly do that with the complex nested lists, etc. we're dealing with.
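For flat vectors, direct formatting might look like the sketch below (base R only; the variable names are illustrative). Note that `sprintf()` returns character strings rather than numbers, which is part of why it is awkward to apply to nested list structures that are expected to hold numeric coordinates.

```r
# Illustrative longitudes from the example earlier in the thread
lon <- c(-77.080369, -76.780363)

# Fixed six decimal places; the result is a character vector, not numeric
lon_txt <- sprintf("%.6f", lon)
lon_txt
is.character(lon_txt)

# Converting back gives numbers again, but how they print still depends on options(digits)
lon_num <- as.numeric(lon_txt)
is.numeric(lon_num)
```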

I think the best option is to document that if you want to change digits, set them with options(digits = x). Does that sound okay?
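One way to keep the change to `options(digits = x)` scoped to a single call is to set the option and restore it on exit. The sketch below uses only base R; `with_digits()` is a hypothetical helper invented for illustration and is not part of geojsonio.

```r
# Hypothetical helper (not part of geojsonio): evaluate `expr` with a temporary
# value for the global `digits` option, then restore the previous value.
with_digits <- function(digits, expr) {
  old <- options(digits = digits)   # options() returns the previous settings
  on.exit(options(old), add = TRUE) # restore them even if `expr` errors
  force(expr)                       # `expr` is lazy, so it is evaluated here
}

digits_before <- getOption("digits")
pi_txt <- with_digits(10, format(pi))
pi_txt
digits_after <- getOption("digits")

# The session-wide option is unchanged afterwards
identical(digits_before, digits_after)
```

Assuming `geojson_list()` picks up the option as described in this thread, a call such as `with_digits(8, geojsonio::geojson_list(s, lat = "lat", lon = "lon"))` would be the analogous usage. The `withr` package offers `with_options()` for the same purpose.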

The jsonlite::toJSON() function has a digits option that works in this manner. Have you seen this?
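For reference, a small sketch of the `digits` argument in `jsonlite::toJSON()`, which sets the maximum number of decimal places per call without touching global options. The coordinate values are taken from the example above and the variable names are illustrative.

```r
library(jsonlite)

# Illustrative coordinate pair (lon, lat) from the example in this thread
coords <- c(-77.080369, 38.949019)

# digits is the maximum number of decimal places written for numbers
json_default <- toJSON(coords)             # uses toJSON's own default for digits
json_six     <- toJSON(coords, digits = 6) # keep up to six decimals
json_max     <- toJSON(coords, digits = NA) # NA requests maximum precision

json_default
json_six
json_max
```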

thanks @josiekre yes, i have seen that, but had forgotten about it. In geojson_list some functions do go through jsonlite but some do not.

@jeroen is there a way to use your jsonlite approach to controlling digits here? It looks like you drop down to C, as far as I can tell.

@jeroen any thoughts on this ☝️ ?

@sckott I might have a simple solution by updating geojson_rw to take precision as a variable since it is used in geojson_write which is called by geojson_rw. I already have the code updated and it works for my use case. I can submit a PR if you think it would be useful for others and not break anything anywhere else. In my testing, it hasn't affected anything else that I have noticed.

thanks @ChrisJones687 ! a PR would be great.

So PR #152 added the ability to manipulate digits for sp class objects, but we don't have a solution for data.frames, vectors, lists, etc.

I tried using sf internally with data.frames just to see if it would be feasible, but it's much slower than what we have now. Oh, and I didn't even check that we can manipulate precision through this route, but I assume we can. Anyway ...

# Experimental: round-trip the data.frame through sf and a temporary GeoJSON file.
# guess_latlon() and as.geo_list() are geojsonio internals.
geojson_list_data.frame <- function(input, lat = NULL, lon = NULL, group = NULL,
                                    geometry = "point", type = "FeatureCollection", ...) {

  # Resolve the latitude/longitude column names (guessed if not supplied)
  tmp <- guess_latlon(names(input), lat, lon)
  tfile <- tempfile(fileext = ".geojson")
  on.exit(unlink(tfile), add = TRUE)
  # sf expects coords as c(x, y), i.e. longitude first, which matches GeoJSON order
  tmpsf <- sf::st_as_sf(input, coords = c(tmp$lon, tmp$lat))
  sf::st_write(tmpsf, tfile, quiet = TRUE)
  # Read the file back as a plain nested list (no data.frame/matrix simplification)
  xx <- as.geo_list(
    jsonlite::fromJSON(tfile, simplifyVector = TRUE, simplifyDataFrame = FALSE,
      simplifyMatrix = FALSE),
    "data.frame")
  xx$name <- NULL  # drop the layer name written into the FeatureCollection
  return(xx)
}
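On the unchecked question of precision through this route: the GDAL GeoJSON driver documents a `COORDINATE_PRECISION` layer creation option, and `sf::st_write()` forwards `layer_options` to the driver. The sketch below is an assumption about how that could be wired up, not something tested against geojsonio; the data and variable names are illustrative.

```r
library(sf)

# Illustrative data frame mirroring the example earlier in the thread
pts <- data.frame(
  id  = c("A", "B"),
  lat = c(38.949019, 39.008222),
  lon = c(-77.080369, -76.780363)
)

# coords are given as c(x, y), i.e. longitude then latitude
pts_sf <- sf::st_as_sf(pts, coords = c("lon", "lat"))

# Ask the GDAL GeoJSON driver to write at most 3 decimals per coordinate
tfile <- tempfile(fileext = ".geojson")
sf::st_write(pts_sf, tfile, layer_options = "COORDINATE_PRECISION=3", quiet = TRUE)

# Inspect the raw file contents
geojson_txt <- paste(readLines(tfile), collapse = "\n")
cat(geojson_txt)
```

Whether this is fast enough for geojsonio's use case is a separate question, as noted above.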