The default handling of missingness in dataframes cannot be reached through parameters
I was surprised to find the default behavior of calling toJSON on a dataframe containing NAs was to drop the headers. After reviewing issues (#223 and others) and the paper (2.4.3), it's clear this is intentional. It's surprising to me then that I cannot reproduce this behavior using toJSON's na parameter.
Consider the following:
library(jsonlite)
# Creating missing data
iris = iris[1:3, ]
iris[1, 1:2] = NA
iris[2, 3:4] = NA
The default behavior drops the headers of missing values.
toJSON(iris, pretty = TRUE)
#> [
#>   {
#>     "Petal.Length": 1.4,
#>     "Petal.Width": 0.2,
#>     "Species": "setosa"
#>   },
#>   {
#>     "Sepal.Length": 4.9,
#>     "Sepal.Width": 3,
#>     "Species": "setosa"
#>   },
#>   {
#>     "Sepal.Length": 4.7,
#>     "Sepal.Width": 3.2,
#>     "Petal.Length": 1.3,
#>     "Petal.Width": 0.2,
#>     "Species": "setosa"
#>   }
#> ]
na = "string" keeps headers, passing the string "NA" to record missingness.
toJSON(iris, na = "string", pretty = TRUE)
#> [
#>   {
#>     "Sepal.Length": "NA",
#>     "Sepal.Width": "NA",
#>     "Petal.Length": 1.4,
#>     "Petal.Width": 0.2,
#>     "Species": "setosa"
#>   },
#>   {
#>     "Sepal.Length": 4.9,
#>     "Sepal.Width": 3,
#>     "Petal.Length": "NA",
#>     "Petal.Width": "NA",
#>     "Species": "setosa"
#>   },
#>   {
#>     "Sepal.Length": 4.7,
#>     "Sepal.Width": 3.2,
#>     "Petal.Length": 1.3,
#>     "Petal.Width": 0.2,
#>     "Species": "setosa"
#>   }
#> ]
na = "null" keeps headers, passing the value null to record missingness.
toJSON(iris, na = "null", pretty = TRUE)
#> [
#>   {
#>     "Sepal.Length": null,
#>     "Sepal.Width": null,
#>     "Petal.Length": 1.4,
#>     "Petal.Width": 0.2,
#>     "Species": "setosa"
#>   },
#>   {
#>     "Sepal.Length": 4.9,
#>     "Sepal.Width": 3,
#>     "Petal.Length": null,
#>     "Petal.Width": null,
#>     "Species": "setosa"
#>   },
#>   {
#>     "Sepal.Length": 4.7,
#>     "Sepal.Width": 3.2,
#>     "Petal.Length": 1.3,
#>     "Petal.Width": 0.2,
#>     "Species": "setosa"
#>   }
#> ]
The above is equivalent to toJSON(iris, na = NULL, pretty = TRUE).
These are the three possible values for na, and none of them reproduces the first result, where na was not specified.
I would expect the default handling of missingness in toJSON to be reachable through its na parameter.
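In the meantime, one possible workaround is a thin wrapper that leaves na out of the call whenever the dropping behavior is wanted. This is only a sketch based on the observations above: to_json_na and its "drop" option are hypothetical names of my own, not part of jsonlite, and it assumes the default behavior is only triggered when na is not supplied at all.

```r
library(jsonlite)

# Hypothetical helper (not part of jsonlite): adds a "drop" option that
# reproduces the default data frame behavior by not passing `na` at all.
to_json_na <- function(x, na = c("drop", "null", "string"), ...) {
  na <- match.arg(na)
  if (na == "drop") {
    # `na` is left unspecified, so toJSON falls back to its default handling
    toJSON(x, ...)
  } else {
    toJSON(x, na = na, ...)
  }
}

# Small example data frame with one missing value
df <- data.frame(a = c(1, NA), b = c("x", "y"))

to_json_na(df, na = "drop", pretty = TRUE)   # same call as toJSON(df, pretty = TRUE)
to_json_na(df, na = "null", pretty = TRUE)   # same call as toJSON(df, na = "null", pretty = TRUE)
```

This makes the choice programmable (for example from a config value), but it only hides the problem behind a wrapper; the underlying na parameter still has no value that selects the default.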
Any updates / progress on this front? NA-handling is incredibly important.