Option to remove prefix with how="bind"
psads-git opened this issue · 4 comments
Take your own example:
pokedex_wide <- rrapply(pokedex, how = "bind")
The column names of the resulting data frame all have the prefix pokemon. It would be useful to have an easy way to remove the prefix.
Thanks in advance!
In the example, pokemon is the name of the first list layer. To remove the pokemon prefix, you can just index into the first element of the list:
library(rrapply)
data("pokedex")
pokedex_wide <- rrapply(pokedex[[1]], how = "bind")
head(pokedex_wide[, 1:5], n = 5)
#> id num name img
#> 1 1 001 Bulbasaur http://www.serebii.net/pokemongo/pokemon/001.png
#> 2 2 002 Ivysaur http://www.serebii.net/pokemongo/pokemon/002.png
#> 3 3 003 Venusaur http://www.serebii.net/pokemongo/pokemon/003.png
#> 4 4 004 Charmander http://www.serebii.net/pokemongo/pokemon/004.png
#> 5 5 005 Charmeleon http://www.serebii.net/pokemongo/pokemon/005.png
#> type
#> 1 Grass, Poison
#> 2 Grass, Poison
#> 3 Grass, Poison
#> 4 Fire
#> 5 Fire
I am not sure there is much added value in having a dedicated option for removing this prefix, or is there something I am missing?
Thanks, Joris, for your answer. In the example below, however, I guess it is not that easy to remove the value. prefix:
library(tidyverse)
library(rrapply)
eg <- tibble(user_id = c("10001", "10002"),
             data = c("{'key': 'age', 'value': {'max': 40, 'min': 31}}",
                      "{'key': 'age', 'value': {'max': 30, 'min': 21}}"))
map(gsub("'", "\"", eg$data), jsonlite::fromJSON) %>%
  rrapply(condition = \(x, .xname) .xname %in% c("min", "max"), how = "bind")
#> value.max value.min
#> 1 40 31
#> 2 30 21
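When the bound column names carry a path prefix as in the output above, one possible workaround is to strip the prefix from the column names after binding. The following is a minimal base R sketch under that assumption; the data frame wide is built by hand to stand in for the result above and is not produced by rrapply here.

```r
# Hand-built stand-in for the bound result shown above (illustrative only)
wide <- data.frame(value.max = c(40L, 30L), value.min = c(31L, 21L))

# Drop a leading "value." from every column name; other names stay untouched
names(wide) <- sub("^value\\.", "", names(wide))

print(wide)
```

The anchored pattern only removes the prefix at the start of a name, so a column that merely contains "value." further along its path would keep that part.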
@psads-git: this has been addressed in release v1.2.5.
The data.frame column names in how = "bind" now only concatenate child list names instead of full path names. In your previous example, the result depends on which depth layer is expanded into individual data.frame columns. By default, this is the minimal depth across leaf nodes (i.e. the key/value layer):
library(rrapply)
eg <- list(
  list(key = "age", value = list(max = 40L, min = 31L)),
  list(key = "age", value = list(max = 30L, min = 21L))
)
str(eg)
#> List of 2
#> $ :List of 2
#> ..$ key : chr "age"
#> ..$ value:List of 2
#> .. ..$ max: int 40
#> .. ..$ min: int 31
#> $ :List of 2
#> ..$ key : chr "age"
#> ..$ value:List of 2
#> .. ..$ max: int 30
#> .. ..$ min: int 21
## bind at depth 2
rrapply(eg, how = "bind")
#> key value.max value.min
#> 1 age 40 31
#> 2 age 30 21
In this case, the column names start from the names key and value... (for example, value.max).
If we bind child lists at the (deeper) list layer of min and max, the column names will start from min... and max...:
## bind at depth 3
rrapply(eg, how = "bind", options = list(coldepth = 3))
#> max min
#> 1 40 31
#> 2 30 21
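To make the idea of binding at depth 3 more concrete, the same table can be built by hand in base R for this particular list. This is only an illustrative sketch and not how rrapply works internally; the names rows and manual are made up for the example.

```r
eg <- list(
  list(key = "age", value = list(max = 40L, min = 31L)),
  list(key = "age", value = list(max = 30L, min = 21L))
)

# Treat the depth-3 child list (value) of each element as one row
rows <- lapply(eg, function(x) as.data.frame(x$value))

# Stack the one-row data frames; the column names come from min and max only
manual <- do.call(rbind, rows)

print(manual)
```

Because each row is built from the value sublist alone, the parent names (key, value) never enter the column names, which mirrors what coldepth = 3 does in the call above.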
The parent list names can still be added to the wide data.frame as individual columns by setting options = list(namecols = TRUE):
## bind at depth 3 + include name columns
rrapply(eg, how = "bind", options = list(coldepth = 3, namecols = TRUE))
#> L1 L2 max min
#> 1 1 value 40 31
#> 2 2 value 30 21
Thanks a lot, Joris! It is great now: A very nice improvement!