timelyportfolio/sunburstR

Sequence missing.

Dekermanjian opened this issue · 8 comments

Hi, I am having an issue where the second sequence is being omitted. For example, if the sequence is 1-2-3-4-5, the sequence shown on the graph is 1-3-4-5. This can also be seen in the baseball example provided here: https://github.com/timelyportfolio/sunburstR/blob/master/inst/examples/example_baseball.Rmd

@Dekermanjian, I still don't fully understand. Sometimes this might happen if the hierarchy is malformed. The hyphenated path was not robust enough for me, so in 1.0.0 I switched to prioritizing a d3r conversion into a proper hierarchy on the R side. If you can provide a reproducible dataset or example, I can demonstrate how to do it this way instead. Also, this example might help illustrate this format.
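
For readers following along, here is a minimal sketch of the two input styles mentioned above. The data frame `sequences` and its column names are made up for illustration. The hyphen-separated path is the older input style, and it can break when a label itself contains a hyphen. The `d3_nest()` route builds the hierarchy from one column per level instead.

```r
library(sunburstR)
library(d3r)

# Hypothetical toy data: one row per complete sequence, one column per level.
sequences <- data.frame(
  level1 = c("1", "1", "1"),
  level2 = c("2", "2", "3"),
  level3 = c("3", "4", "4"),
  count  = c(10, 5, 2),
  stringsAsFactors = FALSE
)

# Older style: a hyphen-separated path column followed by a count column.
paths <- data.frame(
  path  = paste(sequences$level1, sequences$level2, sequences$level3, sep = "-"),
  count = sequences$count,
  stringsAsFactors = FALSE
)
sb_paths <- sunburst(paths)

# d3r style: build the nested hierarchy on the R side, then plot it.
tree <- d3_nest(sequences, value_cols = "count")
sb_tree <- sunburst(tree, valueField = "count")
```

Printing `sb_paths` or `sb_tree` at the console renders the widget.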

@Dekermanjian, also, very small slices (< 0.05%) will be removed. See #67, but I don't think this is the problem in your case.
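
One rough way to check whether small slices could be involved is to compute each sequence's share of the total. This is only a sketch: the data is made up, and the 0.05% cutoff is the figure quoted in this thread, so the library's actual rule (see #67) may be defined differently.

```r
# Hypothetical sequence counts.
seqs <- data.frame(
  path  = c("1-2-3", "1-2-4", "1-3-4"),
  count = c(9000, 996, 4),
  stringsAsFactors = FALSE
)

# Share of the total represented by each sequence.
seqs$share <- seqs$count / sum(seqs$count)

# Assumed cutoff: 0.05%, as quoted in this thread.
threshold <- 0.0005

# Sequences small enough that they might be dropped from the chart.
tiny <- seqs[seqs$share < threshold, ]
tiny
```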

@Dekermanjian, email attachments don't come through on GitHub issues. Which baseball example? There are a couple in that Rmd.

@Dekermanjian, if I look at the data and the example, the second row shows up...


Here is a messy way to use d3r to build the hierarchy. I am sure there is a better way, but this is what I came up with most quickly :)

library(sunburstR)
library(pitchRx)
library(dplyr)
library(tidyr)
library(d3r)

# get all data from 2016-08-25
dat <- scrape(start = "2016-08-25", end = "2016-08-25")

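# keep the first runner record per event, then collect each
# half-inning's events into a list-column
action <- dat$runner %>%
  group_by(event_num) %>%
  filter(row_number() == 1) %>%
  ungroup() %>%
  select(gameday_link, inning, inning_side, event) %>%
  # tidyr >= 1.0.0 syntax; older versions used nest(event)
  nest(data = c(event)) %>%
  # turn each nested column of events into a one-row data frame
  # with columns index1, index2, ...
  mutate(data = lapply(data, function(x) {
    y <- t(x)
    colnames(y) <- paste0("index", seq_len(nrow(x)))
    data.frame(y, stringsAsFactors = FALSE)
  }))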

# stack the one-row data frames, count identical sequences,
# build the d3 hierarchy, and plot it
bind_rows(action$data) %>%
  group_by_all() %>%
  summarize(count = n()) %>%
  d3_nest(value_cols = "count") %>%
  sunburst(valueField = "count")
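
Since the pitchRx scrape needs network access, here is a smaller sketch of the same reshaping idea on made-up data. It assumes tidyr >= 1.0.0 for `pivot_wider()`. The `events` data frame and its `id` column are hypothetical stand-ins for the per-half-inning grouping above. Shorter sequences end up with `NA` in the trailing index columns, much as `bind_rows()` leaves them in the code above.

```r
library(dplyr)
library(tidyr)
library(d3r)
library(sunburstR)

# Hypothetical long-format data: one row per event, in order within each id.
events <- data.frame(
  id    = c(1, 1, 1, 2, 2, 3, 3, 3),
  event = c("Single", "Walk", "Strikeout",
            "Single", "Walk",
            "Single", "Walk", "Strikeout"),
  stringsAsFactors = FALSE
)

# Number the events within each id and spread them into index1, index2, ...
wide <- events %>%
  group_by(id) %>%
  mutate(step = paste0("index", row_number())) %>%
  ungroup() %>%
  pivot_wider(names_from = step, values_from = event)

# Count identical sequences; the count becomes the slice size.
counts <- wide %>%
  select(-id) %>%
  count(index1, index2, index3, name = "count")

# Build the hierarchy and the widget; print `sb` to render it.
sb <- counts %>%
  d3_nest(value_cols = "count") %>%
  sunburst(valueField = "count")
```

Inspecting `counts` before plotting is a quick way to confirm that every level of each sequence is present in the data.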