mgirlich/tibblify

Add CI/CD tests with `gctorture2(1, inhibit_release = TRUE)`

Opened this issue · 7 comments

Running the full test suite with gctorture() is prohibitive. I think it might be possible to devise a minimal set of tests, and in these tests to run only the tibblify() call in gctorture() . The following set of tests (to be stored in covr.R) already gives 50% of coverage:

tibblify(list())

spec <- spec_object(
  a = tib_dbl("a")
)
try(tibblify(list(a = 1:2), spec = spec))
try(tibblify(list(), spec = spec))
try(tibblify(list(a = 1, a = 1), spec = spec))
try(tibblify(list(1), spec = spec))
try(tibblify(list(1, b = 1), spec = spec))

spec <- spec_object(
  a = tib_dbl("a", transform = as.numeric)
)
tibblify(list(a = "1"), spec = spec)

spec <- spec_object(
  a = tib_scalar("a", hms::hms(), transform = hms::hms)
)
tibblify(list(a = 1), spec = spec)

I am using the following script (e.g. in run.R) to find the next uncovered item:

library(tidyverse)

r_files <- fs::dir_ls("R", glob = "*.R")
# already has 100% coverage
c_files <- fs::dir_ls("src", glob = "*.c")
files <- c(r_files, c_files)

text <- map(rlang::set_names(files), readLines)
line_exclusions <- map(text, seq_along)

covr <- covr::package_coverage(type = "none", line_exclusions = line_exclusions, code = paste(readLines("covr.R"), collapse = "\n"))
covr::report(covr)

This script runs in just a few seconds. Run this script, find the next uncovered line, devise code that triggers that line, rinse, repeat.

The goal would be to:

  • expand the set of minimal tests to get ~100% coverage for the C++ code
  • further minimize the set of tests keeping full coverage
  • implement and use gctorture_tibblify() in those tests
  • if needed, split the tests into multiple files so that each runs in just a few seconds even with gctorture
  • perhaps skip_if_ci() those tests, or run them in a special step on GHA

Follow-up to #113.

Current iteration, with still 50% coverage, and giving an error with gctorture():

tibblify(list())

spec <- spec_object(
  tib_dbl("a")
)
try(tibblify(list(a = 1:2), spec = spec))
try(tibblify(list(), spec = spec))
try(tibblify(list(a = 1, a = 1), spec = spec))
try(tibblify(list(1), spec = spec))
try(tibblify(list(1, b = 1), spec = spec))

# transform
spec <- spec_object(
  tib_dbl("a", transform = as.numeric)
)
tibblify(list(a = "1"), spec = spec)

# types
spec <- spec_object(
  a = tib_scalar("a", hms::hms(), required = FALSE, transform = hms::hms)
)
tibblify(list(a = 1), spec = spec)
tibblify(list(), spec = spec)
try(tibblify(list(a = 1:2), spec = spec))

spec <- spec_object(
  tib_lgl("a", required = FALSE)
)
tibblify(list(a = TRUE), spec = spec)
tibblify(list(), spec = spec)

spec <- spec_object(
  tib_int("a", required = FALSE)
)
tibblify(list(a = 1L), spec = spec)
tibblify(list(), spec = spec)

spec <- spec_object(
  tib_dbl("a", required = FALSE)
)
tibblify(list(a = 1), spec = spec)
tibblify(list(), spec = spec)

spec <- spec_object(
  tib_chr("a", required = FALSE)
)
tibblify(list(a = "x"), spec = spec)
tibblify(list(), spec = spec)

spec <- spec_object(
  tib_int_vec("a", required = FALSE)
)
tibblify(list(a = 1:2), spec = spec)

spec <- spec_df(
  tib_int("a")
)
tibblify(list(list(a = 1L), list(a = 2L)), spec = spec)

spec <- spec_df(
  tib_int("a")
)
tibblify(list(list(a = 1L), list(a = 2L)), spec = spec)

spec <- spec_df(
  .names_to = "x",
  tib_int("a")
)
tibblify(list(x = list(a = 1L), y = list(a = 2L)), spec = spec)

spec <- spec_df(
  .names_to = "x",
  tib_variant("a")
)
tibblify(list(x = list(a = 1L), y = list(a = "z")), spec = spec)

The following gives an error after four minutes (!) with gctorture2(1, inhibit_release = TRUE)

# types
spec <- spec_object(
  a = tib_scalar("a", hms::hms(), required = FALSE, transform = hms::hms)
)
tibblify(list(a = 1), spec = spec)

It seems that the most time is spent constructing and throwing the error message:

Error in `abort_lossy_cast()`:
! Lossy cast from <character> to <hms> at position(s) 1
Backtrace:
     ▆
  1. ├─base::source("covr2.R", echo = TRUE)
  2. │ ├─base::withVisible(eval(ei, envir))
  3. │ └─base::eval(ei, envir)
  4. │   └─base::eval(ei, envir)
  5. └─tibblify::tibblify(list(a = 1), spec = spec) at covr2.R:5:0
  6.   └─tibblify::tibblify_impl(x, spec) at tibblify/R/tibblify.R:53:2
  7.     └─tibblify `<fn>`()
  8.       └─hms:::vec_cast.hms.character(...)
  9.         └─hms:::abort_lossy_cast(x, to, ..., lossy = lossy)
 10.           └─rlang::abort(...)
Execution halted

Would you be open to using plogr for logging the C++ code?

...and I can confirm that #113 (comment) is indeed the cause for this particular failure.

The alternative - maybe easier - would be to test out a cpp11 implementation and see whether I encounter the same performance issues and if so whether we can fix them. Might be the easier route. What do you think?

How would a cpp11 implementation help? Are you suggesting to replace all PROTECT() and UNPROTECT() calls with cpp11 objects?

I believe a minimal test suite would also be useful in that case.