Develop model_default_odin() for compatibility with vignettes

Question

Develop model_default_odin() for compatibility with vignettes

Opened this issue a month ago · 18 comments

We have an initial example of {odin} implementation of the overlapping interventions vignette on the odin-default-model branch.

The next step will to add the relevant input checking and output formatting so model_default_odin() behaves the same as model_default() (i.e. could swap out in the vignette and user wouldn't notice if both compiled models provided).

Answer 1 · 2024-10-25T22:07:30.000Z

Input checking and output formatting for model_default_odin() are done. I've added test-model_default_odin.R a duplicate of test-model_default.R swapping out model_default() for model_default_odin() for all the test that don't generate a warning or error.

All the failing test are related to your slack comment about allowing vectorised inputs.

All commits are in the odin-default-model branch.

Below is the summary of devtools::test().

Passing tests

✅ Default odin model: basic expectations, scalar arguments
✅ Default odin model: statistical correctness, parameters
✅ Default odin model: contacts interventions and stats. correctness
✅ Default odin model: rate interventions
✅ Default odin model: errors and warnings, scalar arguments
✅ Default odin model: errors on vectorised input

Warning test

⚠️ Default model: vaccination and stats. correctness

DLSODA-  Warning..Internal T (=R1) and H (=R2) are
      such that in the machine, T + H = T on the next step  
     (H = step size). Solver will continue anyway.
In above message, R1 = 200, R2 = 4.47248e-15

Failing tests

🐛 Default model: time dependence

Failure (test-model_default_odin.R:295:3): Default model: time dependence
all(epidemic_size(data_baseline) > epidemic_size(data)) is not TRUE

`actual`:   FALSE
`expected`: TRUE

🐛 Default model: infection parameters as vectors

Failure (test-model_default_odin.R:423:3): Default model: infection parameters as vectors
Expected `model_default_odin(...)` to run without any conditions.
i Actually got a <simpleError> with text:
  Expected a scalar numeric for 'beta'

Error (test-model_default_odin.R:431:3): Default model: infection parameters as vectors
Error in `self$set_user(user = user, unused_user_action = unused_user_action)`: Expected a scalar numeric for 'beta'
Backtrace:
    ▆
 1. └─epidemics::model_default_odin(...) at test-model_default_odin.R:431:3
 2.   └─seirv_model$new(...) at epidemics/R/model_default.R:614:3
 3.     └─odin10ab6dad (local) initialize(...)
 4.       └─self$set_user(user = user, unused_user_action = unused_user_action) at filefb9c3b0b9ae8/R/odin.R:60:7

🐛 Default model: composable elements as lists

Failure (test-model_default_odin.R:483:3): Default model: composable elements as lists
Expected `model_default_odin(uk_population, intervention = npi_list)` to run without any conditions.
i Actually got a <simpleError> with text:
  Expected a scalar numeric for 'beta'

Error (test-model_default_odin.R:488:3): Default model: composable elements as lists
Error in `self$set_user(user = user, unused_user_action = unused_user_action)`: Expected a scalar numeric for 'beta'
Backtrace:
    ▆
 1. └─epidemics::model_default_odin(uk_population, intervention = npi_list) at test-model_default_odin.R:488:3
 2.   └─seirv_model$new(...) at epidemics/R/model_default.R:614:3
 3.     └─odin10ab6dad (local) initialize(...)
 4.       └─self$set_user(user = user, unused_user_action = unused_user_action) at filefb9c3b0b9ae8/R/odin.R:60:7

Failure (test-model_default_odin.R:536:3): Default model: multi-parameter, multi-composables
Expected `model_default_odin(...)` to run without any conditions.
i Actually got a <simpleError> with text:
  Expected a scalar numeric for 'beta'

🐛 Default model: multi-parameter, multi-composables

Failure (test-model_default_odin.R:536:3): Default model: multi-parameter, multi-composables
Expected `model_default_odin(...)` to run without any conditions.
i Actually got a <simpleError> with text:
  Expected a scalar numeric for 'beta'

Error (test-model_default_odin.R:545:3): Default model: multi-parameter, multi-composables
Error in `self$set_user(user = user, unused_user_action = unused_user_action)`: Expected a scalar numeric for 'beta'
Backtrace:
    ▆
 1. └─epidemics::model_default_odin(...) at test-model_default_odin.R:545:3
 2.   └─seirv_model$new(...) at epidemics/R/model_default.R:614:3
 3.     └─odin10ab6dad (local) initialize(...)
 4.       └─self$set_user(user = user, unused_user_action = unused_user_action) at filefb9c3b0b9ae8/R/odin.R:60:7

Answer 2 · 2024-10-25T22:28:25.000Z

epidemics_odin.Rmd is a clone of epidemics.Rmd that compares model_default() with model_default_odin(). The results are equal to within a tolerance of 0.4.

Answer 3 · 2024-10-25T23:00:05.000Z

Nice work, thanks for putting together! Should be fine to close this issue now these have been added, once below point about potential mismatch clarified? Then can move follow up steps to future issues.

A few quick questions just to make sure I'm following the changes:

Does seirv_model in model_default.R compile using {odin} when {epidemics} is first installed? A useful next step for useability could be to precompile the model so it can be loaded with the package, like in the {seir} package (unless I've missed somewhere this is already happening).
Do you have an idea of the speed difference between the two models? Both should be much faster than pure R simulation, but useful to know what we're looking at. Update: I pushed a quick time comparison on epidemics_odin.R, and seems like odin is actually about 10x faster:

Original: 0.629
odin: 0.0612

Am I correct that the vectorisation in the original model_default() is handled with apply() and Map() functions in the R function, rather than in C++ backend? If so, this should be more straightforward to add this feature to model_default_odin() (can make separate issue for this).
We've now got the overlapping interventions vignette covered (and hence the 'getting started with interventions one too'), so it seems that as well as including uncertainty/scenario functionality, the modelling vaccination vignettte would be the next target for additional functionality if we want model_default_odin to be a full copy of model_default from a user point of view. Vaccination would be good one for @alxsrobert to input on, as he's already got several models coded here. Unless I'm missing something obvious we should be looking at first.

Answer 4 · 2024-10-25T23:26:06.000Z

Also noticed that we seem to have a mismatch in the outputs now. E.g. below using the original model_default:

And this when we replace model_default with model_default_odin:

The two outputs match when using the odin model in the modelling_scenarios_odin.Rmd example (which has below output):

Any thoughts on what might have changed? From a quick look, it may be to do with the contact matrix normalisation in the correct vignette odin model, which doesn't seem to appear in model_default_odin?

  contact_matrix_norm <- population$contact_matrix
  contact_matrix_norm <- (contact_matrix_norm / max(Re(eigen(contact_matrix_norm)$values))) /
  population$demography_vector

Answer 5 · 2024-10-26T21:45:29.000Z

Any thoughts on what might have changed? From a quick look, it may be to do with the contact matrix normalisation in the correct vignette odin model, which doesn't seem to appear in model_default_odin?

Contact matrix normalisation code is in .prepare_population() which is called from .check_prepare_args_default() which in turn is called from model_default() and model_default_odin() respectively.

I mirrored the first 122 lines of model_default() in model_default_odin() because

it deals with parameter checking and preparation
didn't want to miss edge cases
keeping it constant means differences are downstream (solvers/output formatting)

Another potential source of the difference can be in the way odin expects its parameters to be supplied.

Answer 6 · 2024-10-26T22:25:57.000Z

Does seirv_model in model_default.R compile using {odin} when {epidemics} is first installed? A useful next step for useability could be to precompile the model so it can be loaded with the package, like in the {seir} package (unless I've missed somewhere this is already happening).

I think from a user's perspective it'll compile on library() because in development it compiles on devtools::load_all() I don't think {seir} uses odin even though it is mention in README.md it doesn't appear in DESCRIPTION or NAMESPACE. They have a /src folder with cpp files which I suspect are generated using {dust} but {dust} is not currently on CRAN. The seirv model is simple enough to compile quickly on library() and after that it's cached.

Am I correct that the vectorisation in the original model_default() is handled with apply() and Map() functions in the R function, rather than in C++ backend? If so, this should be more straightforward to add this feature to model_default_odin() (can make separate issue for this).

Yes. My preference would be to rename this issue and keep all things model_default_odin together here.

Answer 7 · 2024-10-27T03:27:49.000Z

Contact matrix normalisation code is in .prepare_population() which is called from .check_prepare_args_default() which in turn is called from model_default() and model_default_odin() respectively.

I mirrored the first 122 lines of model_default() in model_default_odin() because
* it deals with parameter checking and preparation

* didn't want to miss edge cases

* keeping it constant means differences are downstream (solvers/output formatting)
Another potential source of the difference can be in the way odin expects its parameters to be supplied.

I revisited the tolerance check in epidemics_odin.Rmd to understand where difference might be. This is the entry with maximum difference in the output values, which corresponds to a very large difference in the epidemic dynamics:

> output[4548,]
    time demography_group compartment    value
   <num>           <char>      <char>    <num>
1:   303              40+ susceptible 27816495
> output_odin[4548,]
    time demography_group compartment    value
   <num>           <char>      <char>    <num>
1:   303              40+ susceptible 16413209

Noticed there was a discrepancy in the default R0 for the two models (1.5 vs 1.3 - also relates to issue #253 on misdescribed default in another vignette). Have pushed a fix and now this returns all.equal(output, output_odin)=T. The multiple interventions vignette also now outputting correctly (see below). So maybe we need stricter checks on model output tolerances?

Answer 8 · 2024-10-27T03:31:46.000Z

I think from a user's perspective it'll compile on library() because in development it compiles on devtools::load_all() I don't think {seir} uses odin even though it is mention in README.md it doesn't appear in DESCRIPTION or NAMESPACE. They have a /src folder with cpp files which I suspect are generated using {dust} but {dust} is not currently on CRAN. The seirv model is simple enough to compile quickly on library() and after that it's cached.

Thanks. Yep, {seir} uses odin.dust to compile odin models to dust (e.g. SIR model). But agree that as only {odin} on CRAN, makes sense to focus on this. And having run with fresh install, compilation is nice and quick (actually, seems smoother than the original model_default C++ version).

Yes. My preference would be to rename this issue and keep all things model_default_odin together here.

Sounds good – shall we just rename title and create new to-do list in thread?

Yes please

Answer 9 · 2024-10-28T22:09:59.000Z

Any thoughts on how to fix the warning. I spent all day trying to debug it. No luck yet.

⚠️ Default model: vaccination and stats. correctness

DLSODA-  Warning..Internal T (=R1) and H (=R2) are
     such that in the machine, T + H = T on the next step  
    (H = step size). Solver will continue anyway.
In above message, R1 = 200, R2 = 4.47248e-15

The code below (adapted from the test file test-model_default_odin.R includes everything required to replicate it.

rm(list = ls())
devtools::load_all()

generate_population <- function() {
  # Prepare contact matrix and demography vector
  polymod <- socialmixr::polymod
  contact_data <- socialmixr::contact_matrix(
    polymod,
    countries = "United Kingdom",
    age.limits = c(0, 60),
    symmetric = TRUE
  )
  contact_matrix <- t(contact_data$matrix)
  demography_vector <- contact_data$demography$population

  # make initial conditions - order is important
  initial_conditions <- c(
    S = 1 - 1e-6, E = 0,
    I = 1e-6, R = 0, V = 0
  )
  initial_conditions <- rbind(
    initial_conditions,
    initial_conditions
  )

  # create a population
  population(
    name = "UK population",
    contact_matrix = contact_matrix,
    demography_vector = demography_vector,
    initial_conditions = initial_conditions
  )
}
uk_population <- generate_population()


data_baseline <- model_default(uk_population)
data_baseline_odin <- model_default_odin(uk_population)
all.equal(data_baseline, data_baseline_odin)
## [1] TRUE

all.equal(epidemic_size(data_baseline), epidemic_size(data_baseline_odin))
## [1] "Mean relative difference: 3.632693e-07"


single_vaccination <- vaccination(
  name = "double_vaccination",
  nu = matrix(1e-3, nrow = 2),
  time_begin = matrix(0, nrow = 2),
  time_end = matrix(100, nrow = 2)
)
data <- model_default(uk_population, vaccination = single_vaccination)
data_odin <- model_default_odin(uk_population, vaccination = single_vaccination)
all.equal(data, data_odin)
## [1] "Column 'value': Mean relative difference: 1.898998"

all.equal(epidemic_size(data), epidemic_size(data_odin))
## [1] "Mean relative difference: 0.9805965"



high_rate_vax <- vaccination(
  time_begin = matrix(200, nrow(uk_population$contact_matrix)),
  time_end = matrix(200 + 150, nrow(uk_population$contact_matrix)),
  nu = matrix(0.01, nrow = nrow(uk_population$contact_matrix))
)
data_high_rate_vax <- model_default(
  uk_population,
  vaccination = high_rate_vax, time_end = 600
)
data_high_rate_vax_odin <- model_default_odin(
  uk_population,
  vaccination = high_rate_vax, time_end = 600
)
## Warning messages:
## 1: In lsoda(y, times, func, parms, ...) :
##   an excessive amount of work (> maxsteps ) was done, but integration was not successful - increase maxsteps
## 2: In lsoda(y, times, func, parms, ...) :
##   Returning early. Results are accurate, as far as they go

c(nrow(data_high_rate_vax), nrow(data_high_rate_vax_odin))
## [1] 6010 2010

rows_to_compare <- seq_len(nrow(data_high_rate_vax_odin))

all.equal(
  data_high_rate_vax[rows_to_compare, ],
  data_high_rate_vax_odin[rows_to_compare, ]
)
## [1] "Column 'value': Mean relative difference: 1.666696e-05"

all.equal(
  epidemic_size(data_high_rate_vax[rows_to_compare, ]),
  epidemic_size(data_high_rate_vax_odin[rows_to_compare, ])
)
## [1] "Numeric: lengths (2, 0) differ"

I've also {microbenchmark}ed model_default(slow) v model_default_odin(fast).

microbenchmark::microbenchmark(
  model_default(uk_population),
  model_default_odin(uk_population),
  model_default(uk_population, vaccination = single_vaccination),
  model_default_odin(uk_population, vaccination = single_vaccination)
)
## Unit: milliseconds
##                                                                 expr      min
##                                         model_default(uk_population) 4.819167
##                                    model_default_odin(uk_population) 2.531334
##       model_default(uk_population, vaccination = single_vaccination) 4.622542
##  model_default_odin(uk_population, vaccination = single_vaccination) 2.493667
##        lq     mean   median       uq       max neval
##  4.944188 5.187382 5.014063 5.147542 12.064501   100
##  2.601313 3.541654 2.691938 2.832730 70.017876   100
##  4.787147 4.943179 4.842876 4.941834  8.213376   100
##  2.605500 2.850439 2.681106 2.799334  6.832918   100

Answer 10 · 2024-10-29T07:47:34.000Z

Looks like that's happening because the underlying ODE solver has ended with a very small (~0) time step, likely as the result of a sudden change in dynamics that it's struggling to integrate – maybe from a large influx of vaccination? Strange that it doesn't seem to be an issue for the simple_vaccination example. Tagging @hillalex who may also have some thoughts.

Answer 11 · 2024-11-02T03:42:48.000Z

I spotted the problem by checking the values of vax_nu that were getting passed to {odin} via model_default_odin. Turns out in .check_prepare_args_default, it scales the vaccination rate to give the number of people vaccinated (see below code), whereas in odin model, we're currently implementing as a per-capita vaccination rate (i.e. proportion of S group vaccinated per unit time).

# calculate vax_nu as the NUMBER of vaccinations per timestep
# rather than a fraction/rate. See issue #198 for more details.
vax_nu <- vax_nu * mod_args[["population"]][["demography_vector"]]

So {odin} returns an error because we're removing susceptible at an order of magnitude too high rate, turning the ODEs into a system that can't be accurately solved.

I've pushed an edit to the {odin} model that I think fixes this, i.e. swapping removal rate from vax_rate[i]*S[i] to min(vax_rate[i],S[i]).

Answer 12 · 2024-11-04T21:24:26.000Z

I'm getting a test failure when I tried swapping out model_default() with model_default_odin(). I've pushed a commit to see if this is the same behavior for others. devtools::test() output below.

Failure (test-model_default_odin.R:260:3): Default model: vaccination and stats. correctness
Check on 'data[grepl("susceptible", data$compartment, fixed = TRUE), ]$value' failed: Element 661 is not >= 0
Backtrace:
▆

└─checkmate::expect_numeric(...) at test-model_default_odin.R:260:3

└─checkmate::makeExpectation(x, res, info, label)

[ FAIL 1 | WARN 0 | SKIP 0 | PASS 627 ]

Answer 13 · 2024-11-04T21:28:35.000Z

I've pushed an edit to the {odin} model that I think fixes this, i.e. swapping removal rate from vax_rate[i]*S[i] to min(vax_rate[i],S[i]).

Don't think I understand why min?

We could try passing what the odin model expects instead of what the C++ implementation expects?

Answer 14 · 2024-11-04T22:00:58.000Z

Don't think I understand why min?

We could try passing what the odin model expects instead of what the C++ implementation expects?

The min is to ensure that we don't vaccinate more people than there are susceptibles remaining. Basically there are two main ways to implement vaccination: 1) vaccinate a relative % per day (i.e. vax_rate[i]*S[i]) or 2) vaccinate an absolute number per day (min(vax_rate[i],S[i]). Issue #198 points out that (2) is often more realistic, because we'll have a stock of vaccine we can distribute per day, rather than just aiming to reach a fixed % then stopping each day. So I'm happy to keep this new version for now - but can discuss possible extension options more with @alxsrobert

Answer 15 · 2024-11-04T22:18:53.000Z

I'm getting a test failure when I tried swapping out model_default() with model_default_odin(). I've pushed a commit to see if this is the same behavior for others. devtools::test() output below.

Looks like the issue is a very small amount of overshoot in the solver because min() used (i.e. it removes a large number of vaccinated individuals from S before the solver catches up to fact S is now slightly smaller). So we get a value of -3.5e-7 at the tail end of the epidemic.

Using a scaling like in issue #198 may fix this, but could introduce some solver ineffeciencies.

Answer 16 · 2024-11-06T12:56:54.000Z

In the latest commit model_default_odin() now works with vector parameters. devtools::test() now passes the test below.

beta <- rnorm(10, 1.3 / 7, sd = 0.01)
sigma <- rnorm(10, 0.5, sd = 0.01)
gamma <- rnorm(10, 1 / 7, sd = 0.01)

test_that("Default odin model: infection parameters as vectors", {

Answer 17 · 2024-11-06T22:04:01.000Z

Thanks! I've run through the vignettes using a version of the package with the odin model (i.e. model_default = model_default_odin in model_default.R).
Here's a summary of checks on which ones ran and produced figures that are the same (based on visual check) as the original model vignettes:

Modelling interventions

modelling_interventions.Rmd
modelling_multiple_interventions.Rmd
modelling_rate_interventions.Rmd
Two notes: 1) this wasn't running initially because the dimension of the intervention was transposed in model_default - once this transposition was removed, the vignette ran; 2) the vignette produced a different plot to the original Cpp model for the same reduction in transmission (i.e. intervention had greater impact in {odin} model). Looking more, this is because the current model_default doesn't specify which parameter is being targeted by the rate intervention - so it's passed to odin model to reduce contacts (like other contact interventions). Not sure why they don't give same answer as it should be multiplicative (beta*contacts in the model). But there may be a subtlety in the cpp implementation I'm missing.

Vaccination

modelling_vaccination.Rmd
Note: This runs but vignette quite brief currently - should be addressed in tutorial, so could bring some of this plotting code for comparisons over to vignette. Also noting the numerical issues with modelling linear declines in S above

Time dependence

modelling_time_dependence.Rmd

Advanced modelling

modelling_param_uncertainty.Rmd
modelling_scenarios.Rmd Note: the code runs, but impact of mask interventions are not same as existing vignette (see notes about modelling_rate_interventions above)
finalsize_comparison.Rmd

Package vignettes

epidemics.Rmd
README.Rmd

Other observations

intervention() is an {epidemics} function but also used to define a variable inside the model (i.e. list of input interventions). Probably not an issue from a loaded package point of view, but caught me out while debugging.

Answer 18 · 2024-11-07T06:08:15.000Z

I also noticed that the odin multiplicative interventions weren't consistent with the additive spec in the design principles (see issue #255) so have pushed a fix (which actually has the result of simplifying the odin code – although will still need a bit more thought for the modelling_rate_interventions component).