Develop model_default_odin() for compatibility with vignettes
Opened this issue · 18 comments
We have an initial example of {odin} implementation of the overlapping interventions vignette on the odin-default-model
branch.
The next step will to add the relevant input checking and output formatting so model_default_odin()
behaves the same as model_default()
(i.e. could swap out in the vignette and user wouldn't notice if both compiled models provided).
Input checking and output formatting for model_default_odin()
are done. I've added test-model_default_odin.R a duplicate of test-model_default.R swapping out model_default()
for model_default_odin()
for all the test that don't generate a warning or error.
All the failing test are related to your slack comment about allowing vectorised inputs.
All commits are in the odin-default-model branch.
Below is the summary of devtools::test()
.
Passing tests
✅ Default odin model: basic expectations, scalar arguments
✅ Default odin model: statistical correctness, parameters
✅ Default odin model: contacts interventions and stats. correctness
✅ Default odin model: rate interventions
✅ Default odin model: errors and warnings, scalar arguments
✅ Default odin model: errors on vectorised input
Warning test
DLSODA- Warning..Internal T (=R1) and H (=R2) are
such that in the machine, T + H = T on the next step
(H = step size). Solver will continue anyway.
In above message, R1 = 200, R2 = 4.47248e-15
Failing tests
🐛 Default model: time dependence
Failure (test-model_default_odin.R:295:3): Default model: time dependence
all(epidemic_size(data_baseline) > epidemic_size(data)) is not TRUE
`actual`: FALSE
`expected`: TRUE
🐛 Default model: infection parameters as vectors
Failure (test-model_default_odin.R:423:3): Default model: infection parameters as vectors
Expected `model_default_odin(...)` to run without any conditions.
i Actually got a <simpleError> with text:
Expected a scalar numeric for 'beta'
Error (test-model_default_odin.R:431:3): Default model: infection parameters as vectors
Error in `self$set_user(user = user, unused_user_action = unused_user_action)`: Expected a scalar numeric for 'beta'
Backtrace:
▆
1. └─epidemics::model_default_odin(...) at test-model_default_odin.R:431:3
2. └─seirv_model$new(...) at epidemics/R/model_default.R:614:3
3. └─odin10ab6dad (local) initialize(...)
4. └─self$set_user(user = user, unused_user_action = unused_user_action) at filefb9c3b0b9ae8/R/odin.R:60:7
🐛 Default model: composable elements as lists
Failure (test-model_default_odin.R:483:3): Default model: composable elements as lists
Expected `model_default_odin(uk_population, intervention = npi_list)` to run without any conditions.
i Actually got a <simpleError> with text:
Expected a scalar numeric for 'beta'
Error (test-model_default_odin.R:488:3): Default model: composable elements as lists
Error in `self$set_user(user = user, unused_user_action = unused_user_action)`: Expected a scalar numeric for 'beta'
Backtrace:
▆
1. └─epidemics::model_default_odin(uk_population, intervention = npi_list) at test-model_default_odin.R:488:3
2. └─seirv_model$new(...) at epidemics/R/model_default.R:614:3
3. └─odin10ab6dad (local) initialize(...)
4. └─self$set_user(user = user, unused_user_action = unused_user_action) at filefb9c3b0b9ae8/R/odin.R:60:7
Failure (test-model_default_odin.R:536:3): Default model: multi-parameter, multi-composables
Expected `model_default_odin(...)` to run without any conditions.
i Actually got a <simpleError> with text:
Expected a scalar numeric for 'beta'
🐛 Default model: multi-parameter, multi-composables
Failure (test-model_default_odin.R:536:3): Default model: multi-parameter, multi-composables
Expected `model_default_odin(...)` to run without any conditions.
i Actually got a <simpleError> with text:
Expected a scalar numeric for 'beta'
Error (test-model_default_odin.R:545:3): Default model: multi-parameter, multi-composables
Error in `self$set_user(user = user, unused_user_action = unused_user_action)`: Expected a scalar numeric for 'beta'
Backtrace:
▆
1. └─epidemics::model_default_odin(...) at test-model_default_odin.R:545:3
2. └─seirv_model$new(...) at epidemics/R/model_default.R:614:3
3. └─odin10ab6dad (local) initialize(...)
4. └─self$set_user(user = user, unused_user_action = unused_user_action) at filefb9c3b0b9ae8/R/odin.R:60:7
epidemics_odin.Rmd is a clone of epidemics.Rmd that compares model_default()
with model_default_odin()
. The results are equal to within a tolerance of 0.4.
Nice work, thanks for putting together! Should be fine to close this issue now these have been added, once below point about potential mismatch clarified? Then can move follow up steps to future issues.
A few quick questions just to make sure I'm following the changes:
- Does
seirv_model
inmodel_default.R
compile using {odin} when {epidemics} is first installed? A useful next step for useability could be to precompile the model so it can be loaded with the package, like in the {seir} package (unless I've missed somewhere this is already happening). - Do you have an idea of the speed difference between the two models? Both should be much faster than pure R simulation, but useful to know what we're looking at. Update: I pushed a quick time comparison on epidemics_odin.R, and seems like odin is actually about 10x faster:
Original: 0.629
odin: 0.0612
- Am I correct that the vectorisation in the original
model_default()
is handled withapply()
andMap()
functions in the R function, rather than in C++ backend? If so, this should be more straightforward to add this feature tomodel_default_odin()
(can make separate issue for this). - We've now got the overlapping interventions vignette covered (and hence the 'getting started with interventions one too'), so it seems that as well as including uncertainty/scenario functionality, the modelling vaccination vignettte would be the next target for additional functionality if we want
model_default_odin
to be a full copy ofmodel_default
from a user point of view. Vaccination would be good one for @alxsrobert to input on, as he's already got several models coded here. Unless I'm missing something obvious we should be looking at first.
Also noticed that we seem to have a mismatch in the outputs now. E.g. below using the original model_default
:
And this when we replace model_default
with model_default_odin
:
The two outputs match when using the odin model in the modelling_scenarios_odin.Rmd
example (which has below output):
Any thoughts on what might have changed? From a quick look, it may be to do with the contact matrix normalisation in the correct vignette odin model, which doesn't seem to appear in model_default_odin
?
contact_matrix_norm <- population$contact_matrix
contact_matrix_norm <- (contact_matrix_norm / max(Re(eigen(contact_matrix_norm)$values))) /
population$demography_vector
Any thoughts on what might have changed? From a quick look, it may be to do with the contact matrix normalisation in the correct vignette odin model, which doesn't seem to appear in model_default_odin?
Contact matrix normalisation code is in .prepare_population() which is called from .check_prepare_args_default() which in turn is called from model_default() and model_default_odin() respectively.
I mirrored the first 122 lines of model_default()
in model_default_odin()
because
- it deals with parameter checking and preparation
- didn't want to miss edge cases
- keeping it constant means differences are downstream (solvers/output formatting)
Another potential source of the difference can be in the way odin expects its parameters to be supplied.
Does seirv_model in model_default.R compile using {odin} when {epidemics} is first installed? A useful next step for useability could be to precompile the model so it can be loaded with the package, like in the {seir} package (unless I've missed somewhere this is already happening).
I think from a user's perspective it'll compile on library()
because in development it compiles on devtools::load_all()
I don't think {seir} uses odin even though it is mention in README.md it doesn't appear in DESCRIPTION or NAMESPACE. They have a /src folder with cpp files which I suspect are generated using {dust} but {dust} is not currently on CRAN. The seirv model is simple enough to compile quickly on library()
and after that it's cached.
Am I correct that the vectorisation in the original model_default() is handled with apply() and Map() functions in the R function, rather than in C++ backend? If so, this should be more straightforward to add this feature to model_default_odin() (can make separate issue for this).
Yes. My preference would be to rename this issue and keep all things model_default_odin together here.
Contact matrix normalisation code is in .prepare_population() which is called from .check_prepare_args_default() which in turn is called from model_default() and model_default_odin() respectively.
I mirrored the first 122 lines of
model_default()
inmodel_default_odin()
because* it deals with parameter checking and preparation * didn't want to miss edge cases * keeping it constant means differences are downstream (solvers/output formatting)
Another potential source of the difference can be in the way odin expects its parameters to be supplied.
I revisited the tolerance check in epidemics_odin.Rmd
to understand where difference might be. This is the entry with maximum difference in the output values, which corresponds to a very large difference in the epidemic dynamics:
> output[4548,]
time demography_group compartment value
<num> <char> <char> <num>
1: 303 40+ susceptible 27816495
> output_odin[4548,]
time demography_group compartment value
<num> <char> <char> <num>
1: 303 40+ susceptible 16413209
Noticed there was a discrepancy in the default R0 for the two models (1.5 vs 1.3 - also relates to issue #253 on misdescribed default in another vignette). Have pushed a fix and now this returns all.equal(output, output_odin)=T
. The multiple interventions vignette also now outputting correctly (see below). So maybe we need stricter checks on model output tolerances?
I think from a user's perspective it'll compile on
library()
because in development it compiles ondevtools::load_all()
I don't think {seir} uses odin even though it is mention in README.md it doesn't appear in DESCRIPTION or NAMESPACE. They have a /src folder with cpp files which I suspect are generated using {dust} but {dust} is not currently on CRAN. The seirv model is simple enough to compile quickly onlibrary()
and after that it's cached.
Thanks. Yep, {seir} uses odin.dust to compile odin models to dust (e.g. SIR model). But agree that as only {odin} on CRAN, makes sense to focus on this. And having run with fresh install, compilation is nice and quick (actually, seems smoother than the original model_default
C++ version).
Yes. My preference would be to rename this issue and keep all things model_default_odin together here.
Sounds good – shall we just rename title and create new to-do list in thread?
Yes please
Any thoughts on how to fix the warning. I spent all day trying to debug it. No luck yet.
⚠️ Default model: vaccination and stats. correctnessDLSODA- Warning..Internal T (=R1) and H (=R2) are such that in the machine, T + H = T on the next step (H = step size). Solver will continue anyway. In above message, R1 = 200, R2 = 4.47248e-15
The code below (adapted from the test file test-model_default_odin.R includes everything required to replicate it.
rm(list = ls())
devtools::load_all()
generate_population <- function() {
# Prepare contact matrix and demography vector
polymod <- socialmixr::polymod
contact_data <- socialmixr::contact_matrix(
polymod,
countries = "United Kingdom",
age.limits = c(0, 60),
symmetric = TRUE
)
contact_matrix <- t(contact_data$matrix)
demography_vector <- contact_data$demography$population
# make initial conditions - order is important
initial_conditions <- c(
S = 1 - 1e-6, E = 0,
I = 1e-6, R = 0, V = 0
)
initial_conditions <- rbind(
initial_conditions,
initial_conditions
)
# create a population
population(
name = "UK population",
contact_matrix = contact_matrix,
demography_vector = demography_vector,
initial_conditions = initial_conditions
)
}
uk_population <- generate_population()
data_baseline <- model_default(uk_population)
data_baseline_odin <- model_default_odin(uk_population)
all.equal(data_baseline, data_baseline_odin)
## [1] TRUE
all.equal(epidemic_size(data_baseline), epidemic_size(data_baseline_odin))
## [1] "Mean relative difference: 3.632693e-07"
single_vaccination <- vaccination(
name = "double_vaccination",
nu = matrix(1e-3, nrow = 2),
time_begin = matrix(0, nrow = 2),
time_end = matrix(100, nrow = 2)
)
data <- model_default(uk_population, vaccination = single_vaccination)
data_odin <- model_default_odin(uk_population, vaccination = single_vaccination)
all.equal(data, data_odin)
## [1] "Column 'value': Mean relative difference: 1.898998"
all.equal(epidemic_size(data), epidemic_size(data_odin))
## [1] "Mean relative difference: 0.9805965"
high_rate_vax <- vaccination(
time_begin = matrix(200, nrow(uk_population$contact_matrix)),
time_end = matrix(200 + 150, nrow(uk_population$contact_matrix)),
nu = matrix(0.01, nrow = nrow(uk_population$contact_matrix))
)
data_high_rate_vax <- model_default(
uk_population,
vaccination = high_rate_vax, time_end = 600
)
data_high_rate_vax_odin <- model_default_odin(
uk_population,
vaccination = high_rate_vax, time_end = 600
)
## Warning messages:
## 1: In lsoda(y, times, func, parms, ...) :
## an excessive amount of work (> maxsteps ) was done, but integration was not successful - increase maxsteps
## 2: In lsoda(y, times, func, parms, ...) :
## Returning early. Results are accurate, as far as they go
c(nrow(data_high_rate_vax), nrow(data_high_rate_vax_odin))
## [1] 6010 2010
rows_to_compare <- seq_len(nrow(data_high_rate_vax_odin))
all.equal(
data_high_rate_vax[rows_to_compare, ],
data_high_rate_vax_odin[rows_to_compare, ]
)
## [1] "Column 'value': Mean relative difference: 1.666696e-05"
all.equal(
epidemic_size(data_high_rate_vax[rows_to_compare, ]),
epidemic_size(data_high_rate_vax_odin[rows_to_compare, ])
)
## [1] "Numeric: lengths (2, 0) differ"
I've also {microbenchmark}ed
model_default(slow) v model_default_odin(fast).
microbenchmark::microbenchmark(
model_default(uk_population),
model_default_odin(uk_population),
model_default(uk_population, vaccination = single_vaccination),
model_default_odin(uk_population, vaccination = single_vaccination)
)
## Unit: milliseconds
## expr min
## model_default(uk_population) 4.819167
## model_default_odin(uk_population) 2.531334
## model_default(uk_population, vaccination = single_vaccination) 4.622542
## model_default_odin(uk_population, vaccination = single_vaccination) 2.493667
## lq mean median uq max neval
## 4.944188 5.187382 5.014063 5.147542 12.064501 100
## 2.601313 3.541654 2.691938 2.832730 70.017876 100
## 4.787147 4.943179 4.842876 4.941834 8.213376 100
## 2.605500 2.850439 2.681106 2.799334 6.832918 100
Looks like that's happening because the underlying ODE solver has ended with a very small (~0) time step, likely as the result of a sudden change in dynamics that it's struggling to integrate – maybe from a large influx of vaccination? Strange that it doesn't seem to be an issue for the simple_vaccination
example. Tagging @hillalex who may also have some thoughts.
I spotted the problem by checking the values of vax_nu
that were getting passed to {odin} via model_default_odin
. Turns out in .check_prepare_args_default
, it scales the vaccination rate to give the number of people vaccinated (see below code), whereas in odin model, we're currently implementing as a per-capita vaccination rate (i.e. proportion of S group vaccinated per unit time).
# calculate vax_nu as the NUMBER of vaccinations per timestep
# rather than a fraction/rate. See issue #198 for more details.
vax_nu <- vax_nu * mod_args[["population"]][["demography_vector"]]
So {odin} returns an error because we're removing susceptible at an order of magnitude too high rate, turning the ODEs into a system that can't be accurately solved.
I've pushed an edit to the {odin} model that I think fixes this, i.e. swapping removal rate from vax_rate[i]*S[i]
to min(vax_rate[i],S[i])
.
I'm getting a test failure when I tried swapping out model_default()
with model_default_odin()
. I've pushed a commit to see if this is the same behavior for others. devtools::test()
output below.
Failure (test-model_default_odin.R:260:3): Default model: vaccination and stats. correctness
Check on 'data[grepl("susceptible", data$compartment, fixed = TRUE), ]$value' failed: Element 661 is not >= 0
Backtrace:
▆
- └─checkmate::expect_numeric(...) at test-model_default_odin.R:260:3
- └─checkmate::makeExpectation(x, res, info, label)
[ FAIL 1 | WARN 0 | SKIP 0 | PASS 627 ]
I've pushed an edit to the {odin} model that I think fixes this, i.e. swapping removal rate from
vax_rate[i]*S[i]
tomin(vax_rate[i],S[i])
.
Don't think I understand why min
?
We could try passing what the odin model expects instead of what the C++ implementation expects?
Don't think I understand why
min
?We could try passing what the odin model expects instead of what the C++ implementation expects?
The min is to ensure that we don't vaccinate more people than there are susceptibles remaining. Basically there are two main ways to implement vaccination: 1) vaccinate a relative % per day (i.e. vax_rate[i]*S[i]
) or 2) vaccinate an absolute number per day (min(vax_rate[i],S[i]
). Issue #198 points out that (2) is often more realistic, because we'll have a stock of vaccine we can distribute per day, rather than just aiming to reach a fixed % then stopping each day. So I'm happy to keep this new version for now - but can discuss possible extension options more with @alxsrobert
I'm getting a test failure when I tried swapping out
model_default()
withmodel_default_odin()
. I've pushed a commit to see if this is the same behavior for others.devtools::test()
output below.
Looks like the issue is a very small amount of overshoot in the solver because min() used (i.e. it removes a large number of vaccinated individuals from S before the solver catches up to fact S is now slightly smaller). So we get a value of -3.5e-7
at the tail end of the epidemic.
Using a scaling like in issue #198 may fix this, but could introduce some solver ineffeciencies.
In the latest commit model_default_odin()
now works with vector parameters. devtools::test()
now passes the test below.
beta <- rnorm(10, 1.3 / 7, sd = 0.01)
sigma <- rnorm(10, 0.5, sd = 0.01)
gamma <- rnorm(10, 1 / 7, sd = 0.01)
test_that("Default odin model: infection parameters as vectors", {
Thanks! I've run through the vignettes using a version of the package with the odin model (i.e. model_default = model_default_odin
in model_default.R
).
Here's a summary of checks on which ones ran and produced figures that are the same (based on visual check) as the original model vignettes:
Modelling interventions
- modelling_interventions.Rmd
- modelling_multiple_interventions.Rmd
- modelling_rate_interventions.Rmd
Two notes: 1) this wasn't running initially because the dimension of the intervention was transposed in model_default - once this transposition was removed, the vignette ran; 2) the vignette produced a different plot to the original Cpp model for the same reduction in transmission (i.e. intervention had greater impact in {odin} model). Looking more, this is because the current model_default doesn't specify which parameter is being targeted by the rate intervention - so it's passed to odin model to reduce contacts (like other contact interventions). Not sure why they don't give same answer as it should be multiplicative (beta*contacts in the model). But there may be a subtlety in the cpp implementation I'm missing.
Vaccination
- modelling_vaccination.Rmd
Note: This runs but vignette quite brief currently - should be addressed in tutorial, so could bring some of this plotting code for comparisons over to vignette. Also noting the numerical issues with modelling linear declines in S above
Time dependence
- modelling_time_dependence.Rmd
Advanced modelling
- modelling_param_uncertainty.Rmd
- modelling_scenarios.Rmd Note: the code runs, but impact of mask interventions are not same as existing vignette (see notes about modelling_rate_interventions above)
- finalsize_comparison.Rmd
Package vignettes
- epidemics.Rmd
- README.Rmd
Other observations
- intervention() is an {epidemics} function but also used to define a variable inside the model (i.e. list of input interventions). Probably not an issue from a loaded package point of view, but caught me out while debugging.
I also noticed that the odin multiplicative interventions weren't consistent with the additive spec in the design principles (see issue #255) so have pushed a fix (which actually has the result of simplifying the odin code – although will still need a bit more thought for the modelling_rate_interventions
component).