/microsimulation

R package for microsimulation

Primary LanguageR

Microsimulation package for R

http://img.shields.io/:license-gpl3-blue.svg

News

  • The microsimulation package now allows for in-line use and provides a new SummaryReport class. See the in-line example below for a demonstration of both of these facilities.
  • The C++ application programming interface is now described using Doxygen.

Background

Microsimulation is a form of individual-based stochastic simulation. In continuous time, microsimulation is closely related to discrete event simulation, and in discrete time it is closely related to agent-based models. In econometrics and health care, microsimulation is often used to model policy changes. Our implementation is in continuous time and uses event-based discrete event simulation for the model specification.

The package provides several approaches for microsimulation and event-based, discrete event simulation. The package includes an R implementation of discrete event simulation, building on several R5 classes. This implementation is useful from a pedagogical perspective, but it is slow for larger microsimulations. For speed, we also provide C++ classes for discrete event simulation, building on several well established libraries, including the SSIM C++ discrete event simulation library, the RngStream C library for common random numbers, the Boost libraries for making many C++11 tools available to C++98, and Rcpp for providing glue between R, R’s C API and C++.

We specifically developed this package for modelling the cost-effectiveness of cancer screening, where there are many (e.g. 10^7) individuals who are followed from birth to death. Notably, we provide a complete prostate cancer screening model, including tools for cost-effectiveness analysis.

Installing the package

1 Installation through CRAN:

To install the microsimulation package from CRAN, just run the following in R:

install.packages("microsimulation")

For compiling from source, use steps 2–3.

2 Dependencies:

The microsimulation package requires Rcpp. A convenient, but not required, way of installing github-packages in R is to use devtools. Since both of the dependencies and devtools are available on CRAN just run the following in R.

install.packages("Rcpp")
install.packages("devtools")
3a Installation with devtools:

To install the microsimulation using devtools just run the following in R:

require(devtools)
install_github("mclements/microsimulation")
3b Alternative installation from shell:

If you prefer the shell over devtools, just run the following to download the microsimulation R-package:

git clone https://github.com/mclements/microsimulation.git

To install the microsimulation R-package run this in your shell:

R CMD INSTALL path_to_microsimulation

Running the simulation

Simple examples

require(microsimulation)
sim1 <- callSimplePerson2(n = 1e5)
summary(sim1$events)
#>      state               event          age             number
#>  Healthy:338   toOtherDeath :142   Min.   :  1.00   Min.   :   1.0
#>  Cancer :  0   toCancer     :100   1st Qu.: 37.00   1st Qu.:  74.0
#>  Death  :  0   toCancerDeath: 96   Median : 58.00   Median : 325.0
#>                                    Mean   : 56.98   Mean   : 445.1
#>                                    3rd Qu.: 79.00   3rd Qu.: 721.8
#>                                    Max.   :100.00   Max.   :1722.0
#>
sim2 <- callIllnessDeath(n = 1e5, cure = 0.2)
summary(sim2$prev)
#>      state          age             number
#>  Healthy:200   Min.   :  0.00   Min.   :     1
#>  Cancer :  0   1st Qu.: 25.75   1st Qu.:  2030
#>                Median : 50.50   Median :  8096
#>                Mean   : 50.49   Mean   : 34914
#>                3rd Qu.: 75.25   3rd Qu.: 79029
#>                Max.   :100.00   Max.   :100000
#>

In-line example: Sick-Sicker example from Krijkamp et al (2018)

We have provided a continuous-time re-implementation of the discrete-time Markov model developed by Krijkamp et al (2018).

The example demonstrates:

  • Passing data between R and C++.
  • Development of a SickSicker class in C++ that inherits from ssim::cProcess. This requires the specification of a void init() method and a void handleMessage(const ssim::cMessage* msg) method.
  • Reporting using the new SummaryReport class, which allows for utilities, point and interval costs and discounting. The class also allows for reporting individual costs and utilities to calculate the Monte Carlo error.

For the specific example, the transformation from discrete to continuous time showed that the discrete time formulation could have been improved. In particular, the discrete time formulation assumes no transitions between Healthy and Sicker over one year, while the approximate probability of that event is the one-year probability of moving from Healthy to Sick times the probability of moving from Sick to Sicker. We have included that probability in the transition matrix and using matrix logarithms to calculate the transition probabilities.

# +BEGIN_SRC R :session *R-microsimulation* :results output wrap :exports both :eval never-export
library(expm) # logm
library(Rcpp) # sourceCpp
library(microsimulation) # Rcpp::depends and include files
library(ascii); options(asciiType="org")
## set up the parameters
param <- within(list(), {
    ## Transition probabilities (per cycle) and rates
    p.HD = 0.005 # probability to die when healthy
    p.HS1 = 0.15 # probability to become sick when healthy
    p.S1H = 0.5 # probability to become healthy when sick
    p.S1S2 = 0.105 # probability to become sicker when sick
    rr.S1 = 3 # rate ratio of death when sick vs healthy
    rr.S2 = 10 # rate ratio of death when sicker vs healthy
    r.HD = -log(1-p.HD) # rate of death when healthy
    r.S1D = rr.S1 * r.HD # rate of death when sick
    r.S2D = rr.S2 * r.HD # rate of death when sicker
    p.S1D = 1-exp(-r.S1D) # probability to die when sick
    p.S2D = 1-exp(-r.S2D) # probability to die when sicker
    ## Cost and utility inputs
    c_H = 2000 # cost of remaining one cycle healthy
    c_S1 = 4000 # cost of remaining one cycle sick
    c_S2 = 15000 # cost of remaining one cycle sicker
    c_Trt = 12000 # (additional) cost of treatment (per cycle)
    u_H = 1 # utility when healthy
    u_S1 = 0.75 # utility when sick
    u_S2 = 0.5 # utility when sicker
    u_Trt = 0.95 # utility when sick (as per the code) and being treated
    ## new parameters
    discountRate = 0.03 # discount rate
    partitionBy = 1.0 # partition used in the report
    Trt = FALSE # Treatment?
    debug = FALSE
})
## For converting from discrete to continuous time: *p.HS2 should be non-zero*
param = within(param, { p.HS2 = p.HS1*p.S1S2 })
Pmat = with(param,
            matrix(c(1-p.HD-p.HS1-p.HS2,p.HS1,p.HS2,p.HD,
                     p.S1H,1-p.S1H-p.S1S2-p.S1D,p.S1S2,p.S1D,
                     0,0,1-p.S2D,p.S2D,
                     0,0,0,1), 4, byrow=TRUE))
stopifnot(all(abs(rowSums(Pmat)-1)<10*.Machine$double.eps))
Qmat = expm::logm(Pmat) # matrix logarithm
stopifnot(all(abs(rowSums(Qmat))<10*.Machine$double.eps))
## update the rates in param
param = within(param,
{ r_HS1 = Qmat[1,2]; r_HD = Qmat[1,4]
    r_S1H = Qmat[2,1]; r_S1S2 = Qmat[2,3]; r_S1D = Qmat[2,4]
    r_S2D = Qmat[3,4] })
##
sourceCpp(code="
  // [[Rcpp::depends(BH)]]
  // [[Rcpp::depends(RcppArmadillo)]]
  // [[Rcpp::depends(microsimulation)]]
  #include <microsimulation.h>
  enum state_t {Healthy, Sick, Sicker, Dead};
  enum event_t {toS1, toS2, toH, toD, toEOF};
  typedef ssim::SummaryReport<short,short> Report;
  /**
      Utility: Random exponential using rate parameterisation
  */
  template<class T> double rexpRate(T rate) { return R::rexp(1.0/as<double>(rate)); }
  /**
      Utility: Run a set of simulations for a single process
  */
  void runSimulations(ssim::cProcess* process, int n) {
    for (int i = 0; i < n; i++) {
      ssim::Sim::create_process(process);
      ssim::Sim::run_simulation();
      ssim::Sim::clear();
    }
  }
  /**
      Define a class for the process
  */
  class SickSicker : public ssim::cProcess
  {
  public:
    int id;
    state_t state;
    Rcpp::List param;
    Report *report;
    SickSicker(Rcpp::List param, Report *report) : id(-1), param(param), report(report) {
    }
    void init(); // to be specified
    void handleMessage(const ssim::cMessage* msg); // to be specified
    void cancelEvents(); // utility function
  };
  /**
      Initialise a simulation run for an individual
  */
  void SickSicker::init() {
    id++;
    state = Healthy;
    scheduleAt(rexpRate(param[\"r_HS1\"]),toS1);
    scheduleAt(rexpRate(param[\"r_HD\"]),toD);
    scheduleAt(31.0,toEOF); // end of follow-up
  }
 /**
      Utility to cancel some events
  */
  void SickSicker::cancelEvents() {
    ssim::RemoveKind(toH);
    ssim::RemoveKind(toS1);
    ssim::RemoveKind(toS2);
    ssim::RemoveKind(toD);
  }
  /**
      Handle receiving self-messages
  */
  void SickSicker::handleMessage(const ssim::cMessage* msg) {
    if (param[\"debug\"]) Rprintf(\"id: %i, state: %i, kind: %i, previous: %f, now: %f\\n\",
                       id, state, msg->kind, this->previousEventTime, ssim::now());
    report->add(state, msg->kind, this->previousEventTime, ssim::now(), id);
    switch(msg->kind) {
    case toH:
      state = Healthy;
      report->setUtility(param[\"u_H\"]);
      report->setCost(param[\"c_H\"]);
      cancelEvents();
      scheduleAt(ssim::now() + rexpRate(param[\"r_HS1\"]), toS1);
      scheduleAt(ssim::now() + rexpRate(param[\"r_HD\"]), toD);
      break;
    case toS1:
      state = Sick;
      report->setUtility(param[\"Trt\"] ? param[\"u_Trt\"] : param[\"u_S1\"]);
      report->setCost(param[\"c_S1\"] + (param[\"Trt\"] ? param[\"c_Trt\"] : 0.0));
      cancelEvents();
      scheduleAt(ssim::now() + rexpRate(param[\"r_S1H\"]), toH);
      scheduleAt(ssim::now() + rexpRate(param[\"r_S1S2\"]), toS2);
      scheduleAt(ssim::now() + rexpRate(param[\"r_S1D\"]), toD);
      break;
    case toS2:
      state = Sicker;
      report->setUtility(param[\"u_S2\"]);
      report->setCost(param[\"c_S2\"] + (param[\"Trt\"] ? param[\"c_Trt\"] : 0.0));
      cancelEvents();
      scheduleAt(ssim::now() + rexpRate(param[\"r_S2D\"]), toD);
      break;
    case toD:
    case toEOF:
      ssim::Sim::stop_simulation();
      break;
    default:
      REprintf(\"Invalid kind of event: %i.\\n\", msg->kind);
      break;
    }
    if (id % 10000 == 0) Rcpp::checkUserInterrupt(); /* be polite */
  }
  /**
      Exported function: Set up the report and process, run the simulations and return a report
  */
  //[[Rcpp::export]]
  Rcpp::List callSim(int n, Rcpp::List param, bool indivp = true) {
    Report report(n,indivp);
    report.setPartition(0.0,31.0,param[\"partitionBy\"]);
    report.setDiscountRate(param[\"discountRate\"]);
    SickSicker person(param,&report);
    runSimulations(&person, n);
    Rcpp::List lst = report.asList();
    lst.push_back(param,\"param\");
    return lst;
  }")
simulations = function(n, param, simulator=callSim, indivp=TRUE) {
    object = simulator(n, param, indivp)
    stateT = c("Healthy","Sick","Sicker","Dead")
    eventT = c("toS1", "toS2", "toH", "toD", "toEOF")
    for (name in c("ut","costs","pt","events","prev"))
        object[[name]] = transform(object[[name]], state=stateT[Key+1], Key=NULL)
    object$events = transform(object$events, event=eventT[event+1])
    class(object) = c("SickSicker","SummaryReport")
    object
}
## define a utility function for using system.time with ascii
ascii.proc_time = function(x, include.rownames=FALSE, include.colnames=TRUE, ...)
    ascii(summary(x), include.rownames, include.colnames, ...)
## run the simulations
set.seed(12345)
ascii(system.time(sim1 <- simulations(1e3,
                                      param=modifyList(param, list(Trt = FALSE)))),header=TRUE)
set.seed(12345)
ascii(system.time(sim2 <- simulations(1e3,
                                      param=modifyList(param, list(Trt = TRUE)))),FALSE,FALSE)
cat("\n")
ascii(ICER(sim1,sim2), rownames=c("No treatment","Treatment"),
      caption="Continuous-time Sick-Sicker model for n=10,000 individuals")
usersystemelapsed
0.100.000.10
0.090.000.09
TotalIncremental
Costs(se)QALYs(se)Costs(se)QALYs(se)ICER
No treatment117835.72263.013.1010.160
Treatment223871.94097.813.6580.167106036.11881.90.5580.013190110.5
  • The simulations can also be undertaken in parallel:
# +BEGIN_SRC R :session *R-microsimulation* :results output wrap :exports both :eval never-export

  library(parallel)
  set.seed(12345)
  mcsimulations <- function(n, simulations, ..., mc.cores = getOption("mc.cores", 2L)) {
      n.seg <- diff(c((0:(mc.cores-1))*floor(n/mc.cores),n))
      do.call(rbind, mclapply(n.seg, simulations, ..., mc.cores=mc.cores))
  }
  ascii(system.time(sim1 <- mcsimulations(1e5, simulations, 
                                          param=modifyList(param, list(Trt = FALSE)))),
        header=TRUE)
  set.seed(12345)
  ascii(system.time(sim2 <- mcsimulations(1e5, simulations,
                                          param=modifyList(param, list(Trt = TRUE)))),
        FALSE,FALSE)
  cat("\n")
  ascii(ICER(sim1,sim2),caption="Continuous-time Sick-Sicker model for n=100,000 individuals")
usersystemelapsed
8.250.034.14
4.180.024.20
TotalIncremental
Costs(se)QALYs(se)Costs(se)QALYs(se)ICER
Reference118425.3234.413.1330.016
Treatment225548.4424.313.7200.017107123.1485.70.5870.023182387.6

There was an appreciable difference in the estimates from the discrete-time case and the continuous-time case. This issue warrants further investigation, particularly given that the discrete time case ignores any transitions between Healthy and Sicker within a one-year period.

TotalIncremental
Costs(se)QALYs(se)Costs(se)QALYs(se)ICER
<r>
Reference7599618315.820.016
Treatment14164434316.380.016656481640.5610.001117087
set.seed(12345)
invisible(callSim(5, modifyList(param, list(debug=TRUE))))
set.seed(12345)
invisible(callSim(5, modifyList(param, list(Trt=TRUE, debug=TRUE))))
id: 0, state: 0, kind: 0, previous: 0.000000, now: 1.554306
id: 0, state: 1, kind: 1, previous: 1.554306, now: 1.660357
id: 0, state: 2, kind: 3, previous: 1.660357, now: 2.137922
id: 1, state: 0, kind: 0, previous: 0.000000, now: 22.523279
id: 1, state: 1, kind: 2, previous: 22.523279, now: 23.538790
id: 1, state: 0, kind: 0, previous: 23.538790, now: 25.110528
id: 1, state: 1, kind: 2, previous: 25.110528, now: 25.529621
id: 1, state: 0, kind: 4, previous: 25.529621, now: 31.000000
id: 2, state: 0, kind: 0, previous: 0.000000, now: 1.279403
id: 2, state: 1, kind: 2, previous: 1.279403, now: 2.482209
id: 2, state: 0, kind: 0, previous: 2.482209, now: 4.467754
id: 2, state: 1, kind: 2, previous: 4.467754, now: 5.240185
id: 2, state: 0, kind: 0, previous: 5.240185, now: 6.123519
id: 2, state: 1, kind: 2, previous: 6.123519, now: 7.951863
id: 2, state: 0, kind: 0, previous: 7.951863, now: 10.967305
id: 2, state: 1, kind: 2, previous: 10.967305, now: 11.127479
id: 2, state: 0, kind: 0, previous: 11.127479, now: 15.295999
id: 2, state: 1, kind: 1, previous: 15.295999, now: 16.663526
id: 2, state: 2, kind: 4, previous: 16.663526, now: 31.000000
id: 3, state: 0, kind: 0, previous: 0.000000, now: 0.021088
id: 3, state: 1, kind: 2, previous: 0.021088, now: 1.648953
id: 3, state: 0, kind: 0, previous: 1.648953, now: 1.665451
id: 3, state: 1, kind: 1, previous: 1.665451, now: 2.825812
id: 3, state: 2, kind: 3, previous: 2.825812, now: 3.378253
id: 4, state: 0, kind: 0, previous: 0.000000, now: 1.548766
id: 4, state: 1, kind: 2, previous: 1.548766, now: 4.303079
id: 4, state: 0, kind: 0, previous: 4.303079, now: 5.128089
id: 4, state: 1, kind: 2, previous: 5.128089, now: 5.418233
id: 4, state: 0, kind: 0, previous: 5.418233, now: 9.734630
id: 4, state: 1, kind: 2, previous: 9.734630, now: 11.013894
id: 4, state: 0, kind: 0, previous: 11.013894, now: 26.388802
id: 4, state: 1, kind: 2, previous: 26.388802, now: 26.514306
id: 4, state: 0, kind: 0, previous: 26.514306, now: 27.803678
id: 4, state: 1, kind: 2, previous: 27.803678, now: 28.437418
id: 4, state: 0, kind: 4, previous: 28.437418, now: 31.000000
id: 0, state: 0, kind: 0, previous: 0.000000, now: 1.554306
id: 0, state: 1, kind: 1, previous: 1.554306, now: 1.660357
id: 0, state: 2, kind: 3, previous: 1.660357, now: 2.137922
id: 1, state: 0, kind: 0, previous: 0.000000, now: 22.523279
id: 1, state: 1, kind: 2, previous: 22.523279, now: 23.538790
id: 1, state: 0, kind: 0, previous: 23.538790, now: 25.110528
id: 1, state: 1, kind: 2, previous: 25.110528, now: 25.529621
id: 1, state: 0, kind: 4, previous: 25.529621, now: 31.000000
id: 2, state: 0, kind: 0, previous: 0.000000, now: 1.279403
id: 2, state: 1, kind: 2, previous: 1.279403, now: 2.482209
id: 2, state: 0, kind: 0, previous: 2.482209, now: 4.467754
id: 2, state: 1, kind: 2, previous: 4.467754, now: 5.240185
id: 2, state: 0, kind: 0, previous: 5.240185, now: 6.123519
id: 2, state: 1, kind: 2, previous: 6.123519, now: 7.951863
id: 2, state: 0, kind: 0, previous: 7.951863, now: 10.967305
id: 2, state: 1, kind: 2, previous: 10.967305, now: 11.127479
id: 2, state: 0, kind: 0, previous: 11.127479, now: 15.295999
id: 2, state: 1, kind: 1, previous: 15.295999, now: 16.663526
id: 2, state: 2, kind: 4, previous: 16.663526, now: 31.000000
id: 3, state: 0, kind: 0, previous: 0.000000, now: 0.021088
id: 3, state: 1, kind: 2, previous: 0.021088, now: 1.648953
id: 3, state: 0, kind: 0, previous: 1.648953, now: 1.665451
id: 3, state: 1, kind: 1, previous: 1.665451, now: 2.825812
id: 3, state: 2, kind: 3, previous: 2.825812, now: 3.378253
id: 4, state: 0, kind: 0, previous: 0.000000, now: 1.548766
id: 4, state: 1, kind: 2, previous: 1.548766, now: 4.303079
id: 4, state: 0, kind: 0, previous: 4.303079, now: 5.128089
id: 4, state: 1, kind: 2, previous: 5.128089, now: 5.418233
id: 4, state: 0, kind: 0, previous: 5.418233, now: 9.734630
id: 4, state: 1, kind: 2, previous: 9.734630, now: 11.013894
id: 4, state: 0, kind: 0, previous: 11.013894, now: 26.388802
id: 4, state: 1, kind: 2, previous: 26.388802, now: 26.514306
id: 4, state: 0, kind: 0, previous: 26.514306, now: 27.803678
id: 4, state: 1, kind: 2, previous: 27.803678, now: 28.437418
id: 4, state: 0, kind: 4, previous: 28.437418, now: 31.000000

Limitations of the in-line approach

One limitation for the in-line code is that common random numbers, which are manipulated in C++ and use R’s random number functions, are not available. Common random numbers can be used in a package, which is used by the prostata package.

Extensive use case

For more advance use of the microsimulation framework, please have a look at our prostate cancer natural history model:

https://github.com/mclements/prostata