Exposed and Infected compartments are orders of magnitude higher compared to original code

After a set of changes, the values of the two compartments are more realistic, but still not quite as high as the original code. Why? How can I test differences?

With some fixes, the results obtained clearly show that the new code overestimates the number of exposures and infections after four hundred and forty days.

New code

newCodeResultsNonBDA.xlsx

Original

Democratic Republic of Congo_Simulation_Summary2024-09-06.xlsx

With some fixes, the results obtained clearly show that the new code overestimates the number of exposures and infections after four hundred and forty days.

New code

newCodeResultsNonBDA.xlsx

Original

Democratic Republic of Congo_Simulation_Summary2024-09-06.xlsx

@ashokkrish, ah, yes, I knew I saved the results of the comparison somewhere. Please look at this.

@bryce-carson

I plotted a time-series graph of the Vaccinated compartment alone in newCodeResultsNonBDA.xlsx and this looks so odd.

This definitely is not right!

Can you take a look at the code where a fraction of the population are vaccinated at each timestep?

@bryce-carson

A huge red flag in the plot of newI column reveals some negative values.

This is absolutely wrong.

@bryce-carson

A huge red flag in the plot of newI column reveals some negative values.

This is absolutely wrong.

The newI is calculated as the change in I, actually, but I can easily modify it so it only counts additions to the I compartment.

Please add the following columns to newCodeResultsNonBDA.xlsx

Alpha	Beta	Gamma	Sigma	Delta	Radius	Lambda	Model	DA

I know the parameters are constant right now but in the future we might have time-varying parameters (seasonally varying for certain diseases). Furthermore we need an additional column for whether the seed data was seeded on a single cell (0) or equitabally on the Moore neighbourhood (1) considering there is a radioButton toggle for that.

Please add the following columns to newCodeResultsNonBDA.xlsx
Alpha Beta Gamma Sigma Delta Radius Lambda Model DA

I know the parameters are constant right now but in the future we might have time-varying parameters (seasonally varying for certain diseases). Furthermore we need an additional column for whether the seed data was seeded on a single cell (0) or equitabally on the Moore neighbourhood (1) considering there is a radioButton toggle for that.

Indeed. That does make more sense to me now than it did months ago when I disagreed. Once I finish debugging the refactored code those new features will be much more easily and reliably implemented, so including these columns in the table now makes sense too.

This permalink to these ~50 lines is intended to draw attention to them.

spatialEpisim.foundation/R/SVEIRD.BayesianDataAssimilation.R

Lines 1020 to 1071 in 6dd5f47

    
           living <- terra::subset(layers, "Dead", negate = TRUE) %>% 
        
             terra::app("sum", na.rm = TRUE) %>% 
        
             "names<-"("Living") 
        
           ## NOTE: set the previous timesteps compartment count values in the 
        
           ## summary table before calculating values for the current timestep. 
        
           summaryTable[today, "N"] <- round(terra::global(living, "sum", na.rm = TRUE)) 
        
           summaryTable[today, c("S", "V", "E", "I", "R", "D")] <- 
        
             round(sums <- t(terra::global(layers, "sum", na.rm = TRUE))) 
        
           newVaccinated <- alpha * reclassifyBelowUpperBound(layers$Susceptible, upper = 1) 
        
           proportionSusceptible <- layers$Susceptible / living # alike total-mass action in Episim. 
        
           ## NOTE: Calculate a matrix of weights respecting human mobility patterns. 
        
           transmissionLikelihoods <- 
        
             terra::focal(x = layers$Infected, 
        
                          w = averageEuclideanDistance(lambda, aggregationFactor), 
        
                          na.policy = "omit", 
        
                          fillvalue = 0, 
        
                          fun = "sum", 
        
                          na.rm = TRUE) 
        
           uniqueInfectionLikelihoods <- length(unique(as.vector(transmissionLikelihoods))) 
        
           if (is.nan(sums[, "Infected"])) 
        
             warning("The result of (global) sum total of the Infected compartment is NaN according to terra, so the check that the number of unique infection likelihoods is greater than one is being skipped this iteration. See issue #13.") 
        
           if (all(!is.nan(sums[, "Infected"]), 
        
                   as.numeric(sums[, "Infected"]) >= 0, 
        
                   !(uniqueInfectionLikelihoods > 1))) 
        
             stop("The number of unique likelihoods of transmission is not more than one, indicating an issue generating the transmissionLikelihoods matrix.") 
        
           growth <- beta * proportionSusceptible * transmissionLikelihoods 
        
           ## MAYBE FIXME: there should not be two layers returned; the masking is not as intended. 
        
           newExposed <- terra::mask(if(simulationIsDeterministic) growth else stats::rpois(1, growth), 
        
                                     c(proportionSusceptible < 1, transmissionLikelihoods < 1), 
        
                                     maskvalues = TRUE, 
        
                                     updatevalue = 0) 
        
           newInfected <- reclassifyBelowUpperBound(gamma * sum(c(layers$Exposed, newExposed)), upper = 1) 
        
           ## Some infectious people are going to either recover or die 
        
           infectious <- reclassifyBelowUpperBound(sum(c(layers$Infected, newInfected)), upper = 1) 
        
           newRecovered <- sigma * infectious 
        
           newDead <- reclassifyBelowUpperBound(delta * infectious, upper = 1) 
        
           layers$Susceptible <- sum(c(layers$Susceptible, -1 * newExposed, -1 * newVaccinated), na.rm = TRUE) 
        
           layers$Vaccinated  <- sum(c(layers$Vaccinated, newVaccinated), na.rm = TRUE) 
        
           layers$Exposed     <- sum(c(layers$Exposed, newExposed, -1 * newInfected), na.rm = TRUE) 
        
           layers$Infected    <- sum(c(layers$Infected, newInfected, -1 * newDead, -1 * newRecovered), na.rm = TRUE) 
        
           layers$Recovered   <- sum(c(layers$Recovered, newRecovered), na.rm = TRUE) 
        
           layers$Dead        <- sum(c(layers$Dead, newDead), na.rm = TRUE)

These are the whole of the spatio-temporal SVEIRD simulation loop (when data assimilation is not enabled). I should walk through each step and explain the functions and data structures used in conversation with @ashokkrish when we both have time to meet.

@ashokkrish, I will discuss this comment in our meeting today.

I plotted the newV column from the original code and what I saw showed that the number of vaccinations each day is essentially the same over the entire simulation (decreasing by only eight individuals per day (but not by 8/day/day!)). At the beginning its ~398/day, at the end it is ~390/day.

The rate of vaccination was never the problem, then, but that is what I thought you were referring to. You were, based on the conversation in this issue, referring to the magnitude of the count of vaccinated persons over the period of the simulation; the number was very high, but due to other reasons than the code implementing vaccination of a fraction of individuals. With other fixes having been committed, the actual numbers of vaccinated people would be what you expect, and the count of people vaccinated per day due to the rate of vaccination is unchanged and as expected.

Please add the following columns to newCodeResultsNonBDA.xlsx
Alpha Beta Gamma Sigma Delta Radius Lambda Model DA

I know the parameters are constant right now but in the future we might have time-varying parameters (seasonally varying for certain diseases).

I agree with these changes.

Furthermore we need an additional column for whether the seed data was seeded on a single cell (0) or equitabally on the Moore neighbourhood (1) considering there is a radioButton toggle for that.

This one I am suspect of. It doesn't make sense to me why we would encode that parameter in a column; we could encode it in the filename like so:

yourSimulation_YYYYMMDDT24HOUR60MINUTE_SEED0
yourSimulation_YYYYMMDDT24HOUR60MINUTE_SEED1

An example would be exampleOfFilenamingConvention_20241007T1732_SEED1.

Having a column indicating when data is assimilated would be a fine addition.

There are multiple solutions to the overall consideration of the downloaded data. Those should be further discussed in the relevant issue in the spatialEpisim Shiny application repository.

This permalink to these ~50 lines is intended to draw attention to them.

spatialEpisim.foundation/R/SVEIRD.BayesianDataAssimilation.R

Lines 1020 to 1071 in 6dd5f47

living <- terra::subset(layers, "Dead", negate = TRUE) %>%

terra::app("sum", na.rm = TRUE) %>%

"names<-"("Living")

## NOTE: set the previous timesteps compartment count values in the

## summary table before calculating values for the current timestep.

summaryTable[today, "N"] <- round(terra::global(living, "sum", na.rm = TRUE))

summaryTable[today, c("S", "V", "E", "I", "R", "D")] <-

round(sums <- t(terra::global(layers, "sum", na.rm = TRUE)))

newVaccinated <- alpha * reclassifyBelowUpperBound(layers$Susceptible, upper = 1)

proportionSusceptible <- layers$Susceptible / living # alike total-mass action in Episim.

## NOTE: Calculate a matrix of weights respecting human mobility patterns.

transmissionLikelihoods <-

terra::focal(x = layers$Infected,

w = averageEuclideanDistance(lambda, aggregationFactor),

na.policy = "omit",

fillvalue = 0,

fun = "sum",

na.rm = TRUE)

uniqueInfectionLikelihoods <- length(unique(as.vector(transmissionLikelihoods)))

if (is.nan(sums[, "Infected"]))

warning("The result of (global) sum total of the Infected compartment is NaN according to terra, so the check that the number of unique infection likelihoods is greater than one is being skipped this iteration. See issue #13.")

if (all(!is.nan(sums[, "Infected"]),

as.numeric(sums[, "Infected"]) >= 0,

!(uniqueInfectionLikelihoods > 1)))

stop("The number of unique likelihoods of transmission is not more than one, indicating an issue generating the transmissionLikelihoods matrix.")

growth <- beta * proportionSusceptible * transmissionLikelihoods

## MAYBE FIXME: there should not be two layers returned; the masking is not as intended.

newExposed <- terra::mask(if(simulationIsDeterministic) growth else stats::rpois(1, growth),

c(proportionSusceptible < 1, transmissionLikelihoods < 1),

maskvalues = TRUE,

updatevalue = 0)

newInfected <- reclassifyBelowUpperBound(gamma * sum(c(layers$Exposed, newExposed)), upper = 1)

## Some infectious people are going to either recover or die

infectious <- reclassifyBelowUpperBound(sum(c(layers$Infected, newInfected)), upper = 1)

newRecovered <- sigma * infectious

newDead <- reclassifyBelowUpperBound(delta * infectious, upper = 1)

layers$Susceptible <- sum(c(layers$Susceptible, -1 * newExposed, -1 * newVaccinated), na.rm = TRUE)

layers$Vaccinated <- sum(c(layers$Vaccinated, newVaccinated), na.rm = TRUE)

layers$Exposed <- sum(c(layers$Exposed, newExposed, -1 * newInfected), na.rm = TRUE)

layers$Infected <- sum(c(layers$Infected, newInfected, -1 * newDead, -1 * newRecovered), na.rm = TRUE)

layers$Recovered <- sum(c(layers$Recovered, newRecovered), na.rm = TRUE)

layers$Dead <- sum(c(layers$Dead, newDead), na.rm = TRUE)

These are the whole of the spatio-temporal SVEIRD simulation loop (when data assimilation is not enabled). I should walk through each step and explain the functions and data structures used in conversation with @ashokkrish when we both have time to meet.

spatialEpisim.foundation/R/SVEIRD.BayesianDataAssimilation.R

Lines 1020 to 1075 in 6a5305c

    
           living <- terra::subset(layers, "Dead", negate = TRUE) %>% 
        
             terra::app("sum", na.rm = TRUE) %>% 
        
             "names<-"("Living") 
        
           ## NOTE: set the previous timesteps compartment count values in the 
        
           ## summary table before calculating values for the current timestep. 
        
           summaryTable[today, "N"] <- round(terra::global(living, "sum", na.rm = TRUE)) 
        
           summaryTable[today, c("S", "V", "E", "I", "R", "D")] <- 
        
             round(sums <- t(terra::global(layers, "sum", na.rm = TRUE))) 
        
           newVaccinated <- alpha * reclassifyBelowUpperBound(layers$Susceptible, upper = 1) 
        
           if (aggregationFactor > 1) 
        
             weights <- averageEuclideanDistance(lambda, aggregationFactor) 
        
           else 
        
             weights <- averageEuclideanDistance(lambda) 
        
           transmissionLikelihoods <- 
        
             terra::focal(x = layers$Infected, 
        
                          w = weights, 
        
                          na.policy = "omit", 
        
                          fillvalue = 0, 
        
                          fun = "sum", 
        
                          na.rm = TRUE) 
        
           uniqueInfectionLikelihoods <- length(unique(as.vector(transmissionLikelihoods))) 
        
           if (is.nan(sums[, "Infected"])) 
        
             warning("The result of (global) sum total of the Infected compartment is NaN according to terra, so the check that the number of unique infection likelihoods is greater than one is being skipped this iteration. See issue #13.") 
        
           if (all(!is.nan(sums[, "Infected"]), 
        
                   as.numeric(sums[, "Infected"]) >= 0, 
        
                   !(uniqueInfectionLikelihoods > 1))) 
        
             stop("The number of unique likelihoods of transmission is not more than one, indicating an issue generating the transmissionLikelihoods matrix.") 
        
           proportionSusceptible <- layers$Susceptible / living # alike total-mass action in Episim. 
        
           growth <- beta * proportionSusceptible * transmissionLikelihoods 
        
           newExposed <- terra::mask(if(simulationIsDeterministic) growth else stats::rpois(1, growth), 
        
                                     terra::app(c(proportionSusceptible < 1, transmissionLikelihoods < 1), 
        
                                                fun = "sum", 
        
                                                na.rm = TRUE), 
        
                                     maskvalues = 1, 
        
                                     updatevalue = 0) 
        
           newInfected <- reclassifyBelowUpperBound(gamma * sum(c(layers$Exposed, newExposed)), upper = 1) 
        
           infectious <- reclassifyBelowUpperBound(sum(c(layers$Infected, newInfected)), upper = 1) 
        
           newRecovered <- sigma * infectious 
        
           newDead <- reclassifyBelowUpperBound(delta * infectious, upper = 1) 
        
           layers$Susceptible <- sum(c(layers$Susceptible, -1 * newExposed, -1 * newVaccinated), na.rm = TRUE) 
        
           layers$Vaccinated  <- sum(c(layers$Vaccinated, newVaccinated), na.rm = TRUE) 
        
           layers$Exposed     <- sum(c(layers$Exposed, newExposed, -1 * newInfected), na.rm = TRUE) 
        
           layers$Infected    <- sum(c(layers$Infected, newInfected, -1 * newDead, -1 * newRecovered), na.rm = TRUE) 
        
           layers$Recovered   <- sum(c(layers$Recovered, newRecovered), na.rm = TRUE) 
        
           layers$Dead        <- sum(c(layers$Dead, newDead), na.rm = TRUE)

Discuss the lines as they currently exist, not as they were in that commit.

	living <- terra::subset(layers, "Dead", negate = TRUE) %>%
	terra::app("sum", na.rm = TRUE) %>%
	"names<-"("Living")

	## NOTE: set the previous timesteps compartment count values in the
	## summary table before calculating values for the current timestep.
	summaryTable[today, "N"] <- round(terra::global(living, "sum", na.rm = TRUE))
	summaryTable[today, c("S", "V", "E", "I", "R", "D")] <-
	round(sums <- t(terra::global(layers, "sum", na.rm = TRUE)))

	newVaccinated <- alpha * reclassifyBelowUpperBound(layers$Susceptible, upper = 1)

	proportionSusceptible <- layers$Susceptible / living # alike total-mass action in Episim.

	## NOTE: Calculate a matrix of weights respecting human mobility patterns.
	transmissionLikelihoods <-
	terra::focal(x = layers$Infected,
	w = averageEuclideanDistance(lambda, aggregationFactor),
	na.policy = "omit",
	fillvalue = 0,
	fun = "sum",
	na.rm = TRUE)

	uniqueInfectionLikelihoods <- length(unique(as.vector(transmissionLikelihoods)))
	if (is.nan(sums[, "Infected"]))
	warning("The result of (global) sum total of the Infected compartment is NaN according to terra, so the check that the number of unique infection likelihoods is greater than one is being skipped this iteration. See issue #13.")
	if (all(!is.nan(sums[, "Infected"]),
	as.numeric(sums[, "Infected"]) >= 0,
	!(uniqueInfectionLikelihoods > 1)))
	stop("The number of unique likelihoods of transmission is not more than one, indicating an issue generating the transmissionLikelihoods matrix.")

	growth <- beta * proportionSusceptible * transmissionLikelihoods
	## MAYBE FIXME: there should not be two layers returned; the masking is not as intended.
	newExposed <- terra::mask(if(simulationIsDeterministic) growth else stats::rpois(1, growth),
	c(proportionSusceptible < 1, transmissionLikelihoods < 1),
	maskvalues = TRUE,
	updatevalue = 0)

	newInfected <- reclassifyBelowUpperBound(gamma * sum(c(layers$Exposed, newExposed)), upper = 1)

	## Some infectious people are going to either recover or die
	infectious <- reclassifyBelowUpperBound(sum(c(layers$Infected, newInfected)), upper = 1)
	newRecovered <- sigma * infectious

	newDead <- reclassifyBelowUpperBound(delta * infectious, upper = 1)

	layers$Susceptible <- sum(c(layers$Susceptible, -1 * newExposed, -1 * newVaccinated), na.rm = TRUE)
	layers$Vaccinated <- sum(c(layers$Vaccinated, newVaccinated), na.rm = TRUE)
	layers$Exposed <- sum(c(layers$Exposed, newExposed, -1 * newInfected), na.rm = TRUE)
	layers$Infected <- sum(c(layers$Infected, newInfected, -1 * newDead, -1 * newRecovered), na.rm = TRUE)
	layers$Recovered <- sum(c(layers$Recovered, newRecovered), na.rm = TRUE)
	layers$Dead <- sum(c(layers$Dead, newDead), na.rm = TRUE)