DrylandEcology/rSFSW2

All aggregated output using snow is incorrect for scenarios > ambient

Closed this issue · 2 comments

The function do_OneSite loops over (climate) scenario and removes all scenario-dependent intermediate variables at the beginning of the loop

to_del <- c("temp.yr", "temp.mo", "temp.dy", ...)
to_del <- to_del[to_del %in% ls()]
if (length(to_del) > 0) try(rm(list = to_del), silent = TRUE)

This is because the individual aggregation options, e.g., dailySnowpack, test whether the current iteration's data has already been extracted into the local variable, e.g.,

if (!exists("SWE.dy")) SWE.dy <- get_SWE_dy(runDataSC, isim_time)

with the purpose that this extraction from the rSOILWAT2 object occurs only once and not for each individual aggregation option repeatedly.

This code setup, however, requires that every local variable is removed at the beginning of the scenario loop.

This has not been the case for snow-related variables on the master branch since commit 821d725 (March 5, 2015) when the development branch got merged on which the deletion of the local snow variables (SWE.mo and SWE.dy) was removed with commit 17aa5c2 (March 5, 2015) without explanation.

As of today, commit 873bc99, the following aggregation options use ambient values for SWE.dy and SWE.mo instead of correct scenario values:

  • dailySnowpack all output is affected
  • dailyFrostInSnowfreePeriod all output is affected
  • dailyColdDegreeDays one of two output variables is affected: "ColdDegreeDays.SnowFree.Base.0C.dailyTMean_Cdays_mean"
  • dailySuitablePeriodsDuration all output is affected
  • dailySuitablePeriodsAvailableWater all output is affected
  • dailySuitablePeriodsDrySpells all output is affected
  • dailyThermalDrynessStress a third of the output variables is affected: those related to "Mean10HottestSnowfreeDays_*" and "Mean10ColdestSnowfreeDays_*"
  • monthlySnowpack all output is affected
  • dailyRegeneration_bySWPSnow all output is affected

The following aggregations also use snow-related values, but their output is correct in terms of this bug:

  • because they extracted snow values directly from the runDataSC object instead of SWE.dy:
    • dailyRegeneration_GISSM
  • because they used prcp.dy for snowfall which was correctly removed at the beginning of the scenario loop instead of SWE.dy:
    • yearlyPPT
    • monthlyPlantGrowthControls
    • yearlyWaterBalanceFluxes
  • and mean daily aggregations

No other aggregation seems to be affected by not deleting the scenario-specific local variable.

Update unit test project to include climate scenarios, etc. Check that new DB with different design is ok:

library("DBI")
library("RSQLite")

#--- Check that old and new database contain the same values
# Connect to old and new output database
con1 <- dbConnect(SQLite(), "tests/test_data/0_ReferenceOutput/dbTables_Test4_AllOverallAggregations_snow_v2.6.3.sqlite3")
con2 <- dbConnect(SQLite(), "tests/test_data/0_ReferenceOutput/dbTables_Test4_AllOverallAggregations_snow_v2.7.4.sqlite3")

# Read data from overall aggregation table
aom1 <- dbReadTable(con1, "aggregation_overall_mean")
aom2 <- dbReadTable(con2, "aggregation_overall_mean")

# Subset output from new database to those records that are contained in the old one
runs2 <- dbReadTable(con2, "runs")
ids2 <- runs2[runs2[, "scenario_id"] == 1 & runs2[, "treatment_id"] == 1, "P_id"]
aom2to1 <- aom2[aom2[, "P_id"] %in% ids2, ]

# Calculate differences (without first column of `P_id`s)
diffs <- aom2to1[, -1] - aom1[, -1]
icd <- apply(abs(diffs), 2, sum, na.rm = TRUE) > sqrt(.Machine$double.eps)
icd[is.na(icd)] <- FALSE

# If old and new database are identical, then we expect that 0 columns differ
print(sum(icd))

[1] 0

#--- Check that snowpack varies within sites among climate scenarios --
# unless the variable is constant at zero
icol <- grep("snow", colnames(aom2), value = TRUE, ignore.case = TRUE)
ids2 <- runs2[runs2[, "treatment_id"] == 1, "P_id"]
irow_aom2 <- aom2[, "P_id"] %in% ids2
irow_runs2 <- runs2[, "P_id"] %in% aom2[irow_aom2, "P_id"]
snow_sum <- aggregate(aom2[irow_aom2, icol], by = list(runs2[irow_runs2, "site_id"]), 
  FUN = sum)
snow_var <- aggregate(aom2[irow_aom2, icol], by = list(runs2[irow_runs2, "site_id"]), 
  FUN = var)

has_snow <- apply(snow_sum, 2, sum, na.rm = TRUE) > sqrt(.Machine$double.eps)
icd <- !has_snow | has_snow & apply(snow > sqrt(.Machine$double.eps), 2, any, na.rm = TRUE)

# If climate scenarios vary then snow-related fields must vary too, i.e., 
# we expect that 0 columns no not vary
print(icol[!icd])

character(0)