All aggregated output using snow is incorrect for scenarios > ambient

Question

All aggregated output using snow is incorrect for scenarios > ambient

Closed this issue 7 years ago · 2 comments

The function do_OneSite loops over (climate) scenario and removes all scenario-dependent intermediate variables at the beginning of the loop

to_del <- c("temp.yr", "temp.mo", "temp.dy", ...)
to_del <- to_del[to_del %in% ls()]
if (length(to_del) > 0) try(rm(list = to_del), silent = TRUE)

This is because the individual aggregation options, e.g., dailySnowpack, test whether the current iteration's data has already been extracted into the local variable, e.g.,

if (!exists("SWE.dy")) SWE.dy <- get_SWE_dy(runDataSC, isim_time)

with the purpose that this extraction from the rSOILWAT2 object occurs only once and not for each individual aggregation option repeatedly.

This code setup, however, requires that every local variable is removed at the beginning of the scenario loop.

This has not been the case for snow-related variables on the master branch since commit 821d725 (March 5, 2015) when the development branch got merged on which the deletion of the local snow variables (SWE.mo and SWE.dy) was removed with commit 17aa5c2 (March 5, 2015) without explanation.

As of today, commit 873bc99, the following aggregation options use ambient values for SWE.dy and SWE.mo instead of correct scenario values:

dailySnowpack all output is affected
dailyFrostInSnowfreePeriod all output is affected
dailyColdDegreeDays one of two output variables is affected: "ColdDegreeDays.SnowFree.Base.0C.dailyTMean_Cdays_mean"
dailySuitablePeriodsDuration all output is affected
dailySuitablePeriodsAvailableWater all output is affected
dailySuitablePeriodsDrySpells all output is affected
dailyThermalDrynessStress a third of the output variables is affected: those related to "Mean10HottestSnowfreeDays_*" and "Mean10ColdestSnowfreeDays_*"
monthlySnowpack all output is affected
dailyRegeneration_bySWPSnow all output is affected

The following aggregations also use snow-related values, but their output is correct in terms of this bug:

because they extracted snow values directly from the runDataSC object instead of SWE.dy:
- dailyRegeneration_GISSM
because they used prcp.dy for snowfall which was correctly removed at the beginning of the scenario loop instead of SWE.dy:
- yearlyPPT
- monthlyPlantGrowthControls
- yearlyWaterBalanceFluxes
and mean daily aggregations

Answer 1 · 2018-04-30T16:45:09.000Z

No other aggregation seems to be affected by not deleting the scenario-specific local variable.

Answer 2 · 2018-05-02T17:58:34.000Z

Update unit test project to include climate scenarios, etc. Check that new DB with different design is ok:

library("DBI")
library("RSQLite")

#--- Check that old and new database contain the same values
# Connect to old and new output database
con1 <- dbConnect(SQLite(), "tests/test_data/0_ReferenceOutput/dbTables_Test4_AllOverallAggregations_snow_v2.6.3.sqlite3")
con2 <- dbConnect(SQLite(), "tests/test_data/0_ReferenceOutput/dbTables_Test4_AllOverallAggregations_snow_v2.7.4.sqlite3")

# Read data from overall aggregation table
aom1 <- dbReadTable(con1, "aggregation_overall_mean")
aom2 <- dbReadTable(con2, "aggregation_overall_mean")

# Subset output from new database to those records that are contained in the old one
runs2 <- dbReadTable(con2, "runs")
ids2 <- runs2[runs2[, "scenario_id"] == 1 & runs2[, "treatment_id"] == 1, "P_id"]
aom2to1 <- aom2[aom2[, "P_id"] %in% ids2, ]

# Calculate differences (without first column of `P_id`s)
diffs <- aom2to1[, -1] - aom1[, -1]
icd <- apply(abs(diffs), 2, sum, na.rm = TRUE) > sqrt(.Machine$double.eps)
icd[is.na(icd)] <- FALSE

# If old and new database are identical, then we expect that 0 columns differ
print(sum(icd))

[1] 0

#--- Check that snowpack varies within sites among climate scenarios --
# unless the variable is constant at zero
icol <- grep("snow", colnames(aom2), value = TRUE, ignore.case = TRUE)
ids2 <- runs2[runs2[, "treatment_id"] == 1, "P_id"]
irow_aom2 <- aom2[, "P_id"] %in% ids2
irow_runs2 <- runs2[, "P_id"] %in% aom2[irow_aom2, "P_id"]
snow_sum <- aggregate(aom2[irow_aom2, icol], by = list(runs2[irow_runs2, "site_id"]), 
  FUN = sum)
snow_var <- aggregate(aom2[irow_aom2, icol], by = list(runs2[irow_runs2, "site_id"]), 
  FUN = var)

has_snow <- apply(snow_sum, 2, sum, na.rm = TRUE) > sqrt(.Machine$double.eps)
icd <- !has_snow | has_snow & apply(snow > sqrt(.Machine$double.eps), 2, any, na.rm = TRUE)

# If climate scenarios vary then snow-related fields must vary too, i.e., 
# we expect that 0 columns no not vary
print(icol[!icd])

character(0)