Processors Facilitating Stan Ensemble Runs and Sensitivity Analysis
taliaweiss commented
Running batch/ensemble analyses should happen in two steps (calls to morpho):
- Prepare and analyze data (either real or pseudo), producing posteriors;
- Compute and present metrics that depend on results of the entire ensemble of runs.
For Step 1, a processor should have the following capabilities:
- Calls another processor which randomly selects input values to the data generator from priors, and saves the inputs in the appropriate file - to facilitate computing coverages, etc. (This should be an automatic part of all sensitivity analyses.) The processor could be based on `morpho/morpho/preprocessing/sample_inputs.py` in `morpho1`.
  - If necessary, incorporate an option to compute transformed inputs from sampled inputs, before generation.
- Optionally resets Stan initialization and sampling parameters (like `iter`) based on inputs sampled from priors. This is possible in the `spectrum_analysis` branch of `morpho1`.
- Generates (or loads) and analyzes a specified number of data sets. Could be configured for parallel processing of analysis runs on a cluster, as is the case in `scripts/morpho_models/python_scripts/ensemble_runs.py` (which I am cleaning up at the moment).
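
To make the input-sampling capability concrete, here is a minimal sketch of what such a helper might look like. It is not the existing `sample_inputs.py` code: the `PRIORS` dictionary, the parameter names, the JSON layout, and the function names are all hypothetical, and the only library calls used are NumPy's `Generator` sampling methods.

```python
import json
import numpy as np

# Hypothetical prior specification: parameter name -> (Generator method, keyword arguments).
PRIORS = {
    "mass": ("normal", {"loc": 0.0, "scale": 1.0}),
    "background_rate": ("uniform", {"low": 0.0, "high": 0.5}),
}

def sample_inputs(priors, rng, transform=None):
    """Draw one value per parameter from its prior, then optionally add transformed inputs."""
    inputs = {}
    for name, (dist, kwargs) in priors.items():
        # Look up the sampling method (e.g. rng.normal, rng.uniform) by name.
        inputs[name] = float(getattr(rng, dist)(**kwargs))
    if transform is not None:
        # transform maps the sampled inputs to a dict of derived inputs.
        inputs.update(transform(inputs))
    return inputs

def sample_ensemble_inputs(priors, n_runs, seed, path, transform=None):
    """Sample inputs for every run in the ensemble and save them for later coverage checks."""
    rng = np.random.default_rng(seed)
    runs = [sample_inputs(priors, rng, transform) for _ in range(n_runs)]
    with open(path, "w") as f:
        json.dump({"seed": seed, "inputs": runs}, f, indent=2)
    return runs
```

Storing the seed next to the sampled inputs keeps the ensemble reproducible, and the saved file is what the Step 2 processors would read back when comparing posteriors against the true inputs.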
For Step 2, we should have the following features (processors):
- Compute and report posterior credible intervals and coverages.
- Plot posterior means and credible intervals as a function of inputs for a given parameter. This is often the simplest way to summarize the results of Bayesian inference.
- Compute posterior shrinkages (S) and z-scores (Z); plot S vs. Z for a given parameter. These plots allow one to diagnose overfitting, posterior/prior conflicts, poor model specification, and good model behavior.
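
As a rough illustration of the Step 2 metrics, the sketch below computes equal-tailed credible intervals, ensemble coverage, and the shrinkage and z-score of a single run from arrays of posterior draws. It assumes the commonly used definitions Z = (posterior mean - true value) / posterior standard deviation and S = 1 - posterior variance / prior variance; the function names and the use of equal-tailed (rather than highest-density) intervals are assumptions, not an existing morpho API.

```python
import numpy as np

def credible_interval(draws, level=0.9):
    """Equal-tailed credible interval computed from posterior draws."""
    tail = (1.0 - level) / 2.0
    lo, hi = np.quantile(draws, [tail, 1.0 - tail])
    return float(lo), float(hi)

def coverage(posteriors, true_values, level=0.9):
    """Fraction of runs whose credible interval contains the true input value."""
    hits = []
    for draws, truth in zip(posteriors, true_values):
        lo, hi = credible_interval(draws, level)
        hits.append(lo <= truth <= hi)
    return float(np.mean(hits))

def shrinkage_and_zscore(draws, true_value, prior_sd):
    """Posterior shrinkage S and z-score Z for one parameter in one run."""
    post_mean = np.mean(draws)
    post_var = np.var(draws, ddof=1)
    z = (post_mean - true_value) / np.sqrt(post_var)
    s = 1.0 - post_var / prior_sd**2
    return float(s), float(z)
```

Calling `shrinkage_and_zscore` once per run and scattering the resulting (S, Z) pairs gives the diagnostic plot described above. For a well-calibrated ensemble, `coverage(..., level=0.9)` should come out close to 0.9.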