Notes and next steps from meeting on 10/26/2021 with Khachik

Question


Saeed's notes from meeting on 10/26/2021 with Khachik

@WPringle and @zacharyburnettNOAA see if this note is useful.

using a polynomial for the surrogate might not be the best method for maximum elevations
- because it might be “jumpy” (discontinuous) - perhaps look into time series for PC?
- 3rd order polynomial surrogate might be overfitting because of too many degrees of freedom
- if we can’t use polynomials, we will have to move to a more flexible scheme
  - neural network (don’t get sensitivities for free)
plot map of sensitivities using function provided in document
still need to figure out how to do percentiles
- for high-dimensional random variables (more than 4 dimensions), "there is no such thing as quantile"

quadrature can be more accurate with fewer samples, but could fail for discontinuous parameters
take out the original run from the quadrature fit - otherwise the quadrature will break
repeat the same sample with regression
try KL or PCA
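
To make the quadrature vs. regression note concrete, here is a minimal one-dimensional sketch. Everything in it is an assumption for illustration: the `model` function is a made-up smooth response, the input is a single standard-normal variable, and the expansion uses probabilists' Hermite polynomials from NumPy. It is not the project's actual surrogate code.

```python
from math import factorial

import numpy as np
from numpy.polynomial import hermite_e as He


def model(xi):
    # Hypothetical smooth response to one standard-normal input.
    return np.exp(0.3 * xi)


order = 3
# E[He_k(xi)^2] = k! for probabilists' Hermite polynomials.
norms = np.array([factorial(k) for k in range(order + 1)], dtype=float)

# Quadrature (projection): c_k = E[f(xi) He_k(xi)] / k!
# Only order + 1 model evaluations are needed, at fixed nodes.
nodes, weights = He.hermegauss(order + 1)
weights = weights / np.sqrt(2.0 * np.pi)  # rescale so the weights sum to 1
c_quad = He.hermevander(nodes, order).T @ (weights * model(nodes)) / norms

# Regression: least-squares fit of the same basis to random samples.
rng = np.random.default_rng(3)
xi = rng.standard_normal(200)
c_reg, *_ = np.linalg.lstsq(He.hermevander(xi, order), model(xi), rcond=None)

print("quadrature coefficients:", np.round(c_quad, 4))
print("regression coefficients:", np.round(c_reg, 4))
```

For this smooth toy response the two sets of coefficients should roughly agree, with quadrature using far fewer model evaluations. The sketch says nothing about how either approach behaves for a discontinuous response.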

try validating the surrogate with 50 or so reference samples separate from the training set
- plot the fit against the training data
- if the training set fits well but validation set does NOT fit well, then the surrogate model is overfit
use Equation 6 from doc to go from math space to physical space - "everything should go through Equation 6"
try testing a single mesh node
- pick a single node
- gather maximum elevations
- build surrogate
- compare percentiles
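
The validation and single-node steps above can be sketched as follows. This is only an illustration under stated assumptions: `max_elevation_at_node` is a hypothetical stand-in for one model run at one mesh node, the single input is assumed uniform on [-1, 1], and a plain polynomial least-squares fit stands in for the real surrogate.

```python
import numpy as np
from numpy.polynomial import polynomial as P

rng = np.random.default_rng(1)


def max_elevation_at_node(xi):
    # Hypothetical stand-in for one model run: maximum of a synthetic
    # elevation time series at a single node, for input parameter xi.
    t = np.linspace(0.0, 1.0, 49)
    series = (1.0 + 0.5 * np.tanh(2.0 * xi)) * np.exp(-((t - 0.6) ** 2) / 0.02)
    return series.max()


def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))


# Training set plus 50 reference samples kept separate from it.
xi_train = rng.uniform(-1.0, 1.0, 30)
xi_valid = rng.uniform(-1.0, 1.0, 50)
y_train = np.array([max_elevation_at_node(x) for x in xi_train])
y_valid = np.array([max_elevation_at_node(x) for x in xi_valid])

# Fit surrogates of increasing order and compare training vs. validation error.
errors = {}
for order in (1, 3, 5):
    coeffs = P.polyfit(xi_train, y_train, order)
    errors[order] = (
        rmse(P.polyval(xi_train, coeffs), y_train),
        rmse(P.polyval(xi_valid, coeffs), y_valid),
    )
    print(f"order {order}: train RMSE {errors[order][0]:.2e}, "
          f"validation RMSE {errors[order][1]:.2e}")

# Compare percentiles: sample the order-3 surrogate densely, then set its
# percentiles next to those of the reference runs.
coeffs = P.polyfit(xi_train, y_train, 3)
xi_big = rng.uniform(-1.0, 1.0, 10_000)
levels = [10, 50, 90]
pct_surrogate = np.percentile(P.polyval(xi_big, coeffs), levels)
pct_reference = np.percentile(y_valid, levels)
print("surrogate percentiles:", np.round(pct_surrogate, 3))
print("reference percentiles:", np.round(pct_reference, 3))
```

A training error that keeps dropping while the validation error grows would be the overfitting signature described above.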

perhaps we can ravel time into the node space
- compress time axis with KL eigenvalues
- either way we will have to sparsify nodes
perhaps we can decompose the matrix
- what we need is the covariance of the entire matrix
- aggregate with dask?
sparsify time series
- get 12-hour lead-up time to storm landfall
- every 2 hours
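
One way the time sparsification could look, as a rough sketch. The hourly output times, the landfall time, and the elevation array are all synthetic placeholders, and including the landfall hour itself in the selection is an assumption.

```python
import numpy as np

# Hypothetical hourly model output times and landfall time.
times = np.arange(
    np.datetime64("2000-01-01T00"),
    np.datetime64("2000-01-03T00"),
    np.timedelta64(1, "h"),
)
landfall = np.datetime64("2000-01-02T06")

# Synthetic elevations with shape (n_times, n_nodes).
rng = np.random.default_rng(4)
elev = rng.normal(size=(times.size, 5))

# Hours relative to landfall (negative means before landfall).
hours_to_landfall = (times - landfall) / np.timedelta64(1, "h")

# Keep the 12 hours leading up to landfall, every 2 hours.
keep = (
    (hours_to_landfall >= -12)
    & (hours_to_landfall <= 0)
    & (hours_to_landfall % 2 == 0)
)

sparse_times = times[keep]
sparse_elev = elev[keep]
print(sparse_times)
print(sparse_elev.shape)
```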

Answer 1 · 2021-10-26T16:03:15.000Z

Fold time, lat, and lon together and pass the result to KL or PCA to compress it. Perhaps instead of 5 or 6 modes we may get tens of modes.
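
A minimal sketch of the folding idea, assuming the ensemble is stored as an array with axes (run, time, lat, lon). The data here is synthetic and built from three hidden patterns, the PCA is done with a plain SVD, and the 99% variance threshold is an arbitrary choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic ensemble built from 3 hidden space-time patterns.
n_runs, n_times, n_lat, n_lon = 30, 7, 8, 9
patterns = rng.normal(size=(3, n_times, n_lat, n_lon))
weights = rng.normal(size=(n_runs, 3))
field = np.tensordot(weights, patterns, axes=1)  # (n_runs, n_times, n_lat, n_lon)

# Fold time, lat and lon into one feature axis: (n_runs, n_times * n_lat * n_lon).
X = field.reshape(n_runs, -1)
mean = X.mean(axis=0)

# PCA via SVD of the centered matrix.
U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)

# Number of modes needed to capture 99% of the variance.
explained = np.cumsum(s**2) / np.sum(s**2)
n_modes = int(np.searchsorted(explained, 0.99) + 1)

# Compressed representation: one short score vector per run.
scores = U[:, :n_modes] * s[:n_modes]

# Reconstruct and unfold back to (run, time, lat, lon).
X_approx = mean + scores @ Vt[:n_modes]
field_approx = X_approx.reshape(field.shape)

print("modes kept:", n_modes)
print("scores shape:", scores.shape)
```

With real output the number of retained modes would depend on the data; the synthetic field here is low-rank by construction, so only a few modes are needed.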

Answer 2 · 2021-10-26T17:04:54.000Z