Comparative model evaluation
jtwalsh0 opened this issue · 2 comments
jtwalsh0 commented
Our model evaluations currently look at models independently. We should also compare models. Here are some of the things to look at:
- Correlation matrices between model predictions
- Jaccard similarity and rank-order correlations (see the sketch after this list)
- Webapp should display model accuracy for simple comparison, e.g. sort by accuracy
- Cluster models
- Predict model performance from model characteristics/configurations, e.g. model type (random forest, logistic regression), time-window size, time period, and hyperparameters as features. That could help uncover patterns
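
A minimal sketch of the pairwise comparisons above (rank-order correlation and Jaccard similarity of top-k lists). It assumes a hypothetical DataFrame `scores` with one row per entity and one column of predicted scores per model; the names here are illustrative, not our actual API:

```python
import pandas as pd

def pairwise_rank_correlation(scores: pd.DataFrame) -> pd.DataFrame:
    """Spearman rank-order correlation between every pair of models' scores."""
    return scores.corr(method="spearman")

def jaccard_top_k(scores: pd.DataFrame, k: int) -> pd.DataFrame:
    """Jaccard similarity of the top-k entities selected by each pair of models."""
    top = {m: set(scores[m].nlargest(k).index) for m in scores.columns}
    models = list(scores.columns)
    out = pd.DataFrame(index=models, columns=models, dtype=float)
    for a in models:
        for b in models:
            out.loc[a, b] = len(top[a] & top[b]) / len(top[a] | top[b])
    return out

# Toy example: three models scoring the same five entities
scores = pd.DataFrame(
    {"rf": [0.9, 0.2, 0.7, 0.1, 0.5],
     "logit": [0.8, 0.3, 0.6, 0.2, 0.4],
     "dummy": [0.1, 0.9, 0.2, 0.8, 0.3]},
    index=["e1", "e2", "e3", "e4", "e5"],
)
print(pairwise_rank_correlation(scores))
print(jaccard_top_k(scores, k=2))
```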
jtwalsh0 commented
The webapp should show how stable/unstable model performance is over time.
Within-model evaluation:
- This is an absolute measure: plot precision, recall, ROC AUC, etc. over time (see the sketch below)
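
A minimal sketch of the over-time plot, assuming a hypothetical long-format DataFrame `evals` with columns `model_group`, `evaluation_date`, and one column per metric; the column and function names are illustrative only:

```python
import matplotlib.pyplot as plt
import pandas as pd

def plot_metric_over_time(evals: pd.DataFrame, metric: str = "precision_at_k"):
    """One line per model group, showing how a metric evolves across evaluation dates."""
    fig, ax = plt.subplots()
    for group, df in evals.groupby("model_group"):
        df = df.sort_values("evaluation_date")
        ax.plot(df["evaluation_date"], df[metric], label=str(group))
    ax.set_xlabel("evaluation date")
    ax.set_ylabel(metric)
    ax.legend(title="model group")
    return fig
```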
Between-model evaluation:
- Plot rank-order correlation from one period to the next (i.e. do the same models consistently appear at the top?), maybe Jaccard similarity of the top-k model sets (see the sketch below)
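
A minimal sketch of the period-to-period stability checks, assuming a hypothetical DataFrame `perf` indexed by evaluation date with one column of a performance metric (e.g. precision@k) per model; everything here is an assumption about the data layout, not the webapp's actual schema:

```python
import pandas as pd

def rank_stability(perf: pd.DataFrame) -> pd.Series:
    """Spearman correlation between the model ranking in consecutive periods."""
    ranks = perf.rank(axis=1, ascending=False)
    return pd.Series(
        {later: ranks.loc[earlier].corr(ranks.loc[later], method="spearman")
         for earlier, later in zip(perf.index[:-1], perf.index[1:])}
    )

def top_k_jaccard_over_time(perf: pd.DataFrame, k: int) -> pd.Series:
    """Jaccard similarity of the top-k model set in consecutive periods."""
    tops = {d: set(perf.loc[d].nlargest(k).index) for d in perf.index}
    return pd.Series(
        {later: len(tops[earlier] & tops[later]) / len(tops[earlier] | tops[later])
         for earlier, later in zip(perf.index[:-1], perf.index[1:])}
    )
```

Values near 1 in either series would mean the same models keep winning; a drop would flag the instability the webapp should surface.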
jtwalsh0 commented
This is part of Tyra now