Aggregate local_interactions to estimate shap with interactions
Hi,
Thanks for the package! I was wondering how the variable order is set when calculating the local interactions, and whether there is a way to randomize that order so that the contributions can be measured repeatedly under different orders (giving an estimate closer to what SHAP would output).
I tried passing different variable orders to local_interactions(..., order =), but the result does not change, so I don't know whether I am missing a step.
Script example:
# get the variable names and interactions
tmp <- colnames(X)
tmp <- combn(tmp, m = 2)
tmp <- unlist(lapply(asplit(tmp, MARGIN = 2), paste, collapse = ':'))
varN <- c(colnames(X), tmp)
# create different orders
var_orders <- list()
for (i in 1:5) {
  set.seed(i)
  var_orders[[i]] <- sample(varN)
}
# get the contributions for different orders
res <- list()
i <- 1
for (vo in var_orders){
  # pass each ordering through the `order` argument
  res[[i]] <- local_interactions(new_observation = X[1, ], x = explain_rf, interaction_preference = 10, order = vo)
  i <- i + 1
}
Hi,
I have a minimal example of the change in variable order:
library("DALEX")
library("iBreakDown")
set.seed(1313)
model_titanic_glm <- glm(survived ~ .,
                         data = titanic_imputed, family = "binomial")
explain_titanic_glm <- explain(model_titanic_glm,
                               data = titanic_imputed[, -8],
                               y = titanic_imputed$survived,
                               label = "glm")
bd_glm <- local_interactions(explain_titanic_glm, titanic_imputed[1, ], order=6:1)
bd_glm
bd_glm <- local_interactions(explain_titanic_glm, titanic_imputed[1, ], order=1:6)
bd_glm
bd_glm <- local_interactions(explain_titanic_glm, titanic_imputed[1, ], order=c('age:gender', 'class', 'embarked', 'fare', 'sibsp'))
bd_glm
bd_glm <- local_interactions(explain_titanic_glm, titanic_imputed[1, ], order=c('age:gender', 'embarked:class', 'sibsp:fare'))
bd_glm
Estimating SHAP by repeating the contributions over different orders is possible using the shap() function:
https://modeloriented.github.io/iBreakDown/reference/break_down_uncertainty.html
More on the topic of these methods can be found in the EMA e-book http://ema.drwhy.ai/shapley.html
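To make the idea of averaging over orders concrete, here is a minimal base R sketch. It is not the iBreakDown implementation: the helper `sequential_contributions()`, the `mtcars` linear model, and the choice of `B = 50` orderings are all assumptions made for illustration. For each random ordering, the variables of the observation are fixed one at a time and the change in the mean prediction is recorded; averaging those changes over orderings gives a Shapley-style estimate.

```r
# Hypothetical helper (not part of iBreakDown): sequential contributions
# for one ordering. Variables are fixed to the new observation's values one
# by one, and each step's change in the mean prediction is attributed to
# the variable that was just fixed.
sequential_contributions <- function(model, data, new_obs, ordering) {
  current <- data
  baseline <- mean(predict(model, newdata = current))
  contrib <- setNames(numeric(length(ordering)), ordering)
  for (v in ordering) {
    current[[v]] <- new_obs[[v]]
    updated <- mean(predict(model, newdata = current))
    contrib[v] <- updated - baseline
    baseline <- updated
  }
  contrib
}

set.seed(1)
vars <- c("wt", "hp", "cyl")                      # all variables used by the model
model <- lm(mpg ~ wt * hp + cyl, data = mtcars)   # toy model with an interaction
new_obs <- mtcars[1, ]

# Repeat over B random orderings; reindex by `vars` so rows line up.
B <- 50
runs <- replicate(B, {
  ord <- sample(vars)
  sequential_contributions(model, mtcars, new_obs, ord)[vars]
})

# Average contribution per variable across orderings.
shap_estimate <- rowMeans(runs)
shap_estimate
```

Because every ordering ends with all model variables fixed, the averaged contributions still sum to the difference between the prediction for the observation and the mean prediction over the data.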
Thanks Hubert! I tried your example and it indeed works fine :) However, when passing an order with all variables and possible interactions, I do not get any interaction anymore but only the contributions of single variables. Is it that not all interactions can be passed to the function?
And thanks for pointing to the shap() function! I had been using it but could not find how to calculate SHAP values for interactions with it. This is why I switched to the local_interactions() function.
I believe that each variable can be mentioned only once, e.g. if 'age' is present, then 'age:gender' cannot be used. Additionally, I see that when passing interactions as strings, only one naming convention is accepted, e.g. 'age:gender' and not 'gender:age'.
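The "each variable only once" rule can be checked before calling local_interactions(). The following is a small base R sketch; `order_is_disjoint()` is a hypothetical helper, not a function from iBreakDown, and it checks only for repeated variables, not for the naming convention within a pair.

```r
# Hypothetical helper: TRUE when no variable appears more than once across
# the single terms and the "a:b" interaction terms of an order vector.
order_is_disjoint <- function(order) {
  parts <- unlist(strsplit(order, ":", fixed = TRUE))
  anyDuplicated(parts) == 0
}

# An order from the example above: every variable appears exactly once.
ok_order <- c("age:gender", "class", "embarked", "fare", "sibsp")

# All single variables plus all pairwise interactions: variables repeat.
vars <- c("age", "gender", "class")
all_terms <- c(vars, combn(vars, 2, FUN = paste, collapse = ":"))

order_is_disjoint(ok_order)
order_is_disjoint(all_terms)
```

The first call returns TRUE and the second FALSE, which matches the observation that an order listing every variable together with every pairwise interaction is not handled as intended.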
As for SHAP with interactions, I think that it would be a great feature/method to consider.
I see, thanks Hubert for the clarification! So not all pairwise interactions can be assessed in a single call, and a variable cannot appear both on its own and within an interaction. Supporting that could be a nice feature too :)
Looking forward to the shap interactions!
I think this could remain open