ModelOriented/shapviz

Not compatible with mlr3 package and DALEXtra package

zhangkaicr opened this issue · 6 comments

Thanks to your shapviz package, we can create many beautiful visualizations of SHAP values in the R environment. Since there are so many machine learning algorithms nowadays, many users prefer a framework with a unified interface, such as mlr3. Your shapviz package can construct shapviz objects from the results of the predict_parts() function, and the DALEXtra package can build an explainer for an mlr3 model for the subsequent predict_parts() calculation. However, I found some problems when applying this workflow in practice.

```r
library(mlr3)
library(DALEX)
library(DALEXtra)
library(shapviz)

# Create explainer from the mlr3 model
titanic_imputed$survived <- as.factor(titanic_imputed$survived)
task_classif <- TaskClassif$new(id = "1", backend = titanic_imputed, target = "survived")
learner_classif <- lrn("classif.rpart", predict_type = "prob")
learner_classif$train(task_classif)
exp3 <- explain_mlr3(
  learner_classif,
  data = titanic_imputed,
  y = as.numeric(as.character(titanic_imputed$survived))
)

# Instance-level SHAP decomposition of the model predictions
pred_part3 <- DALEX::predict_parts(
  explainer = exp3,
  new_observation = titanic_imputed[, 1:7],
  type = "shap"
)

# Initialize "shapviz" object
sv <- shapviz(pred_part3)

# SHAP importance plot
sv_importance(sv, kind = "beeswarm")
```


After inspection, we found that the resulting shapviz object contains only one row of SHAP values.
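To verify (a quick check using shapviz's get_shap_values() accessor):

```r
# The SHAP matrix contains a single row, i.e. only one explained observation
dim(get_shap_values(sv))
```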


You are perfectly right: the DALEX/shapviz combo focuses on local explanations via sv_waterfall() and sv_force().
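For the single decomposed row, these work as intended (a minimal illustration using the sv object from above):

```r
# Local explanation plots of the one explained observation
sv_waterfall(sv)
sv_force(sv)
```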

Since you want to visualize SHAP decompositions of multiple observations, you might want to do the actual crunching via {kernelshap}. It offers exact permutation SHAP (feasible when the number of features is not too large) and Kernel SHAP as a strong approximation.

Note: the explanations here are on the probability scale. I usually recommend working on the log-odds scale, which is easily possible by also passing a tailored pred_fun() to permshap()/kernelshap():

```r
library(kernelshap)

# Explain a random sample of 500 observations, using 200 rows as background data
X_explain <- titanic_imputed[sample(nrow(titanic_imputed), 500), ]
ps <- permshap(
  learner_classif,
  X = X_explain,
  bg_X = X_explain[1:200, ],
  feature_names = learner_classif$selected_features()
)
# Or kernelshap() if number of features is larger than ~ 8-10
sv <- shapviz(ps)[[2]]  # for binary classification, choose the second class
sv
```

```r
sv_importance(sv, "bee")
sv_dependence(sv, "age")  # e.g., dependence plot for the "age" feature
```

Same on logit scale:

```r
ps_logit <- permshap(
  learner_classif,
  X = X_explain,
  bg_X = X_explain[1:200, ],
  feature_names = learner_classif$selected_features(),
  pred_fun = function(m, X) qlogis(m$predict_newdata(X)$prob[, 2])
)
# Or kernelshap() if number of features is larger than ~ 8-10
sv_logit <- shapviz(ps_logit)
sv_logit
```

```r
sv_importance(sv_logit, "bee")
sv_dependence(sv_logit, "age")  # e.g., dependence plot for the "age" feature
```

Thank you very much for your answer. I have studied your kernelshap package carefully, and it is indeed a very good solution. But when more than 14 features need to be visualized, is there a way to achieve this?

For calculating SHAP values? kernelshap() works for any number of features, while permshap() gets painfully slow beyond about 10 features.
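A sketch of the corresponding kernelshap() call, which has the same interface as permshap() above:

```r
# Kernel SHAP scales to larger feature sets
ks <- kernelshap(
  learner_classif,
  X = X_explain,
  bg_X = X_explain[1:200, ],
  feature_names = learner_classif$selected_features()
)
sv_ks <- shapviz(ks)[[2]]  # second class, as above
sv_importance(sv_ks, "bee")
```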

This blog post shows that Kernel SHAP is a very good approximation to (exact) permutation SHAP:
https://lorentzen.ch/index.php/2023/11/11/permutation-shap-versus-kernel-shap/

Another option: the {fastshap} package of Brandon Greenwell offers a sampling version of permutation SHAP, and it is compatible with shapviz. But you need to select a relatively large number of permutations (the default is 1, which is obviously too small).
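A rough sketch of how this could look (assuming fastshap >= 0.1.0, where shap_only = FALSE returns a list that shapviz() can consume directly; pred_wrapper is a user-supplied prediction function):

```r
library(fastshap)

# Sampling version of permutation SHAP; nsim must be much larger than the default of 1
shap <- fastshap::explain(
  learner_classif,
  X = X_explain[, 1:7],  # feature columns only
  pred_wrapper = function(object, newdata) object$predict_newdata(newdata)$prob[, 2],
  nsim = 100,
  shap_only = FALSE  # keep feature values and baseline for shapviz()
)
sv_fast <- shapviz(shap)
sv_importance(sv_fast, "bee")
```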

Or do you mean in the plots?

Btw, thanks to your question, I found a way to greatly simplify the internal {mlr3} wrapper of {kernelshap}!