Sum of SHAP values not equal to `pred - mean(pred)` when `exact = TRUE`
dfsnow opened this issue · 2 comments
Hi! Thanks for the great package. I want to clarify a point of confusion I have before proceeding. I found the sample code you posted here and ran it locally. Quick reprex:
```r
library(xgboost)
library(fastshap)
library(SHAPforxgboost)  # to load dataXY_df

y_var <- "diffcwv"
dataX <- as.matrix(dataXY_df[, -..y_var])

# Hyperparameter tuning results
param_list <- list(
  objective = "reg:squarederror",  # for regression
  eta = 0.02,
  max_depth = 10,
  gamma = 0.01,
  subsample = 0.95
)

mod <- xgboost(
  data = dataX, label = as.matrix(dataXY_df[[y_var]]),
  params = param_list, nrounds = 10, verbose = FALSE,
  nthread = parallel::detectCores() - 2, early_stopping_rounds = 8
)

# Grab SHAP values directly from XGBoost
shap <- predict(mod, newdata = dataX, predcontrib = TRUE)

# Compute Shapley values with fastshap
shap2 <- explain(mod, X = dataX, exact = TRUE, adjust = TRUE)

# Compute bias term; difference between predictions and sum of SHAP values
pred <- predict(mod, newdata = dataX)
head(bias <- pred - rowSums(shap2))
#> [1] 0.4174776 0.4174775 0.4174775 0.4174775 0.4174775 0.4174776

# Compare to output from XGBoost
head(shap[, "BIAS"])
#> [1] 0.4174775 0.4174775 0.4174775 0.4174775 0.4174775 0.4174775

# Check that SHAP values sum to the difference between pred and mean(pred)
head(cbind(rowSums(shap2), pred - mean(pred)))
#>             [,1]        [,2]
#> [1,] -0.03048085 -0.03053582
#> [2,] -0.08669319 -0.08674819
#> [3,] -0.05410853 -0.05416352
#> [4,] -0.09465271 -0.09470773
#> [5,] -0.01655553 -0.01661054
#> [6,] -0.01729831 -0.01735327
```
In this code, the sum of the SHAP values is not equal to the difference between `pred` and `mean(pred)` as suggested. Instead, it is (nearly) equal to the `BIAS` term from the `stats::predict(object, X, predcontrib = TRUE, ...)` call in `explain.xgb.Booster` when `exact = TRUE`.
```r
# Compare pred - BIAS from shap2
head(cbind(rowSums(shap2), pred - attributes(shap2)$baseline))
#>             [,1]        [,2]
#> [1,] -0.03048085 -0.03048083
#> [2,] -0.08669319 -0.08669320
#> [3,] -0.05410853 -0.05410853
#> [4,] -0.09465271 -0.09465274
#> [5,] -0.01655553 -0.01655555
#> [6,] -0.01729831 -0.01729828
```
So, two quick questions:

- Should `adjust = TRUE` have the same effect for `exact = TRUE` output as it does for `exact = FALSE` output? In the line above (`explain(mod, X = dataX, exact = TRUE, adjust = TRUE)`), `adjust = TRUE` has no effect; it is simply passed on to the predict method of `xgb.Booster` and silently swallowed. Is this the intended behavior?
- Can you briefly explain the difference between the baseline/bias term (produced by `predict(xgb.Booster, newdata = X, predcontrib = TRUE)` as the last matrix column) and `mean(predictions)`? I scoured the xgboost/lightgbm docs but couldn't find much.
Hi @dfsnow, thanks for the note. Setting `adjust = TRUE` has no effect on the output when using `exact = TRUE`, since the exact SHAP values are already supposed to be additive. I'm not sure why the SHAP values aren't additive here (I get the same issue when using XGBoost directly), so it may be better to ask on the XGBoost issues page. The bias column/term should be the average of all the training predictions (i.e., E(f(x))), which also corresponds to the difference between a particular prediction and the sum of its corresponding Shapley values.
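For anyone landing here later, the two properties under discussion can be checked on a small self-contained example that doesn't require the `SHAPforxgboost` data. This is just a sketch with simulated data (the seed, matrix sizes, and model settings are arbitrary, not from the thread above); it illustrates that additivity holds exactly against the `BIAS` column, while `mean(pred)` is only close to it:

```r
library(xgboost)

# Simulated regression data (hypothetical example, any data works)
set.seed(101)
X <- matrix(rnorm(500 * 4), ncol = 4)
y <- X[, 1] + 0.5 * X[, 2]^2 + rnorm(500, sd = 0.1)

bst <- xgboost(data = X, label = y, nrounds = 10,
               objective = "reg:squarederror", verbose = FALSE)

# Feature contributions; the last column is the BIAS term
contrib <- predict(bst, newdata = X, predcontrib = TRUE)
pred <- predict(bst, newdata = X)

# Additivity: per-feature contributions plus BIAS reproduce the prediction
all.equal(rowSums(contrib), pred, tolerance = 1e-5)

# But the BIAS column is only approximately mean(pred), which is the
# discrepancy observed above
c(bias = contrib[1, "BIAS"], mean_pred = mean(pred))
```

The last line prints the two quantities side by side; on a given run they agree to a few decimal places but are generally not identical, matching the small gaps in the reprex output.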
Interesting. For what it's worth, this issue is also true of LightGBM. I'll make a quick issue on the xgboost repo. Thanks!