is it possible to extract the intercept / bias / base value with fastshap?
marboe123 opened this issue · 6 comments
Hello,
Thank you for the nice package!
Is it also possible to extract the intercept (or ''bias'' or ''base value'') in some way with fastshap?
Or is it possible to calculate these from the shap values as extracted with explain
?
Thanks a lot!
Hi @marboe123, thank you for the note. It is not currently possible to extract the bias term directly, but if you set adjust = TRUE
, then you can ensure that the row sums correspond to f(x) - mean(response); from this, you can obtain the associated bias term if you wish. See ?fastshap::explain
for deets on the adjust
argument. Does that help with your question? Feel free to post a reproducible example if you need more help.
Hi Brandon @bgreenwell , thank you for your explanation.
I think I understand. Although can it be correct that there is no difference between setting adjust = TRUE and adjust = FALSE in the example below?
library(xgboost)
library(vip)
library(fastshap)
library(SHAPforxgboost) #to load the dataX
y_var <- "diffcwv"
dataX <- as.matrix(dataXY_df[,-..y_var])
# hyperparameter tuning results
param_list <- list(objective = "reg:squarederror", # For regression
eta = 0.02,
max_depth = 10,
gamma = 0.01,
subsample = 0.95
)
mod <- xgboost::xgboost(data = dataX,
label = as.matrix(dataXY_df[[y_var]]),
params = param_list, nrounds = 10,
verbose = FALSE, nthread = parallel::detectCores() - 2,
early_stopping_rounds = 8)
# Compute shapley values
shap1 <- explain(mod, X = dataX, exact = TRUE, nsim = 1000, adjust = FALSE)
shap2 <- explain(mod, X = dataX, exact = TRUE, nsim = 1000, adjust = TRUE)
shap1 <- as.data.frame(shap1)
shap2 <- as.data.frame(shap2)
difference <- setdiff(shap1, shap2)
Correct, the adjustment only applies to the sampling procedure that is used when no exact method exists.
I have tried to implement the f(x) - mean(response) below as you mentioned, although I do not succeed to obtain the associated bias term.
library(xgboost)
library(vip)
library(fastshap)
library(SHAPforxgboost) #to load the dataX
y_var <- "diffcwv"
dataX <- as.matrix(dataXY_df[,-..y_var])
# hyperparameter tuning results
param_list <- list(objective = "reg:squarederror", # For regression
eta = 0.02,
max_depth = 10,
gamma = 0.01,
subsample = 0.95
)
mod <- xgboost::xgboost(data = dataX,
label = as.matrix(dataXY_df[[y_var]]),
params = param_list, nrounds = 10,
verbose = FALSE, nthread = parallel::detectCores() - 2,
early_stopping_rounds = 8)
# Compute shapley values
shap2 <- explain(mod, X = dataX, exact = TRUE, nsim = 1000, adjust = TRUE)
row_sums <- rowSums(shap2)
shap2 <- cbind(shap2, row_sums)
pred <- predict(mod, dataX)
meanpred <- mean(dataXY_df$diffcwv)
Hi @marboe123, you're close, but it's the average training prediction, not mean response. I made some adjustments to your example to show how to get the bias term and comfirm the results by comparing directly to XGBoost:
library(xgboost)
library(fastshap)
library(SHAPforxgboost) #to load the dataX
y_var <- "diffcwv"
dataX <- as.matrix(dataXY_df[,-..y_var])
# hyperparameter tuning results
param_list <- list(objective = "reg:squarederror", # For regression
eta = 0.02,
max_depth = 10,
gamma = 0.01,
subsample = 0.95
)
mod <- xgboost(data = dataX, label = as.matrix(dataXY_df[[y_var]]),
params = param_list, nrounds = 10, verbose = FALSE,
nthread = parallel::detectCores() - 2, early_stopping_rounds = 8)
# Grab SHAP values directly from XGBoost
shap <- predict(mod, newdata = dataX, predcontrib = TRUE)
# Compute shapley values
shap2 <- explain(mod, X = dataX, exact = TRUE, adjust = TRUE)
# Compute bias term; difference between predictions and sum of SHAP values
pred <- predict(mod, newdata = dataX)
head(bias <- pred - rowSums(shap2))
# Compare to output from XGBoost
head(shap[, "BIAS"])
# Check that SHAP values sum to the difference between pred and mean(pred)
head(cbind(rowSums(shap2), pred - mean(pred)))
Hi Brandon @bgreenwell , thanks a lot for your explanation and adjusting the code! The concept is much more clear to me now.