NorskRegnesentral/shapr

Make linear model without interactions more efficient

martinju opened this issue · 0 comments

For linear models without interactions, we only need E[X_j|X_S] for each single feature X_j to compute the Shapley values; we never need to estimate E[f(X)|X_S] directly. We should use this to compute Shapley values more efficiently in this special case.
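To spell out the reasoning (the notation here is an assumption for illustration: coefficients β_0, β_j, with x_S the observed values of the features in S), linearity of expectation gives:

```latex
f(x) = \beta_0 + \sum_{j=1}^{p} \beta_j x_j
\quad\Longrightarrow\quad
E[f(X) \mid X_S = x_S]
  = \beta_0 + \sum_{j \in S} \beta_j x_j
  + \sum_{j \notin S} \beta_j \, E[X_j \mid X_S = x_S]
```

So the contribution function for a coalition S only needs the conditional means of the features outside S, not the full conditional distribution.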

When this is combined with approach = "gaussian" (and maybe also approach = "copula"?), we could also bypass the Monte Carlo sampling from the fitted models.
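A minimal sketch of what bypassing the sampling could look like in the Gaussian case, assuming X ~ N(mu, sigma) and a linear model f(x) = beta0 + beta·x. It uses the standard closed-form conditional mean of a multivariate normal. The function name and arguments are hypothetical and not part of shapr's API, and it is written in Python/NumPy purely for illustration.

```python
import numpy as np

def gaussian_linear_contribution(beta0, beta, mu, sigma, x, S):
    """Hypothetical helper: v(S) = E[f(X) | X_S = x_S] for a linear f
    and Gaussian X, computed in closed form (no Monte Carlo sampling)."""
    beta = np.asarray(beta, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    x = np.asarray(x, dtype=float)

    p = beta.size
    S = np.array(sorted(S), dtype=int)          # conditioned-on features
    Sbar = np.setdiff1d(np.arange(p), S)        # remaining features

    if S.size == 0:
        # Nothing observed: fall back to the marginal mean.
        cond_mean = mu.copy()
    else:
        # Observed features keep their values.
        cond_mean = x.copy()
        if Sbar.size > 0:
            # E[X_Sbar | X_S = x_S] = mu_Sbar + Sigma_{Sbar,S} Sigma_{S,S}^{-1} (x_S - mu_S)
            z = np.linalg.solve(sigma[np.ix_(S, S)], x[S] - mu[S])
            cond_mean[Sbar] = mu[Sbar] + sigma[np.ix_(Sbar, S)] @ z

    return float(beta0 + beta @ cond_mean)
```

Each coalition then costs one linear solve instead of a batch of samples plus model predictions.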

Note: Linear models with lower-order interactions also lead to simplifications, as they require E[X_jX_k|X_S] for pairwise interactions, E[X_jX_kX_l|X_S] for triplets, and so on, but that is more tedious to implement.
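For the pairwise case with both X_j and X_k outside S, the needed term follows from the definition of conditional covariance. Under the Gaussian assumption the conditional covariance is also available in closed form (written here with \bar{S} for the complement of S, a notation assumed for illustration):

```latex
E[X_j X_k \mid X_S]
  = \operatorname{Cov}(X_j, X_k \mid X_S)
  + E[X_j \mid X_S] \, E[X_k \mid X_S],
\qquad
\operatorname{Cov}(X_{\bar{S}} \mid X_S)
  = \Sigma_{\bar{S}\bar{S}} - \Sigma_{\bar{S}S} \, \Sigma_{SS}^{-1} \, \Sigma_{S\bar{S}}
```

If one of the two features is in S, it is a known constant and the term reduces to x_j E[X_k|X_S].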