awslabs/sagemaker-debugger

Full shap values

NRauschmayr opened this issue · 2 comments

When retrieving full_shap values, the debugger returns a matrix of shape (number of training samples, number of features) for every feature tensor, e.g.:

for tensor_name in trial.tensor_names(regex='full_shap'):
    tensor = trial.tensor(tensor_name).value(step_num=50)
    print(tensor_name, tensor.shape)

full_shap/bias (26048, 13)
full_shap/f0 (26048, 13)
full_shap/f1 (26048, 13)
full_shap/f10 (26048, 13)
full_shap/f11 (26048, 13)
full_shap/f2 (26048, 13)
full_shap/f3 (26048, 13)
full_shap/f4 (26048, 13)
full_shap/f5 (26048, 13)
full_shap/f6 (26048, 13)
full_shap/f7 (26048, 13)
full_shap/f8 (26048, 13)
full_shap/f9 (26048, 13)
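
Until this is fixed in the hook, the per-feature attribution can still be recovered on the client side by slicing the over-sized matrix. A minimal sketch, continuing from the trial object in the snippet above; the mapping of full_shap/f3 to column 3 of the matrix is my assumption, not something confirmed by the output:

feature_id = 3
full_matrix = trial.tensor("full_shap/f3").value(step_num=50)  # shape (26048, 13)
per_feature = full_matrix[:, feature_id]                       # shape (26048,)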

The expected shape of each tensor would be (26048, 1). The issue happens in this loop:

for feature_id, feature_name in enumerate(feature_names):
    # Bug: the full (num_samples, num_features) matrix is saved for every feature.
    self._save_for_tensor(f"full_shap/{feature_name}", self._full_shap_values)

The following should fix it:

for feature_id, feature_name in enumerate(feature_names):
    # Save only the column belonging to this feature.
    self._save_for_tensor(f"full_shap/{feature_name}", self._full_shap_values[:, feature_id])
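
A minimal standalone sketch of what the one-line change does to the saved shapes, using a random matrix as a stand-in for self._full_shap_values and the 13 columns seen in the output above:

import numpy as np

# Stand-in for self._full_shap_values: one row per training sample,
# one column per feature (including the bias column).
full_shap_values = np.random.rand(26048, 13)

# Current behavior: the whole matrix is saved under every feature name.
print(full_shap_values.shape)                                 # (26048, 13)

# With the fix: only the column for that feature is saved.
feature_id = 0
print(full_shap_values[:, feature_id].shape)                  # (26048,)
# Reshape if a (26048, 1) column vector is preferred.
print(full_shap_values[:, feature_id].reshape(-1, 1).shape)   # (26048, 1)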

Why has this not been fixed yet? It caused me a lot of confusion and wasted time.