DanaJomar/PyALE

Why use moving average for centered ALE 1D?

Closed this issue · 2 comments

Hi, I would like to ask why line 57 of the ALE_1D.aleplot_1D_continuous function uses a moving average instead of the plain average described in the original ALE paper.
Thanks.

    mean_mv_avg = (
        (res_df["eff"] + res_df["eff"].shift(1, fill_value=0)) / 2 * res_df["size"]
    ).sum() / res_df["size"].sum()
    res_df = res_df.sort_index().assign(eff=res_df["eff"] - mean_mv_avg)

I understand that they implemented it exactly as it is done in the R package ALEPlot: https://github.com/cran/ALEPlot/blob/d283def19a5d7f1840f18dfca82c073210df0d81/R/ALEPlot.R#L95C5-L95C54

And the R package was created by Dan Apley, the author of the original ALE paper.
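For reference, the centering constant computed by the snippet quoted in the question can be written out as below. The notation is my own, not taken from the paper or either package: g(z_k) is the uncentered accumulated effect at bin edge z_k, with g(z_0) = 0, n_k is the number of observations in bin k, and K is the number of bins.

```latex
c = \frac{\sum_{k=1}^{K} n_k \, \dfrac{g(z_{k-1}) + g(z_k)}{2}}{\sum_{k=1}^{K} n_k},
\qquad
\hat{f}(z_k) = g(z_k) - c
```

One way to read this, offered as an interpretation rather than a statement of the authors' intent: the effect is only evaluated at bin edges, while the observations in bin k lie between z_{k-1} and z_k, so the average of the two edge values serves as an approximation of the effect for those observations before taking the size-weighted mean.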

Alibi Explain, another Python package, also centers this way: https://github.com/SeldonIO/alibi/blob/bf32cbf0542136ec2d731e0b82b786a0af0997fa/alibi/explainers/ale.py#L535

Dalex just subtracts the mean, from what I gather: https://github.com/ModelOriented/DALEX/blob/2b8a82899b17a0b2d953183bf1ce240f7a79fdf1/python/dalex/dalex/model_explanations/_aggregated_profiles/utils.py#L27
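To make the difference between the two centerings concrete, here is a minimal sketch on made-up numbers. The toy `res_df` layout (accumulated effect at each bin's upper edge plus the bin size) is an assumption that mirrors the snippet in the question, and the "plain" variant is a simple size-weighted mean used as a stand-in for "just subtract the mean"; it is not copied from DALEX.

```python
import pandas as pd

# Toy data (hypothetical): accumulated, uncentered effect at each bin's
# upper edge, and the number of observations falling in each bin.
res_df = pd.DataFrame(
    {"eff": [1.0, 3.0, 6.0], "size": [2, 3, 5]},
    index=[0.5, 1.0, 1.5],
)

# Effect at the previous bin edge; the first bin starts from an effect of 0.
prev_edge = res_df["eff"].shift(1, fill_value=0)

# Centering as in the quoted snippet: average the two edge values of each
# bin, then take the size-weighted mean of those averages.
mean_mv_avg = ((res_df["eff"] + prev_edge) / 2 * res_df["size"]).sum() / res_df["size"].sum()

# Plain alternative (assumed here): size-weighted mean of the upper-edge values.
mean_plain = (res_df["eff"] * res_df["size"]).sum() / res_df["size"].sum()

centered_mv = res_df["eff"] - mean_mv_avg
centered_plain = res_df["eff"] - mean_plain

print(mean_mv_avg, mean_plain)
```

On this toy data the two constants differ (about 2.95 versus 4.1), so the two centered curves have the same shape but are shifted vertically relative to each other.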

Yes, so only this package used a moving average rather than the expected sample-mean subtraction. Either way, we decided to amend this package's code in our project.