sibylhe/mmm_stan

MAPE problem with photo

Emnlv opened this issue · 3 comments

Emnlv commented

Hi Sibyl.

This formula:

def mean_absolute_percentage_error(y_true, y_pred): 
    y_true, y_pred = np.array(y_true), np.array(y_pred)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

I have noticed that when I pass the same values as DataFrame columns instead of as arrays, the result is completely different. I have attached real examples that use your example data.

To explain in more detail, I will focus on the calculation inside the function mmm_decompose_contrib().
Given mc_tt["y_true"], mc_tt["y_pred"] = y_true2, y_pred, this expression:
np.mean(np.abs((np.array(mc_tt["y_true"]) - np.array(mc_tt["y_pred"]))/np.array(mc_tt["y_true"]))) * 100
gives a completely different result from np.mean(np.abs((np.array(y_true2) - np.array(y_pred)) / np.array(y_true2))) * 100. Your code uses this last form, at lines 552 and 553:

print('mape (multiplicative model): ', 
         mean_absolute_percentage_error(y_true2, y_pred))

I have checked it step by step. The output of this step:
np.abs((np.array(y_true2) - np.array(y_pred)) / np.array(y_true2)) is not clear to me, because it is a combination of arrays rather than a single array.
Here is the example.
[screenshot: first]
In this case, the final MAPE is 20%.

By contrast, the output of this:
np.abs((np.array(mc_tt["y_true"]) - np.array(mc_tt["y_pred"]))/np.array(mc_tt["y_true"])) * 100 is a single array, which makes sense to me, because the mean is then calculated over it.
[screenshot: second]

In this second case, the MAPE is 11%.
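One possible explanation, which is only an assumption on my side and not something I have verified in your code, is a shape mismatch: if one input has shape (n,) and the other has shape (n, 1), NumPy broadcasts the subtraction into an (n, n) matrix, and the mean is then taken over all n*n entries. A minimal sketch with made-up numbers (not the repository data; the shapes are hypothetical):

```python
import numpy as np

def mean_absolute_percentage_error(y_true, y_pred):
    y_true, y_pred = np.array(y_true), np.array(y_pred)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

# Made-up values; the shapes are an assumption about what y_true2 / y_pred look like.
y_true = np.array([100.0, 200.0, 400.0])        # shape (3,)
y_pred = np.array([[110.0], [180.0], [400.0]])  # shape (3, 1)

# (3,) minus (3, 1) broadcasts to (3, 3): every true value against every prediction.
errors_2d = np.abs((y_true - y_pred) / y_true)
print(errors_2d.shape)  # (3, 3)

# Flattening the prediction restores the element-wise comparison.
errors_1d = np.abs((y_true - y_pred.ravel()) / y_true)
print(errors_1d.shape)  # (3,)

mape_broadcast = mean_absolute_percentage_error(y_true, y_pred)
mape_flat = mean_absolute_percentage_error(y_true, y_pred.ravel())
print(mape_broadcast, mape_flat)  # the two values differ
```

If this is what happens, it would also explain why the DataFrame version gives a single array: each column holds one value per row, so the comparison stays element-wise.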

If this is an error, I suggest that inside the function mmm_decompose_contrib() you put the data required for the MAPE into mc_df and then pass those columns to the MAPE function, as you did here for the first model:
print('mape: ', mean_absolute_percentage_error(df['sales'], df['base_sales']))
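As an alternative to going through the DataFrame, here is a minimal sketch of a more defensive helper that flattens both inputs before comparing them. The name mape_1d is hypothetical and not part of the repository, and it assumes both inputs are meant to be one value per observation:

```python
import numpy as np

def mape_1d(y_true, y_pred):
    """MAPE in percent, computed after flattening both inputs to 1-D.

    Hypothetical helper, not from the repository.
    """
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    # Fail loudly instead of letting NumPy broadcast mismatched inputs.
    if y_true.shape != y_pred.shape:
        raise ValueError(f"length mismatch: {y_true.shape} vs {y_pred.shape}")
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

# Made-up values: a flat list against a column-shaped (3, 1) list.
result = mape_1d([100.0, 200.0, 400.0], [[110.0], [180.0], [400.0]])
print(result)
```

With this version, a column-shaped prediction and a flat one give the same result, and inputs of different lengths raise an error instead of silently producing a matrix.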

sibylhe commented

Sorry, I was busy last week. I hope I can check it this weekend.

Emnlv commented

Don’t worry Sibyl. I will wait for it, and thanks for the support.

Emnlv commented

Hi Sibyl,
Any news?
I ask because I have seen what may be the same problem in the third model, in the function evaluate_hill_model.