Jacob-
We have daily sales data for the last four years, and we'd love some insight into how to schedule our salespeople. I certainly can't find any pattern in the data, but maybe you can with your 'machine learning' stuff? Let me know if you have any insights.
Thanks.
-Tim
- Explore the dataset
- If possible, create a model which can predict store sales to inform staffing decisions
- Plot the sales predictions against the actual sales in the test set
- Python 3.9.7
- matplotlib.pyplot
- matplotlib.ticker
- numpy
- pandas
- seaborn
- holiday from pandas.tseries
- RandomForestRegressor from sklearn.ensemble
- permutation_importance from sklearn.inspection
- acf, pacf from statsmodels.tsa.stattools
- plot_acf, plot_pacf from statsmodels.graphics.tsaplots
(Original source unknown)
- Decreased monthly staffing expenses by > 36%.
- Increased salesperson satisfaction and retention by ensuring adequate staffing on busy days.
- The model can be used for staffing decisions ~6 weeks into the future.
- When spikes in sales volume are predicted, ~40% should be added to that predicted number when scheduling sales reps.
- The general contour of the predictions closely matches the actual sales, suggesting that when a spike is predicted, more sales reps should be scheduled, even if the actual size of the spike isn't perfectly accurate.
- The model tends to underestimate sales spikes by ~40%, so if one rep can handle ~$1,000 in daily sales and a spike of $3,000 is predicted, plan for ~$4,200 in sales (the prediction plus 40%) and schedule 4 reps.
- The hybrid model's predictive accuracy declines significantly 100+ days into the future, so long-term hiring decisions are better informed by the simple linear model.
- For further study:
  - Why is there a 50-day lagging trend in sales?
  - Can I get the same predictive power with only the lagging data, to make sure there's no information leakage in the model?
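
The spike adjustment in the list above is simple arithmetic, and a minimal sketch of it might look like the following. The function names, the `uplift_pct` and `sales_per_rep` parameters, and the rounding options are assumptions made for illustration; they are not taken from the notebook. Only the ~40% uplift and the ~$1,000-per-rep figure come from the findings above.

```python
import math


def adjusted_sales(predicted_sales, uplift_pct=40):
    """Scale a predicted spike up by the model's typical underestimate (~40%)."""
    return predicted_sales * (100 + uplift_pct) / 100


def reps_to_schedule(predicted_sales, sales_per_rep=1000, uplift_pct=40, round_up=False):
    """Convert a predicted sales spike into a rep count (hypothetical helper).

    round_up=False rounds to the nearest whole rep;
    round_up=True always rounds up, the more conservative choice.
    """
    reps = adjusted_sales(predicted_sales, uplift_pct) / sales_per_rep
    return math.ceil(reps) if round_up else round(reps)


if __name__ == "__main__":
    print(adjusted_sales(3000))                   # adjusted sales figure
    print(reps_to_schedule(3000))                 # nearest whole rep
    print(reps_to_schedule(3000, round_up=True))  # conservative staffing
```

For the $3,000 example, the adjusted figure is $4,200, or 4.2 reps' worth of sales. Rounding to the nearest rep gives the 4 reps quoted above, while rounding up would schedule 5. Which rule to use depends on how costly understaffing is compared with an extra shift.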
Here's the notebook for your perusal, fully annotated.