/forecasting-time-series

Forecasting Walmart Inventory for CA, TX, and WI in Python



Walmart Sales Forecasting


Author: Chelsea Zaloumis

Last update: 3/19/2021


Background


Based on the M5 Forecasting Kaggle competition: https://www.kaggle.com/c/m5-forecasting-accuracy/overview

Can you estimate, as precisely as possible, the point forecasts of the unit sales of various products sold in the USA by Walmart?

This project helped me better understand time series and forecasting problems.

Exploration


The original data is available from the Kaggle competition linked above. It contains item sales by state.

Functions for exploring the data are in src/helper_functions.py, with example usage in src/0_eda.ipynb. The examples there drill down into each state's stores, departments, and items.
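As a rough illustration of that kind of drill-down, here is a minimal pandas sketch. It is not the code in src/helper_functions.py: the `drill_down` helper is hypothetical, and the tiny table is made-up data that only mimics the wide M5 layout (ID columns plus one `d_N` column per day).

```python
import pandas as pd

# Tiny synthetic stand-in for the M5 sales table (wide format: one row per
# item/store pair, one column per day). All values are made up.
sales = pd.DataFrame({
    "item_id":  ["FOODS_1_001", "FOODS_1_001", "HOBBIES_1_001", "HOBBIES_1_001"],
    "dept_id":  ["FOODS_1", "FOODS_1", "HOBBIES_1", "HOBBIES_1"],
    "store_id": ["CA_1", "TX_1", "CA_1", "CA_2"],
    "state_id": ["CA", "TX", "CA", "CA"],
    "d_1": [3, 1, 0, 2],
    "d_2": [4, 0, 1, 2],
    "d_3": [2, 2, 0, 5],
})


def drill_down(df, state, level):
    """Daily unit sales for one state, summed per `level`.

    Hypothetical helper: `level` is one of "store_id", "dept_id", "item_id".
    """
    day_cols = [c for c in df.columns if c.startswith("d_")]
    subset = df[df["state_id"] == state]
    return subset.groupby(level)[day_cols].sum()


ca_by_store = drill_down(sales, "CA", "store_id")
ca_by_dept = drill_down(sales, "CA", "dept_id")
print(ca_by_store)
print(ca_by_dept)
```

Passing a different `level` switches between store, department, and item views without changing the filtering logic.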

Initial Modeling


I compared three regression models out of the box on forecasting each state's monthly sales: Linear Regression, Random Forest, and Gradient Boosting. Linear Regression outperformed both Random Forest and Gradient Boosting for all three states. California monthly sales forecasted with Linear Regression achieved an R2 score of 0.37 and an RMSE of 734.07.

Texas monthly sales forecasted with Linear Regression achieved an R2 score of 0.56 and an RMSE of 373.16.

Wisconsin monthly sales forecasted with Linear Regression achieved an R2 score of 0.74 and an RMSE of 549.4.