
Digital_foundry_demand_forcasting

In tune with conventional big data and data science practitioners' line of thought, causal analysis was initially the only approach considered for our demand forecasting effort and was applied across the entire product portfolio. Experience dictates that not all data are the same: each group of data shows different patterns depending on how the products were sold and supported over the product life cycle. One-methodology-fits-all is very appealing from an implementation point of view, but in practice one must consider solutions for the varying needs of the different product types in our portfolio, such as new products (both evolutionary and revolutionary), niche products, high-growth products and more. With this backdrop, we have evolved a solution which segments the product portfolio into quadrants and then matches a series of algorithms to each quadrant instead of one methodology for all. The technology stack is simulated/mocked data (Hadoop ecosystem) > Azure ML with R/Python > Zeppelin.
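As a purely illustrative sketch of this quadrant-based routing (the quadrant labels, thresholds and per-quadrant method choices below are assumptions, not the logic implemented in this repository), the dispatch in R might look like:

```r
# Hypothetical sketch: segment products into quadrants by volume and growth,
# then route each quadrant to a different forecasting approach.
library(forecast)

assign_quadrant <- function(volume, growth,
                            vol_cut = median(volume), gr_cut = 0) {
  ifelse(volume >= vol_cut & growth >= gr_cut, "high_volume_growing",
  ifelse(volume >= vol_cut & growth <  gr_cut, "high_volume_declining",
  ifelse(volume <  vol_cut & growth >= gr_cut, "niche_growing",
                                               "niche_declining")))
}

forecast_by_quadrant <- function(y, quadrant, h = 13) {
  # y: weekly sales as a ts object; h: forecast horizon in weeks
  switch(quadrant,
         high_volume_growing   = forecast(stlm(y), h = h),      # decomposition-based
         high_volume_declining = forecast(auto.arima(y), h = h),
         niche_growing         = forecast(ets(y), h = h),
         niche_declining       = croston(y, h = h))             # intermittent demand
}
```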

Overview

The modeling system is designed to build complete models with price, promotion and competitive terms, and is flexible enough for exploratory and ad hoc modeling. The POC model is based on an R implementation in the RStudio® IDE and Azure ML Studio, and is published in the Cortana Gallery: https://gallery.cortanaintelligence.com/Experiment/Digital-Foundry-Demand-Forecasting

Forecast Model selection

A multiple regression model was used to estimate demand (sales) by incorporating the available historical data as well as other factors influencing demand. The model was built at the Product-Market level (the lowest level of granularity). Model equation:

Demand, i.e. Sales = f (Price, Discount, Other influencing factors*)

*Other influencing factors – holiday variables, seasonality, promotion support (feature, display), competitor effects, etc.
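A minimal sketch of this causal regression in R is shown below; the data frame and column names (product_market_data, SalesUnits, Price, Discount, Holiday, PreHoliday, FeaturePromo, DisplayPromo, CompetitorPrice) are illustrative assumptions rather than the actual field names used in the experiment.

```r
# Sketch of the causal (multiple regression) demand model for one
# Product-Market combination. Column names are assumed for illustration.
causal_model <- lm(
  SalesUnits ~ Price + Discount + Holiday + PreHoliday +
    FeaturePromo + DisplayPromo + CompetitorPrice,
  data = product_market_data
)

summary(causal_model)        # coefficient estimates and fit statistics
fitted(causal_model)         # in-sample fitted sales units
```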

Model Set-Up & Basics – Data Preparation

The model necessitated the addition of derived & external variables:

  • Discount
  • Competitor’s effect
  • Holiday variables
    • New Year's Day
    • Easter
    • Memorial Day
    • Labor Day
    • Independence Day
    • Super Bowl
    • Thanksgiving
    • Christmas

Holiday variables were incorporated into the model to account for the change in consumption patterns during the various holidays in the US. In addition to the holiday itself, the effect of pre-holiday shopping behavior (that is, shopping in the week before the holiday) was also captured for the relevant week.
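A minimal sketch of deriving such holiday and pre-holiday indicator variables on weekly data is shown below; the holiday dates, the weekly_data data frame and its WeekEnding/Holiday/PreHoliday columns are assumptions for illustration only.

```r
# Sketch: flag weeks containing a US holiday and the week immediately before
# (pre-holiday shopping). Dates and variable names are illustrative.
library(dplyr)

holidays <- as.Date(c("2016-01-01",  # New Year's Day
                      "2016-02-07",  # Super Bowl
                      "2016-03-27",  # Easter
                      "2016-05-30",  # Memorial Day
                      "2016-07-04",  # Independence Day
                      "2016-09-05",  # Labor Day
                      "2016-11-24",  # Thanksgiving
                      "2016-12-25")) # Christmas

# TRUE where a holiday falls within the 7-day window ending on WeekEnding
hit <- outer(weekly_data$WeekEnding, holidays,
             function(w, h) h > w - 7 & h <= w)

weekly_data <- weekly_data %>%
  mutate(Holiday    = as.integer(rowSums(hit) > 0),
         # the week just before a holiday week captures pre-holiday shopping
         PreHoliday = as.integer(lead(Holiday, default = 0L) == 1L & Holiday == 0L))
```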

High level process flow in Azure ML

Refer to the document: https://github.com/kumarchinnakali/digital-foundry-demand-forcasting/blob/master/BuildModel/DF2model.pdf

Model Results and Diagnostics

Fitted sales units and actual sales units are plotted against week-ending dates to compare their trends over time, and predictions from the causal model are compared with those from the STLM model.
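The sketch below shows one way to produce such a comparison in R; weekly_data, causal_model, SalesUnits and WeekEnding carry over from the earlier sketches and remain illustrative assumptions.

```r
# Sketch: overlay actual vs fitted sales units by week-ending date, then fit
# an STL-decomposition (STLM) model for comparison. Names are illustrative.
library(forecast)
library(ggplot2)

plot_df <- data.frame(WeekEnding = weekly_data$WeekEnding,
                      Actual     = weekly_data$SalesUnits,
                      Fitted     = fitted(causal_model))

ggplot(plot_df, aes(WeekEnding)) +
  geom_line(aes(y = Actual, colour = "Actual sales units")) +
  geom_line(aes(y = Fitted, colour = "Fitted sales units")) +
  labs(x = "Week ending", y = "Sales units", colour = NULL)

sales_ts  <- ts(weekly_data$SalesUnits, frequency = 52)  # weekly series
stlm_fcst <- forecast(stlm(sales_ts), h = 13)            # 13-week STLM forecast
```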

Model Evaluation

Model evaluation is done by calculating accuracy and MAPE error for the causal model, and MASE error for the models built with the decomposition (STLM) and Croston methods.
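The sketch below shows how these metrics could be computed in R; causal_model, weekly_data, sales_ts and stlm_fcst are the illustrative objects from the earlier sketches, and the exact evaluation code in the experiment may differ.

```r
# Sketch: MAPE for the causal model, MASE for the STLM and Croston models.
library(forecast)

actual <- weekly_data$SalesUnits

# Mean absolute percentage error of the regression's fitted values
mape <- mean(abs((actual - fitted(causal_model)) / actual)) * 100
accuracy_pct <- 100 - mape   # one common convention for "accuracy"

# Mean absolute scaled error as reported by forecast::accuracy()
accuracy(stlm_fcst)["Training set", "MASE"]
accuracy(croston(sales_ts, h = 13))["Training set", "MASE"]
```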