Extracting insights from raw data

< under construction >

« this is meant to be a tutorial for beginners »

This repo aims to extract as many insights as possible from raw data found on the internet (thanks, UCI). It walks through every step in detail, from data cleaning to building machine learning models. Each step has its own Jupyter Notebook.

  • What to expect:

    • Data cleaning;
    • Temporal analysis and Streamlit;
    • Predictive analysis;
    • Machine learning.
  • About the data: The Online Retail II dataset contains all transactions made by a UK-based, registered, non-store online retailer between 01/Dec/2009 and 09/Dec/2011.

  • Index (subject to change):

  1. Data cleaning
  2. Temporal analysis and Streamlit
  3. Customer segmentation (Recency-Frequency-Monetary value)
  4. Predicting next purchase day
  5. Lifetime value prediction
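
For readers new to the Recency-Frequency-Monetary idea in step 3, here is a minimal sketch of how the three values can be computed per customer. It is not taken from the notebooks: the sample transactions, the tuple layout, and the `rfm_table` helper are all made up for illustration, and plain Python is used so it runs without extra dependencies. The snapshot date is assumed to be one day after the last date in the dataset.

```python
from collections import defaultdict
from datetime import date

# Made-up sample rows (NOT real data):
# (customer_id, invoice_id, invoice_date, quantity, unit_price)
transactions = [
    ("C1", "INV1", date(2011, 11, 1), 2, 5.00),
    ("C1", "INV2", date(2011, 12, 1), 1, 10.00),
    ("C2", "INV3", date(2011, 6, 15), 4, 2.50),
]


def rfm_table(rows, snapshot):
    """Hypothetical helper: return {customer_id: (recency_days, frequency, monetary)}.

    recency   = days between the snapshot date and the customer's last purchase
    frequency = number of distinct invoices
    monetary  = total amount spent (quantity * unit price, summed)
    """
    last_purchase = {}
    invoices = defaultdict(set)
    spend = defaultdict(float)

    for customer, invoice, day, quantity, price in rows:
        # Keep the most recent purchase date seen so far for this customer.
        last_purchase[customer] = max(last_purchase.get(customer, day), day)
        invoices[customer].add(invoice)
        spend[customer] += quantity * price

    return {
        customer: (
            (snapshot - last_purchase[customer]).days,
            len(invoices[customer]),
            round(spend[customer], 2),
        )
        for customer in last_purchase
    }


# Assumed snapshot: the day after the dataset's last transaction date.
rfm = rfm_table(transactions, snapshot=date(2011, 12, 10))
for customer, values in sorted(rfm.items()):
    print(customer, values)
```

A low recency together with high frequency and monetary values points to an active, valuable customer; how those three numbers are turned into segments is a separate modelling choice.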