Brazilian E-Commerce Olist Analysis

By Kristine Petrosyan

Problem statement

As a data scientist, I have been tasked with drawing insights from a Kaggle dataset published by the Brazilian retailer Olist. In particular, I will seek to answer the following questions, which are of interest to stakeholders:

  1. Customer LTV (lifetime value)
  2. Monthly performance of the business
  3. Best selling categories
  4. Prediction for future sales

Components

The Jupyter Notebook is the key deliverable and contains the answers to the above questions.

Data

The data comes from two Kaggle datasets: https://www.kaggle.com/olistbr/brazilian-ecommerce and https://www.kaggle.com/olistbr/marketing-funnel-olist.

Methodology

  • The relevant data was queried from the table and stored as a Pandas DataFrame.
  • Data manipulation was undertaken as required (e.g. creating feature columns).
  • Exploratory data analysis (EDA) was performed and visualisations were created.
  • A time series ARIMA model was used to forecast future sales.
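The data preparation steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the two small DataFrames are made-up stand-ins for the Kaggle order and order-item tables, the column names `order_id`, `order_purchase_timestamp` and `price` are assumed to match those files, and `purchase_month` is a hypothetical feature column.

```python
import pandas as pd

# Toy stand-ins for the Kaggle tables (assumption: the notebook loads the
# real CSV files instead; the values here are invented for illustration).
orders = pd.DataFrame({
    "order_id": ["o1", "o2", "o3", "o4"],
    "order_purchase_timestamp": [
        "2018-01-05 10:00:00",
        "2018-01-20 15:30:00",
        "2018-02-03 09:15:00",
        "2018-02-28 18:45:00",
    ],
})
items = pd.DataFrame({
    "order_id": ["o1", "o2", "o2", "o3", "o4"],
    "price": [100.0, 50.0, 25.0, 80.0, 40.0],
})

# Feature column (hypothetical name): calendar month of each purchase.
orders["order_purchase_timestamp"] = pd.to_datetime(orders["order_purchase_timestamp"])
orders["purchase_month"] = orders["order_purchase_timestamp"].dt.to_period("M")

# Join items to their orders and sum item prices per month.
merged = items.merge(orders, on="order_id", how="inner")
monthly_revenue = merged.groupby("purchase_month")["price"].sum()

print(monthly_revenue)
```

A monthly series of this shape is the kind of input a time series model would be fitted on, and it also serves as a basis for the monthly performance question.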

Findings and Recommendations

  • In conclusion:

    • Only 3% of all customers are recurring; the remaining 97% have a purchase history of just under one year.
    • Total revenues across 29 segments came in at 664,858 in the first eight months of 2018. The biggest segment was 'watches', which generated 17.4% of total revenues.
    • The best categories are watches and audio.
    • Although the 'watches' segment accounts for the largest share of revenue, it has only two sellers. Furthermore, the leading seller generated 97.0% of segment revenue.
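    Figures such as a segment's share of total revenue and its dependence on a single seller could be computed along these lines. This is a sketch on invented numbers, not the notebook's code or results: the `sales` table and its columns `category`, `seller_id` and `price` are assumptions made for illustration.

```python
import pandas as pd

# Made-up sales rows (assumption: one row per sold item).
sales = pd.DataFrame({
    "category": ["watches", "watches", "watches", "audio", "audio"],
    "seller_id": ["s1", "s1", "s2", "s3", "s4"],
    "price": [90.0, 60.0, 50.0, 120.0, 80.0],
})

# Revenue per category and each category's share of total revenue.
category_revenue = sales.groupby("category")["price"].sum()
category_share = category_revenue / category_revenue.sum()

# Number of distinct sellers active in each category.
sellers_per_category = sales.groupby("category")["seller_id"].nunique()

# Share of each category's revenue generated by its largest seller.
seller_revenue = sales.groupby(["category", "seller_id"])["price"].sum()
top_seller_share = seller_revenue.groupby(level="category").max() / category_revenue

print(category_share)
print(sellers_per_category)
print(top_seller_share)
```

    A high top-seller share combined with a low seller count flags a segment whose revenue is concentrated in very few hands.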

    Please feel free to contact me at kristinelpetrosyan@gmail.com.