Google-Play-Store-Apps-EDA

Category:

  • App / Application

  • Technology

  • Software

Overview:

Applied Exploratory Data Analysis (EDA) methods to preprocess each variable in the dataset, then fit a Machine Learning model (Linear Regression) and evaluated it with the Root Mean Squared Error (RMSE).

Languages & Tools:

  • Python

  • Tableau

  • Power BI

Dataset:

Source link: https://www.kaggle.com/lava18/google-play-store-apps

Columns: App, Category, Rating, Reviews, Size, Installs, Type, Price, Content Rating, Genres, Last Updated, Current Ver, Android Ver

Exploratory Data Analysis:

Data Engineering:

  • Import Libraries

  • Load Data

  • Check data shape

  • Check null sum

  • Check duplicates
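The checks listed above can be sketched as follows. This is a minimal illustration using only the standard library on a tiny made-up sample, not the notebook's actual code; the sample rows and the `summarize` helper are hypothetical. In a pandas workflow the usual equivalents would be `df.shape`, `df.isnull().sum()` and `df.duplicated().sum()`.

```python
import csv
import io
from collections import Counter

# Tiny made-up sample standing in for the dataset CSV (values are illustrative only).
SAMPLE_CSV = """App,Category,Rating,Reviews
Alpha,TOOLS,4.1,159
Beta,GAME,,967
Alpha,TOOLS,4.1,159
Gamma,FAMILY,3.9,
"""


def summarize(csv_text):
    """Return (shape, nulls per column, number of duplicate rows)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = reader.fieldnames or []
    # Shape as (row count, column count), mirroring pandas' df.shape.
    shape = (len(rows), len(columns))
    # Count empty cells per column; empty strings are treated as missing here.
    nulls = {col: sum(1 for r in rows if not r[col]) for col in columns}
    # A row counts as a duplicate if an identical row appeared earlier.
    counts = Counter(tuple(r[col] for col in columns) for r in rows)
    duplicates = sum(n - 1 for n in counts.values())
    return shape, nulls, duplicates


shape, nulls, duplicates = summarize(SAMPLE_CSV)
print(shape, nulls, duplicates)
```

Treating empty strings as missing is an assumption of this sketch; the real file may encode missing values differently (for example as `NaN`).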

Process Data Columns:

  • Type

  • Reviews

  • Category

  • Genres

  • Size

  • Installs

  • Price

  • Content Rating

  • Last Updated

  • Android Ver
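As an illustration of the column processing listed above, the sketch below converts three of the text columns (Size, Installs, Price) to numbers. It assumes the raw values look like `19M` or `201k` for Size, `10,000+` for Installs and `$4.99` for Price; the function names are hypothetical and the notebook may handle these columns differently.

```python
def parse_size_mb(value):
    """Convert a size string such as '19M' or '201k' to megabytes.

    Returns None for entries that are not numeric (e.g. 'Varies with device').
    """
    value = value.strip()
    try:
        if value.endswith("M"):
            return float(value[:-1])
        if value.lower().endswith("k"):
            # Assumption: 'k' means kilobytes, converted with 1024 kB per MB.
            return float(value[:-1]) / 1024
    except ValueError:
        pass
    return None


def parse_installs(value):
    """Convert an install count such as '10,000+' to an integer, or None."""
    cleaned = value.replace(",", "").replace("+", "").strip()
    return int(cleaned) if cleaned.isdigit() else None


def parse_price(value):
    """Convert a price such as '$4.99' or '0' to a float, or None."""
    cleaned = value.replace("$", "").strip()
    try:
        return float(cleaned)
    except ValueError:
        return None


print(parse_size_mb("19M"), parse_installs("10,000+"), parse_price("$4.99"))
```

Returning `None` for unparseable values keeps the decision about imputing or dropping them as a separate, explicit step.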

Visualizations were built in both Python (see the notebook) and Tableau (listed below):

Bar - Content Rating vs Rating

Boxplot - Price filtered by Paid Type

Boxplot - Sum of Reviews 0-150K

Boxplot of Rating vs Category with Types

Boxplot of Rating vs Category

Boxplot of Rating vs Genres

Boxplot of Rating vs Genres with Types

Distribution of Rating

Treemap - Rating

Heatmap - Rating

Area - Rating by Content Rating

Heatmap - Rating bin vs Content Rating

Histogram - Rating group by Content Rating

Line - Rating vs Sum Size

Scatter Plot - Rating vs Count Size

Dashboard - Rating

Dashboard - Category

Dashboard - Rating Distribution

Build Model:

  • Pre-processing

  • Feature engineering

  • Predicting
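To make the evaluation step concrete, here is a minimal sketch of fitting a one-feature linear regression by ordinary least squares and computing the RMSE. It uses only the standard library and made-up numbers; the helper names `fit_line` and `rmse` are hypothetical, and the notebook's actual features, library and results are not reproduced here.

```python
import math


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmse(actual, predicted):
    """Root Mean Squared Error: sqrt of the mean squared difference."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


# Made-up data lying exactly on y = 2x + 1, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
predictions = [slope * x + intercept for x in xs]
print(slope, intercept, rmse(ys, predictions))
```

For brevity this scores the model on the same points it was fitted on; an RMSE meant to reflect predictive quality is normally computed on a held-out test split.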