/Prduct_Sale_Analysis

The project involves Cleaning, Analysing and creating Visualizations to make data easily accessible of ease use.

Primary LanguageJupyter Notebook

Amazon Electronics Sales Data Analysis

Overview

This dataset provides comprehensive sales data for electronics products on Amazon. It includes user ratings, product categories, and the time of sale. This data is crucial for understanding consumer behavior and market trends in the electronics segment on Amazon. The dataset can be downloaded here.

Objective

Our aim is to use Python libraries such as Pandas, Numpy, Matplotlib, and Seaborn for an in-depth analysis of the sales data. The dataset contains extensive information on electronics store purchases, including monthly breakdowns, product types, costs, purchase addresses, and more. We will conduct this analysis in a Jupyter Notebook environment.

Steps of Analysis

Step 1: Exploratory Data Analysis (EDA)

Purpose

EDA helps us to understand the dataset by identifying patterns, anomalies, testing hypotheses, and validating assumptions using summary statistics and graphical representations.

Process

  • Import necessary libraries (Pandas, Numpy, Matplotlib, Seaborn).
  • Load the dataset.
  • Use .head() and .tail() functions to view the first and last five rows.
  • Use .shape to understand the structure of the dataset. Our dataset has 1,292,954 observations and 10 features.
  • Examine the data types of each column and check for null values using .info().

Understanding the Dataset

Columns

  1. User ID (int64)
  2. Product ID (object)
  3. Rating (int64)
  4. Timestamp (int64)
  5. Category (object)

Data Type Conversions

Necessary conversions for accurate data representation (e.g., converting Timestamp from int64 to a proper timestamp format).

Statistical Summary

Using .describe() to get a statistical overview of the dataset, including measures like mean, median, standard deviation, and percentile values.

Unique Users and Items

Identifying the count of unique users and items.

Dealing with Missing Values

Understanding why data might be missing and addressing these gaps is crucial for a comprehensive analysis.

Step 2: Data Visualization

Utilizing Matplotlib and Seaborn for graphical representation of data. Analyzing trends such as best year for sales, best month for sales, top-selling brands and products, category-wise sales performance, and product ratings.

Conclusion

Key findings like the peak sales year, top-selling categories and brands, sales trends over the years, and consumer ratings.

Tools Used

Data Science: Python, Pandas, Numpy. Data Visualization: Matplotlib, Seaborn. Environment: Jupyter Notebook.


Note: Replace [link-to-data-set] with the actual URL where the dataset can be downloaded. Ensure that all analysis steps and conclusions are based on the actual data available in the dataset. This README should be adapted as per the findings and specifics of the dataset.