audi_used_car_EDA

An overview of a dataset of Audi cars. The analysis compares different columns of the dataset with each other using histograms, scatter plots, distribution plots, ECDFs, bar plots and violin plots, each with descriptive titles and labels and a short analysis of the plot.

The attributes of the dataset are:

  • model - model of the car

  • year - year of manufacture

  • price - price of the car

  • transmission - type of transmission in the car (Automatic, Manual and Semi-automatic)

  • mileage - distance the car has been driven

  • fuelType - type of fuel used in the car (Petrol, Diesel, Electric, Hybrid and Others)

  • mpg - miles the car runs per gallon of fuel

  • engineSize - size of the engine used in the car

Data Visualization Objectives:

  • Importing necessary python packages

  • Reading the CSV (.csv) file

  • Naming the new DataFrame

  • Finding the number of rows and columns

  • Descriptive Statistics of the dataset

  • Finding the data types and missing values
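
The loading and inspection steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook code: the name `audi_df` and the inline sample rows are made up so that the snippet runs on its own, and the real analysis would presumably read the downloaded Kaggle CSV with `pd.read_csv` instead.

```python
import io

import pandas as pd

# Made-up sample rows for illustration only. In the real project this would
# be the Kaggle CSV, e.g. pd.read_csv("audi.csv") (file name is an assumption).
SAMPLE_CSV = """model,year,price,transmission,mileage,fuelType,mpg,engineSize
A1,2017,12500,Manual,15735,Petrol,55.4,1.4
A6,2016,16500,Automatic,36203,Diesel,64.2,2.0
A1,2016,11000,Manual,29946,Petrol,55.4,1.4
A4,2017,16800,Automatic,25952,Diesel,67.3,2.0
A3,2019,17300,Manual,1998,Petrol,49.6,1.0
"""

# Read the CSV text into a named DataFrame.
audi_df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Number of rows and columns.
n_rows, n_cols = audi_df.shape
print(f"rows={n_rows}, columns={n_cols}")

# Descriptive statistics of the numeric columns.
print(audi_df.describe())

# Data type of each column and count of missing values per column.
print(audi_df.dtypes)
missing_per_column = audi_df.isnull().sum()
print(missing_per_column)
```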

Visualizing...

  • Visualizing the dataset using different plots

  • The top 5 selling car models in the dataset

  • The average selling price of the top 5 selling car models

  • The total sale of the top 5 selling car models

  • Dataset aggregation and exploratory data analysis based on the model, transmission and fuel type of the Audi cars
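
One possible way to compute the top 5 models together with their average price and total sales is a single `groupby` aggregation. The sketch below assumes that "top selling" means the models that appear most often in the dataset. The rows and the names `cars`, `model_summary` and `top5` are hypothetical.

```python
import pandas as pd

# Made-up (model, price) rows for illustration only.
rows = [
    ("A3", 15000), ("A3", 16000), ("A3", 17000), ("A3", 18000),
    ("A4", 20000), ("A4", 21000), ("A4", 22000),
    ("Q3", 24000), ("Q3", 26000),
    ("A1", 12000), ("A1", 13000),
    ("Q5", 30000), ("A6", 28000), ("TT", 27000),
]
cars = pd.DataFrame(rows, columns=["model", "price"])

# Per model: number of cars, average price and summed price,
# ordered so that the most frequent models come first.
model_summary = (
    cars.groupby("model")["price"]
    .agg(cars_sold="count", avg_price="mean", total_sales="sum")
    .sort_values("cars_sold", ascending=False)
)

# Keep the five most frequent models.
top5 = model_summary.head(5)
print(top5)
```

The same pattern should carry over to the other groupings mentioned above by replacing `"model"` with `"transmission"` or `"fuelType"` in the `groupby` call.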

Data Visualization:

In this project, the dataset is visualized and examined using bar plots, scatter plots, histograms, distribution plots, ECDFs, violin plots and box plots.
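
Of the plot types listed, the ECDF (empirical cumulative distribution function) is probably the least familiar, so here is a small sketch of how one could be computed and drawn. It assumes NumPy and Matplotlib are used. The helper name `ecdf` and the price values are made up for illustration and are not taken from the project.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np


def ecdf(values):
    """Return the sorted values and, for each, the fraction of observations <= it."""
    x = np.sort(np.asarray(values, dtype=float))
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y


# Made-up prices for illustration only.
prices = [12500, 16500, 11000, 16800, 17300]
x, y = ecdf(prices)

# Plot each observation as a point, with labeled axes and a title.
fig, ax = plt.subplots()
ax.plot(x, y, marker=".", linestyle="none")
ax.set_xlabel("price")
ax.set_ylabel("ECDF")
ax.set_title("ECDF of price (illustrative data)")
plt.close(fig)  # in a notebook, plt.show() would display the figure instead
```

Each point on the curve reads as "this fraction of cars costs at most this much", which makes it easy to read off medians and other percentiles.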


Machine Learning Model:

I applied one-hot encoding, trained a linear regression model and generated predictions from it. I then compared the actual and predicted values of the target variable through visualization for easier understanding.
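
A minimal sketch of such a pipeline is shown below. Several things here are assumptions rather than details from the project: that scikit-learn is the library used, that `price` is the target variable, that the data is split into training and test sets, and that the first dummy level is dropped. The rows are made up for illustration.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Made-up rows for illustration only.
cars = pd.DataFrame({
    "year": [2017, 2016, 2016, 2017, 2019, 2016, 2015, 2018],
    "mileage": [15735, 36203, 29946, 25952, 1998, 32260, 76788, 20000],
    "transmission": ["Manual", "Automatic", "Manual", "Automatic",
                     "Manual", "Automatic", "Semi-automatic", "Manual"],
    "fuelType": ["Petrol", "Diesel", "Petrol", "Diesel",
                 "Petrol", "Diesel", "Diesel", "Petrol"],
    "price": [12500, 16500, 11000, 16800, 17300, 13900, 10200, 15400],
})

# One-hot encode the categorical columns; numeric columns pass through unchanged.
X = pd.get_dummies(
    cars.drop(columns="price"),
    columns=["transmission", "fuelType"],
    drop_first=True,  # assumption: drop one level per category to avoid redundant columns
    dtype=float,
)
y = cars["price"]

# Hold out part of the data so predictions are made on unseen rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit the linear regression and predict prices for the held-out rows.
model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# Side-by-side table of actual and predicted prices, ready for plotting.
comparison = pd.DataFrame({"actual": y_test.to_numpy(), "predicted": y_pred})
print(comparison)
```

A scatter plot of `comparison["actual"]` against `comparison["predicted"]` would be one way to do the visual comparison described above.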

Dataset Reference:

https://www.kaggle.com/aishwaryamuthukumar/cars-dataset-audi-bmw-ford-hyundai-skoda-vw