Data Analysis of 1990-2019 car MPG

DSND_t2_p1: cardata

Jesse Fredrickson

5/2/2019

Motivation

This project cleans and analyzes a dataset of 30,000+ unique cars/trims from 1990 to 2019. I perform extensive cleaning and transformation of the data so that trends relating car properties to MPG are depicted accurately, and I conclude by training and testing several supervised machine learning algorithms to predict MPG from the other features.

Files

Car_Data_Explore.ipynb: In this IPython notebook file, I read in the data and perform all of my analysis.

fullspecs.csv: This is the target dataset, taken from www.reddit.com/r/datasets

Results

In the IPython notebook, I read in the .csv data with pandas and analyzed it with regular expressions, matplotlib, seaborn, numpy, and sklearn. To recreate my findings, download both of the files above and run the cells in the notebook. A high-level walkthrough can be found at https://medium.com/@jessefredrickson/are-modern-cars-as-efficient-as-we-like-to-think-a-brief-data-science-approach-c7dd91c22f9b
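
As a rough illustration of the regex-based cleaning mentioned above, here is a minimal sketch of pulling a numeric value out of a free-text spec cell. It is not taken from the notebook: the function name extract_number and the sample strings are hypothetical, and it assumes that raw cells embed a number in surrounding text, which may not match the actual layout of fullspecs.csv.

```python
import re
from typing import Optional

# Matches the first number in a string, allowing thousands separators
# and an optional decimal part (e.g. "3,450" or "2.5").
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def extract_number(text: Optional[str]) -> Optional[float]:
    """Return the first number found in `text` as a float, or None."""
    if not isinstance(text, str):
        return None  # missing or non-text cells are left as missing
    match = _NUMBER.search(text)
    if match is None:
        return None  # no digits at all, e.g. a placeholder string
    return float(match.group().replace(",", ""))


# Hypothetical raw cell values, invented for illustration only.
samples = ["25 mpg", "3,450 lbs", "2.5 L", "- TBD -", None]
cleaned = [extract_number(s) for s in samples]
print(cleaned)
```

With pandas, a helper like this could be applied to a whole column through Series.map, leaving cells without a number as missing values to be handled before any model is trained.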