EDA_NYC_Yellow_Taxi


Introduction

  • The dataset for this project is taken from the NYC TLC Trip Record Data.

  • The analysis covers seven months (January to July) of New York City yellow taxi data from 2021.

  • The Dask, folium, geopandas, and pandas libraries are used.

Objective

  • Identify the day with the highest taxi usage and the distance between the pickup and dropoff points.

  • Compute the time difference between pickup and dropoff.

  • Find the top pickup and dropoff points in the city.
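The first two objectives can be sketched with pandas as follows. This is a minimal illustration, not the project's actual code: it assumes the standard TLC column names `tpep_pickup_datetime` and `tpep_dropoff_datetime`, and the three rows are made-up sample values rather than real trip records.

```python
import pandas as pd

# Made-up sample rows; column names are assumed to follow the TLC yellow taxi schema.
trips = pd.DataFrame({
    "tpep_pickup_datetime": pd.to_datetime([
        "2021-01-01 08:00:00",  # a Friday
        "2021-01-01 09:30:00",  # a Friday
        "2021-01-02 10:00:00",  # a Saturday
    ]),
    "tpep_dropoff_datetime": pd.to_datetime([
        "2021-01-01 08:15:00",
        "2021-01-01 10:00:00",
        "2021-01-02 10:05:00",
    ]),
})

# Time difference between pickup and dropoff, in minutes.
trips["duration_min"] = (
    trips["tpep_dropoff_datetime"] - trips["tpep_pickup_datetime"]
).dt.total_seconds() / 60

# Number of trips per weekday, and the weekday with the most trips.
trips["pickup_day"] = trips["tpep_pickup_datetime"].dt.day_name()
trips_per_day = trips["pickup_day"].value_counts()
busiest_day = trips_per_day.idxmax()
```

On the full seven months of data the same operations would typically run on a Dask dataframe instead, since Dask mirrors most of this pandas API.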

Conclusion

In this dataset, the latitude, longitude, new travel distance, and time difference are calculated. A few outliers caused by improper data entry are also identified.
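How the travel distance is derived from latitude and longitude is not shown here. One common choice is the haversine (great-circle) formula, sketched below under that assumption. The function name `haversine_km` and the Earth radius of 6371 km are illustrative choices, not taken from the project.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points given in degrees."""
    earth_radius_km = 6371.0  # mean Earth radius, an approximation
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * earth_radius_km * math.asin(math.sqrt(a))
```

A distance computed this way is a straight-line estimate, so comparing it with the recorded trip distance is one possible way to spot improperly entered records.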

From the calculations and analysis, the key findings are:

  • Friday is the most active day, when the most people used taxis.

  • Rides with a single passenger are booked more often than rides with 2, 3, or 4 passengers.

  • NY 237 is the most common pickup point because it intersects with famous locations and schools.

  • NY 236 is the most common dropoff point. 

  • NY 140 has few pickups and NY 99 has few dropoffs compared to all other locations.

  • Wednesday and Thursday are the busiest days at the NY 236 location.

  • Taxi usage increases linearly from 6:00 to 14:00 and decreases linearly from 15:00 to 23:00.

  • Between 1:00 and 5:00 at night, taxi usage is very low.
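Findings like the ones above can be obtained with simple counts per location and per hour. The sketch below assumes the standard TLC columns `PULocationID`, `DOLocationID`, and `tpep_pickup_datetime`. The four rows are invented for illustration and are not results from the dataset.

```python
import pandas as pd

# Invented sample rows, for illustration only; column names are assumed from the TLC schema.
trips = pd.DataFrame({
    "tpep_pickup_datetime": pd.to_datetime([
        "2021-03-03 06:10:00",
        "2021-03-03 14:05:00",
        "2021-03-03 14:40:00",
        "2021-03-04 02:30:00",
    ]),
    "PULocationID": [237, 237, 236, 140],
    "DOLocationID": [236, 236, 236, 99],
})

# Most common pickup and dropoff location IDs.
top_pickup = trips["PULocationID"].value_counts().idxmax()
top_dropoff = trips["DOLocationID"].value_counts().idxmax()

# Number of trips starting in each hour of the day (0-23), ordered by hour.
trips_per_hour = (
    trips["tpep_pickup_datetime"].dt.hour.value_counts().sort_index()
)
```

Plotting `trips_per_hour` as a line chart is one way to see the rise during the day and the drop toward night.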

Please run the file to view the visualizations.