The dataset for this project is taken from the NYC TLC Trip Record Data.
In this project, seven months (January to July 2021) of New York City yellow taxi data are analyzed.
The Dask, folium, geopandas, and pandas libraries are used.
Objective:
The main goal of this project is to identify the day with the highest taxi usage and the distance between two points.
Time difference between pickup and dropoff.
Top pickup and dropoff points in the city.
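The objectives above can be sketched with a few pandas operations. This is a minimal illustration, not the project's actual code: the column names (`tpep_pickup_datetime`, `tpep_dropoff_datetime`, `PULocationID`, `DOLocationID`) follow the TLC yellow taxi schema, the three rows are made-up sample data, and the helper names are hypothetical. pandas is used here for brevity.

```python
import pandas as pd

# Tiny hand-made sample, NOT real TLC data; column names follow the yellow taxi schema.
trips = pd.DataFrame({
    "tpep_pickup_datetime": pd.to_datetime([
        "2021-01-01 08:00:00",
        "2021-01-01 09:30:00",
        "2021-01-02 10:00:00",
    ]),
    "tpep_dropoff_datetime": pd.to_datetime([
        "2021-01-01 08:15:00",
        "2021-01-01 10:00:00",
        "2021-01-02 10:20:00",
    ]),
    "PULocationID": [237, 237, 140],
    "DOLocationID": [236, 236, 99],
})


def add_duration_minutes(df):
    """Return a copy with the pickup-to-dropoff time difference in minutes."""
    out = df.copy()
    delta = out["tpep_dropoff_datetime"] - out["tpep_pickup_datetime"]
    out["duration_min"] = delta.dt.total_seconds() / 60
    return out


def busiest_weekday(df):
    """Name of the weekday with the most pickups."""
    return df["tpep_pickup_datetime"].dt.day_name().value_counts().idxmax()


def top_locations(df, column, n=1):
    """The n most frequent location IDs in the given column."""
    return df[column].value_counts().head(n).index.tolist()


trips = add_duration_minutes(trips)
print(trips["duration_min"].tolist())
print(busiest_weekday(trips))
print(top_locations(trips, "PULocationID"))
print(top_locations(trips, "DOLocationID"))
```

Counting pickups per weekday with `value_counts` keeps the logic to one line per question, which makes it easy to swap the sample frame for the full seven months of data.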
Conclusion
In this dataset, the latitude, longitude, new travel distance, and time difference are calculated. A few outliers, caused by improper data entry, are also identified.
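The text does not say how the travel distance was derived from latitude and longitude or how outliers were flagged. One common approach is the haversine (great-circle) formula combined with a simple speed check. The sketch below assumes that approach; the function names, the sample coordinates, and the 150 km/h threshold are illustrative assumptions, not values from the project.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_outlier(distance_km, duration_min, max_kmh=150.0):
    """Flag trips with a non-positive duration, negative distance, or implausible speed.

    The max_kmh threshold is an assumed cut-off chosen for illustration.
    """
    if duration_min <= 0 or distance_km < 0:
        return True
    return distance_km / (duration_min / 60) > max_kmh


# Two illustrative points in New York City (approximate coordinates).
d = haversine_km(40.7580, -73.9855, 40.6413, -73.7781)
print(round(d, 1))
print(is_outlier(d, duration_min=45))  # plausible trip
print(is_outlier(d, duration_min=0))   # zero duration suggests a bad entry
```

A speed-based check catches entries where the timestamps or coordinates were recorded incorrectly, which is one plausible reading of "improper entry".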
The calculations and analysis yield the following key findings:
Friday is the most active day, when most people used taxis.
Rides with a single passenger are booked more often than rides with 2, 3, or 4 passengers.
NY 237 is the most common pickup point because it intersects famous locations and schools.
NY 236 is the most common dropoff point.
NY 140 has fewer pickups and NY 99 fewer dropoffs than all other locations.
Wednesday and Thursday are the busiest days at the NY 236 location.
Usage increases linearly from 6 am to 2 pm and decreases linearly from 3 pm to 11 pm.
Between 1 am and 5 am, taxi usage is very low.
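The hourly and passenger-count findings above could be checked with a grouping along these lines. This is a sketch under assumptions: the `tpep_pickup_datetime` and `passenger_count` column names follow the TLC yellow taxi schema, the five rows are made-up sample data, and the helper names are hypothetical.

```python
import pandas as pd

# Hand-made sample rows, NOT real TLC data.
sample = pd.DataFrame({
    "tpep_pickup_datetime": pd.to_datetime([
        "2021-03-05 02:10:00",
        "2021-03-05 09:05:00",
        "2021-03-05 14:20:00",
        "2021-03-05 14:45:00",
        "2021-03-05 21:30:00",
    ]),
    "passenger_count": [1, 1, 2, 1, 3],
})


def trips_per_hour(df):
    """Pickup counts for every hour 0..23, with zero for hours that have no trips."""
    hours = df["tpep_pickup_datetime"].dt.hour
    return hours.value_counts().reindex(range(24), fill_value=0)


def passenger_share(df):
    """Fraction of trips for each passenger count, ordered by passenger count."""
    return df["passenger_count"].value_counts(normalize=True).sort_index()


hourly = trips_per_hour(sample)
share = passenger_share(sample)
print(hourly[hourly > 0].to_dict())
print(hourly.idxmax())
print(share.to_dict())
```

Reindexing to all 24 hours keeps quiet hours visible as zeros, so the rise through the morning and the decline through the evening can be plotted on a continuous axis.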