/CITY-BIKING

Final Project for Data Science Class

Primary Language: Jupyter Notebook

Helsinki & San Francisco Bike Share Comparison - An Urban Study

Our exploratory data analysis has been completed in two notebooks, helsinki.ipynb and sanfrancisco.ipynb.

Data sources used thus far include:

A small sample of the joined and cleaned dataset for trips/stations has been added in the data/sf folder on GitHub since the entire database for SF is too large for GitHub to handle. Please see the Kaggle link for the complete database files that we are using.

Goals and research questions/areas

The goal of the project is to analyze the patterns hidden in the daily travel of cyclists in the cities of Helsinki and San Francisco. To do so, we identified the relevant relationships in the data and performed the EDA needed to understand how cyclists travel in both cities. We want to compare how individuals in each city interact with the bike sharing system. We are also interested in insights that matter to bike share managers, such as which stations have a large or small number of bikes at a given date, time, and weather condition. One goal of the project is to use machine learning models to help allocate the right number of bikes to each station. To do this, we are looking at a few key relationships, including location, hour, month, weather, commuters versus casual users, weekends versus weekdays, distance, and more.

The hundreds of station locations in Helsinki and San Francisco can tell us which features of the urban space attract the most riders and how people currently use them. It is also important to understand what times of day these locations are most active. Some of our initial visualizations already show these insights, and we plan to create more for our final submission.
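As a rough illustration of the hour and weekday/weekend relationships described above, the sketch below counts departures per station, split by hour of day and by weekend versus weekday. It is not taken from the notebooks: the function name `hourly_station_counts`, the `(station, start_time)` record layout, the timestamp format, and the sample trips are all assumptions made for the example. It uses only the standard library; the same grouping could be expressed with a pandas `groupby`.

```python
from collections import Counter
from datetime import datetime


def hourly_station_counts(trips):
    """Count departures per (station, is_weekend, hour).

    `trips` is an iterable of (station, start_time) pairs, where start_time
    is a string such as "2021-06-07 08:15" (an assumed format).
    """
    counts = Counter()
    for station, start_time in trips:
        ts = datetime.strptime(start_time, "%Y-%m-%d %H:%M")
        is_weekend = ts.weekday() >= 5  # Monday is 0, so 5 and 6 are Sat/Sun
        counts[(station, is_weekend, ts.hour)] += 1
    return counts


# Made-up sample trips, for illustration only (not real project data).
trips = [
    ("A", "2021-06-07 08:15"),  # Monday morning
    ("A", "2021-06-07 08:45"),  # Monday morning
    ("B", "2021-06-07 17:30"),  # Monday evening
    ("A", "2021-06-12 11:00"),  # Saturday
]

counts = hourly_station_counts(trips)
for key, n in sorted(counts.items()):
    print(key, n)
```

Counts like these, computed per station, could serve as one input when deciding how many bikes a station needs at a given hour; weather and month would be added as further grouping keys or model features.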