Analyzing Citi Bike data using Tableau

Primary language: Jupyter Notebook

Citi_Bike_Analysis

This project aggregates the data in the Citi Bike Trip History Logs to build a data dashboard, story, or report in Tableau. The logs are posted publicly on the Citi Bike website.

The data includes:

  • Trip Duration (seconds)
  • Start Time and Date
  • Stop Time and Date
  • Start Station Name
  • End Station Name
  • Station ID
  • Station Lat/Long
  • Bike ID
  • User Type (Customer = 24-hour pass or 3-day pass user; Subscriber = Annual Member)
  • Gender (0 = unknown; 1 = male; 2 = female)
  • Year of Birth
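
The coded fields above can be turned into readable labels before charting. The following is a minimal sketch, not code from this project: the column names `usertype` and `gender` are assumptions about the CSV headers, and the label text simply restates the list above.

```python
# Code tables restated from the field list above.
GENDER_LABELS = {0: "unknown", 1: "male", 2: "female"}
USER_TYPE_LABELS = {
    "Customer": "24-hour or 3-day pass",
    "Subscriber": "annual member",
}


def describe_rider(row):
    """Return (user type label, gender label) for one trip row.

    `row` is a dict such as csv.DictReader yields. The keys "usertype"
    and "gender" are assumed column names, not confirmed by this README.
    """
    gender = GENDER_LABELS.get(int(row["gender"]), "unknown")
    user_type = USER_TYPE_LABELS.get(row["usertype"], "unrecognized")
    return user_type, gender
```

For example, `describe_rider({"usertype": "Subscriber", "gender": "2"})` yields the labels for an annual member recorded as female.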

Citi Bike has already processed this data to remove trips taken by staff as they service and inspect the system, trips to or from any of Citi Bike's “test” stations, and trips shorter than 60 seconds (potentially false starts or users re-docking a bike to ensure it is secure).

Data sets used:

Jersey City Trips from January 2019 through May 2019 (This dataset is much smaller and was initially used to assemble the worksheets and dashboards.)

New York City Trips from January 2019 through May 2019 (This dataset is much larger.)

Data cleaning:

  • Removed entries with birth year < 1939 (or age > 80).
  • Removed trips longer than 24 hours. Bikes checked out for more than 24 hours are considered lost or stolen.
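
The two cleaning rules above can be sketched in plain Python. This is an illustration under stated assumptions rather than the project's own notebook code: the column names `birth year` and `tripduration` are assumed, and every row is assumed to carry numeric values in both fields.

```python
MIN_BIRTH_YEAR = 1939            # rows with birth year < 1939 are removed
MAX_TRIP_SECONDS = 24 * 60 * 60  # trips longer than 24 hours are removed


def keep_trip(row):
    """Return True if a trip row passes both cleaning rules.

    The keys "birth year" and "tripduration" are assumed column names.
    """
    birth_year = int(row["birth year"])
    duration_seconds = int(row["tripduration"])
    return birth_year >= MIN_BIRTH_YEAR and duration_seconds <= MAX_TRIP_SECONDS


def clean_trips(rows):
    """Return only the rows that satisfy the cleaning rules, in order."""
    return [row for row in rows if keep_trip(row)]
```

The boundaries follow the wording of the rules: a rider born in 1939 and a trip of exactly 24 hours are both kept.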

The resulting Jersey City data set contains 130,966 records.

The resulting NYC data set contains 6,918,077 records.

Dashboards include:

  • Total number of trips recorded during the time period.
  • Average duration of trips.
  • User demographics based on gender and age.
  • Short-term customers and annual subscribers.
  • Top 10 stations for starting/ending trips.
  • Bottom 10 stations for starting/ending trips.
  • Bikes (by ID) most likely due for repair or inspection based on bike usage.
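
Several of the dashboard figures above are simple aggregates that can be cross-checked outside Tableau. The sketch below is one possible way to do that, not the project's method. The column names `tripduration`, `start station name`, and `bikeid` are assumptions, and using trip count as the measure of bike usage is also an assumption (total ride time would be another reasonable choice).

```python
from collections import Counter


def summarize(rows, n=10):
    """Compute dashboard-style aggregates from a list of trip dicts.

    Assumed column names: "tripduration", "start station name", "bikeid".
    """
    total = len(rows)
    average = sum(int(r["tripduration"]) for r in rows) / total if total else 0.0
    starts = Counter(r["start station name"] for r in rows)
    bikes = Counter(r["bikeid"] for r in rows)
    return {
        "total_trips": total,
        "average_duration_seconds": average,
        # Stations with the most and the fewest trip starts.
        "top_start_stations": starts.most_common(n),
        "bottom_start_stations": sorted(starts.items(), key=lambda kv: kv[1])[:n],
        # Trip count per bike, used here as a proxy for wear (an assumption).
        "most_used_bikes": bikes.most_common(n),
    }
```

The same pattern applied to an end-station column would give the top and bottom stations for ending trips.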

Visualizations:

  • A static map that plots all bike stations, with a visual indication of the most popular locations to start and end a journey and zip code data overlaid on top.
  • A dynamic map that shows how each station's popularity changes month by month.
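
The month-by-month popularity behind the dynamic map amounts to counting trips per station per month. The following is a rough sketch of that aggregation, assuming columns named `starttime` and `start station name` and a timestamp that begins with `YYYY-MM` (for example `2019-01-01 00:01:47`); neither assumption is confirmed by this README.

```python
from collections import defaultdict


def monthly_station_counts(rows):
    """Count trip starts per station per month.

    Returns {station name: {"YYYY-MM": trip count}}. The keys "starttime"
    and "start station name" are assumed column names.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for row in rows:
        # The first seven characters of an ISO-like timestamp are "YYYY-MM".
        month = row["starttime"][:7]
        counts[row["start station name"]][month] += 1
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {station: dict(months) for station, months in counts.items()}
```

Slicing the timestamp string avoids depending on the exact time or fractional-second format, at the cost of not validating the date.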

Deliverables:

Jersey City Bike Usage

https://public.tableau.com/profile/cindy7982#!/vizhome/CitiBikeAnalytics-JerseyCity/CitiBikes2019

New York City Bike Usage

The Tableau workbook was too large to upload to GitHub, so it was exported to PDF.