Tableau Public story: https://public.tableau.com/profile/zheng5688#!/vizhome/CitiBike_16195667310560/Story1?publish=yes
The original Citi Bike trip data from 2018 to 2020 is saved in the Data folder. The Jupyter notebook citibike-data.ipynb preprocesses the data and saves the aggregated data in the Resource folder.
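As a rough illustration of the kind of aggregation such a notebook might perform, the sketch below counts trips per month and rider type using only the Python standard library. The column names `starttime` and `usertype` and the timestamp format are assumptions about the trip-history CSV layout, and the helper name `aggregate_monthly` is hypothetical; none of this is taken from citibike-data.ipynb itself.

```python
import csv
import io
from collections import Counter
from datetime import datetime


def aggregate_monthly(rows):
    """Count trips per (YYYY-MM, usertype) from an iterable of row dicts.

    Assumes each row has 'starttime' (e.g. '2019-07-01 08:15:32.1230')
    and 'usertype' (e.g. 'Subscriber' or 'Customer'); both are assumptions.
    """
    counts = Counter()
    for row in rows:
        # Only the first 19 characters are parsed, so fractional seconds are ignored.
        started = datetime.strptime(row["starttime"][:19], "%Y-%m-%d %H:%M:%S")
        counts[(started.strftime("%Y-%m"), row["usertype"])] += 1
    return counts


# Tiny inline sample standing in for one monthly CSV file (made-up rows).
sample_csv = io.StringIO(
    "starttime,usertype\n"
    "2019-07-01 08:15:32.1230,Subscriber\n"
    "2019-07-01 09:02:10.0040,Customer\n"
    "2019-08-03 17:45:00.0000,Subscriber\n"
)
monthly = aggregate_monthly(csv.DictReader(sample_csv))
for (month, usertype), trips in sorted(monthly.items()):
    print(month, usertype, trips)
```

Saving counts like these to a small file, rather than every raw trip, is one way to keep the data loaded into Tableau manageable.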
Congratulations on your new job! As the new lead analyst for the New York Citi Bike Program, you are now responsible for overseeing the largest bike sharing program in the United States. In your new role, you will be expected to generate regular reports for city officials looking to publicize and improve the city program.
Since 2013, the Citi Bike Program has implemented a robust infrastructure for collecting data on the program's utilization. Through the team's efforts, each month bike data is collected, organized, and made public on the Citi Bike Data webpage.
However, while the data has been regularly updated, the team has yet to implement a dashboard or sophisticated reporting process. City officials have a number of questions on the program, so your first task on the job is to build a set of data reports to provide the answers.
Your task in this assignment is to aggregate the data found in the Citi Bike Trip History Logs and find two unexpected phenomena.
Design 2-5 visualizations for each discovered phenomenon (4-10 total). You may work with a timespan of your choosing. Optionally, you may merge multiple datasets from different periods.
The following are some questions you may wish to tackle. Do not limit yourself to these questions; they are suggestions for a starting point. Be creative!
- How many trips have been recorded total during the chosen period?
- By what percentage has total ridership grown?
- How has the proportion of short-term customers and annual subscribers changed?
- What are the peak hours in which bikes are used during summer months?
- What are the peak hours in which bikes are used during winter months?
- Today, what are the top 10 stations in the city for starting a journey? (Based on data, why do you hypothesize these are the top locations?)
- Today, what are the top 10 stations in the city for ending a journey? (Based on data, why?)
- Today, what are the bottom 10 stations in the city for starting a journey? (Based on data, why?)
- Today, what are the bottom 10 stations in the city for ending a journey? (Based on data, why?)
- Today, what is the gender breakdown of active participants (Male v. Female)?
- How effective has gender outreach been in increasing female ridership over the timespan?
- How does the average trip duration change by age?
- What is the average distance in miles that a bike is ridden?
- Which bikes (by ID) are most likely due for repair or inspection in the timespan?
- How variable is the utilization by bike ID?
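Several of the questions above reduce to short calculations. The following is a minimal sketch, not the assignment's prescribed method, of three of them: percentage growth in ridership, peak hours of use, and trip distance in miles. The function names and the timestamp format are assumptions. The distance helper uses the haversine (great-circle) formula between start and end coordinates, which is a swapped-in approximation: it gives a straight-line lower bound rather than the route actually ridden, and a round trip that returns to its starting station measures as zero.

```python
import math
from collections import Counter
from datetime import datetime


def pct_growth(first_period_trips, last_period_trips):
    """Percentage change in ridership between two periods."""
    return (last_period_trips - first_period_trips) / first_period_trips * 100.0


def peak_hours(start_times, top_n=3):
    """Return the top_n (hour, trip_count) pairs, busiest hour first.

    Assumes timestamps look like '2019-07-01 08:15:32' (an assumption).
    """
    hours = Counter(
        datetime.strptime(t[:19], "%Y-%m-%d %H:%M:%S").hour for t in start_times
    )
    return hours.most_common(top_n)


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    earth_radius_miles = 3958.8  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * earth_radius_miles * math.asin(math.sqrt(a))


# Made-up inputs purely for illustration.
growth = pct_growth(100, 125)
busiest = peak_hours(
    ["2019-07-01 08:15:00", "2019-07-01 08:40:00", "2019-07-01 17:05:00"],
    top_n=1,
)
one_degree_north = haversine_miles(40.0, -74.0, 41.0, -74.0)
print(growth, busiest, round(one_degree_north, 1))
```

Filtering the start times to summer or winter months before calling `peak_hours` would answer the two seasonal questions separately.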
Next, as a chronic over-achiever:
- Use your visualizations (not necessarily all of them) to design a dashboard for each phenomenon.
- The dashboards should be accompanied by an analysis explaining why the phenomena may be occurring.
City officials would also like to see one of the following visualizations:
- Basic: A static map that plots all bike stations with a visual indication of the most popular locations to start and end a journey with zip code data overlaid on top.
- Advanced: A dynamic map that shows how each station's popularity changes over time (by month and year). Again, with zip code data overlaid on the map.
- The map you choose should also be accompanied by a write-up unveiling any trends that were noticed during your analysis.
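Either map needs a popularity figure per station, and the dynamic version needs one per station per month. The sketch below shows one possible way to tally trip starts and ends before handing the result to Tableau. The tuple layout and the function names `station_popularity` and `top_stations` are hypothetical, and the station names are placeholders rather than real data.

```python
from collections import defaultdict


def station_popularity(trips):
    """Tally trip starts and ends per (month, station name).

    `trips` is an iterable of (month, start_station, end_station) tuples;
    this shape is an assumption made for the sketch.
    """
    tallies = defaultdict(lambda: {"starts": 0, "ends": 0})
    for month, start_station, end_station in trips:
        tallies[(month, start_station)]["starts"] += 1
        tallies[(month, end_station)]["ends"] += 1
    return dict(tallies)


def top_stations(tallies, month, key="starts", n=10):
    """Return up to n (station, count) pairs for one month, busiest first."""
    ranked = [(station, c[key]) for (m, station), c in tallies.items() if m == month]
    # Sort by descending count, breaking ties alphabetically by station name.
    return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))[:n]


# Made-up trips purely for illustration.
trips = [
    ("2019-07", "Station A", "Station B"),
    ("2019-07", "Station A", "Station C"),
    ("2019-07", "Station B", "Station A"),
    ("2019-08", "Station C", "Station A"),
]
tallies = station_popularity(trips)
print(top_stations(tallies, "2019-07", key="starts", n=2))
```

Reversing the sort order in `top_stations` would give the least-used stations instead.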
Finally, create your final presentation:
- Create a Tableau story that brings together the visualizations, requested maps, and dashboards.
- This is what will be presented to the officials, so be sure to make it professional, logical, and visually appealing.
Remember, the people reading your analysis will NOT be data analysts. Your audience will be city officials, public administrators, and heads of New York City departments. Your data and analysis need to be presented in a way that is focused, concise, easy to understand, and visually compelling. Your visualizations should be colorful enough to be included in press releases, and your analysis should be thoughtful enough to inform programmatic changes.
Your final submission should include:
- A link to your Tableau Public workbook that includes:
  - 4-10 Total "Phenomenon" Visualizations
  - 2 Dashboards
  - 1 City Official Map
  - 1 Story
- A text or markdown file with your analysis of the phenomena you uncovered from the data.
Data Boot Camp © 2019. All Rights Reserved.