Analysis of the Cycle Share dataset using Big Data concepts - Hadoop MapReduce framework, Hive, Pig, and MapReduce design patterns
The dataset was downloaded from Kaggle - https://www.kaggle.com/pronto/cycle-share-dataset.
- Analysis 1 - Number of trips by month-year
- Analysis 2 - Min, Max and Average duration of trips from each station
- Analysis 3 - Total number of trips per station by year (MapReduce)
- Analysis 4 - Top 5 busiest stations by month
- Analysis 5 - Most active age groups
- Analysis 6 - Number of trips in a day from each station and the corresponding weather on that day
- Analysis 7 - Custom MapReduce algorithm to find the top 10 busiest routes
- Analysis 8 - Count of memberships by gender (MapReduce)
- Analysis 9 - Count of all the trips by station
- Analysis 10 - Top 5 busiest hours of the day
- Analysis 11 - Total number of trips that lasted more than 30 minutes (1800 seconds) from each station
- Analysis 12 - Number of trips in a day from each station and the corresponding weather on that day, using join patterns
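
As a rough illustration of the map, shuffle and reduce steps behind analyses such as 2 (min, max and average duration per station) and 7 (top 10 busiest routes), here is a minimal single-process sketch in Python. It is not the code used in this project, which runs on Hadoop, Hive and Pig. The column names (`from_station_id`, `to_station_id`, `tripduration`), the sample rows and all function names are assumptions made for the example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; the column names are assumed for illustration.
SAMPLE = """from_station_id,to_station_id,tripduration
A,B,600
A,C,2400
B,A,1200
A,B,300
B,A,1900
"""


def map_duration(row):
    # Mapper: key = start station, value = trip duration in seconds.
    yield row["from_station_id"], float(row["tripduration"])


def reduce_duration(station, durations):
    # Reducer: numerical summarization (min, max, average) per station.
    return station, (min(durations), max(durations),
                     sum(durations) / len(durations))


def map_route(row):
    # Mapper: key = (start station, end station), value = 1.
    yield (row["from_station_id"], row["to_station_id"]), 1


def reduce_count(key, ones):
    # Reducer: sum the 1s emitted for each key.
    return key, sum(ones)


def run_job(rows, mapper, reducer):
    # Simulated shuffle: group every mapper output value by its key,
    # then call the reducer once per key.
    groups = defaultdict(list)
    for row in rows:
        for key, value in mapper(row):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())


def top_n(counts, n):
    # Top-N step: sort by count (descending), break ties by key.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
station_stats = run_job(rows, map_duration, reduce_duration)
route_counts = run_job(rows, map_route, reduce_count)
busiest_routes = top_n(route_counts, 2)
```

With the sample rows above, `station_stats["A"]` is `(300.0, 2400.0, 1100.0)` and `busiest_routes` lists the routes A to B and B to A with 2 trips each. In a real Hadoop job the grouping done by `run_job` is handled by the framework's shuffle phase, and a top-N job would typically keep only the N largest entries in each mapper before a single reducer merges them.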