We use two simple tasks to show how to process data with Google Cloud Dataflow. Our solution extracts basic statistics from the data, which can serve as input to a statistical program for in-depth analysis.
See the resources/datasets folder for more information about the datasets.
- How many movies does each user rate?
- Is the movie app used more by females or males?
- Which gender watches more movies, males or females?
- Do males and females rate movies differently? Which gender gives higher ratings, and is the difference statistically significant?
- How can the display duration of the button shown in the app be optimized?
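
The first questions above reduce to per-key counts and per-group means. The following is a minimal local sketch in plain Python, not the pipeline code itself. The record layout `(user_id, gender, movie_id, rating)`, the sample values, and all function names are assumptions made for illustration; the real schema is described in resources/datasets. In an Apache Beam pipeline running on Dataflow, the same aggregations would typically be expressed with per-key combiners such as `beam.combiners.Count.PerKey()` and `beam.combiners.Mean.PerKey()`.

```python
from collections import Counter, defaultdict
from math import sqrt
from statistics import mean, variance

# Hypothetical sample records: (user_id, gender, movie_id, rating).
# The real field names and order may differ; see resources/datasets.
RATINGS = [
    (1, "F", 10, 5),
    (1, "F", 11, 4),
    (2, "M", 10, 3),
    (2, "M", 12, 4),
    (2, "M", 13, 2),
    (3, "F", 12, 5),
]


def ratings_per_user(records):
    """Count how many movies each user rated."""
    return Counter(user_id for user_id, _, _, _ in records)


def users_per_gender(records):
    """Count distinct users per gender."""
    users = defaultdict(set)
    for user_id, gender, _, _ in records:
        users[gender].add(user_id)
    return {gender: len(ids) for gender, ids in users.items()}


def ratings_by_gender(records):
    """Group the rating values by gender."""
    grouped = defaultdict(list)
    for _, gender, _, rating in records:
        grouped[gender].append(rating)
    return dict(grouped)


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances.

    Each sample needs at least two values, because the sample variance
    is undefined otherwise.
    """
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


if __name__ == "__main__":
    by_gender = ratings_by_gender(RATINGS)
    print("Ratings per user:", dict(ratings_per_user(RATINGS)))
    print("Users per gender:", users_per_gender(RATINGS))
    print("Mean rating per gender:", {g: mean(r) for g, r in by_gender.items()})
    print("Welch t (F vs. M):", welch_t(by_gender["F"], by_gender["M"]))
```

The t statistic alone does not answer the significance question. It still has to be compared against a t distribution, which is the kind of in-depth analysis better left to a statistical program fed with these summary statistics.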