This project uses a dataset from Kaggle (http://alaska.epfl.ch/~dockermoocs/bigdata/atussum.csv) to explore Spark DataFrames. The dataset contains information about how people spend their time (e.g., sleeping, eating, working).
### Goal: identify three groups of activities
- primary needs (sleeping and eating),
- work,
- other (leisure).

The analysis then addresses the following questions:
- How much time do we spend on primary needs compared to other activities?
- Do women and men spend the same amount of time working?
- Does the time spent on primary needs change as people get older?
- How much time do employed people spend on leisure compared to unemployed people?
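
As a rough illustration of the grouping step, the sketch below folds per-activity minutes into the three groups in plain Python, without Spark. The activity names, the `ACTIVITY_GROUPS` mapping, the `summarize` helper, and the sample values are all hypothetical and are not taken from the dataset; the real file may name and encode its columns differently.

```python
# Hypothetical mapping from activity name to group; anything not listed
# here is treated as "other" (leisure).
ACTIVITY_GROUPS = {
    "sleeping": "primary",
    "eating": "primary",
    "working": "work",
}


def summarize(minutes_by_activity):
    """Sum the minutes of each activity into primary / work / other totals."""
    totals = {"primary": 0.0, "work": 0.0, "other": 0.0}
    for activity, minutes in minutes_by_activity.items():
        group = ACTIVITY_GROUPS.get(activity, "other")
        totals[group] += minutes
    return totals


# Made-up record for one respondent (minutes per day), for illustration only.
sample = {"sleeping": 480, "eating": 60, "working": 420, "watching_tv": 120}
totals = summarize(sample)
print(totals)
```

With Spark, the same idea would presumably be expressed by summing the relevant columns into three new columns and then aggregating by sex, age, or employment status.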