
Class Project for Introduction to Statistical Reasoning and Data Science: Exploratory Statistical Analysis on Royal Canadian Yacht Club Dataset

• Background: We will work with a dataset of 1,000 randomly selected RCYC members. The variables contain basic information about the members and their use of RCYC facilities. The variables have been jittered (i.e., random noise has been added to them) to anonymize the data. This project aims to identify patterns in how RCYC members use the club's facilities.

• Outline: First, we will use a randomization test to study the difference in median dining spending between RCYC members who rented a dock and those who did not. Then, we will use linear regression to study the association between members' spending at RCYC bars and at RCYC restaurants. Finally, we will use a classification tree to predict whether a member used RCYC fitness facilities based on the member's sex and other spending at RCYC facilities (not counting restaurants and bars).
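
As a rough illustration of the first step in the outline, a randomization test for a difference in medians could be sketched as follows. This is only a minimal sketch under stated assumptions, not the project's actual analysis: the function names (median_diff, randomization_test), the number of repetitions, and the synthetic spending values are all hypothetical, and no real RCYC data is used.

```python
import random
import statistics


def median_diff(spend, rented):
    """Median spending of dock renters minus median spending of non-renters."""
    renters = [s for s, r in zip(spend, rented) if r]
    others = [s for s, r in zip(spend, rented) if not r]
    return statistics.median(renters) - statistics.median(others)


def randomization_test(spend, rented, n_reps=1000, seed=0):
    """Two-sided randomization test for a difference in group medians.

    Under the null hypothesis the dock-rental labels are unrelated to
    spending, so shuffling the labels simulates the null distribution
    of the test statistic.
    """
    rng = random.Random(seed)
    observed = median_diff(spend, rented)
    labels = list(rented)  # copy, so the caller's labels are not shuffled
    extreme = 0
    for _ in range(n_reps):
        rng.shuffle(labels)
        if abs(median_diff(spend, labels)) >= abs(observed):
            extreme += 1
    return observed, extreme / n_reps


# Synthetic stand-in data (assumption: NOT the RCYC dataset).
rng = random.Random(1)
rented = [rng.random() < 0.3 for _ in range(200)]
spend = [rng.gammavariate(2.0, 150.0) + (300.0 if r else 0.0) for r in rented]

observed, p_value = randomization_test(spend, rented)
print(f"observed difference in medians: {observed:.2f}")
print(f"estimated two-sided p-value: {p_value:.3f}")
```

The p-value here is the proportion of shuffled datasets whose median difference is at least as extreme as the observed one. The median is used rather than the mean because spending data tends to be right-skewed, which matches the outline's choice of test statistic.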