STAT GU4205/5205
The data comprise roughly 25,000 records for males between the ages of 18 and 70 who are full-time workers. A variety of variables are given for each subject: years of education and job experience, college graduate (yes, no), working in or near a city (yes, no), US region (midwest, northeast, south, west), commuting distance, number of employees in a company, and race (African American, Caucasian, Other). The response variable is weekly wages (in dollars). The data were collected many decades ago, so the wages are low compared to current levels.
A government official is interested in whether average male wages are statistically different across the three race classes. Specifically, the government official wants to answer the following research questions:
- Do African American males have statistically different wages compared to Caucasian males?
- Do African American males have statistically different wages compared to all other males?
The goal of this case study is twofold:
- Build a linear regression model that incorporates all relevant variables, interactions, and functional forms of the covariates.
- Using the final model, test the two research questions above.
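As a rough illustration of how the two research questions can be tested once a model is chosen, the sketch below fits an ordinary least squares model with dummy-coded race and tests linear contrasts of the coefficients. Everything in it is an assumption made for illustration: the data are synthetic (not the case-study data), the only covariate is years of education, the helper name `ols_contrast_test` is hypothetical, and the second question is read as African American versus the average of the other two groups, which is only one possible interpretation. The p-value uses a normal approximation to the t distribution, which is reasonable only when the sample is large.

```python
import math
import numpy as np

def ols_contrast_test(X, y, c):
    """Fit OLS and test H0: c'beta = 0. Returns (estimate, t statistic, p-value)."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y                 # least squares coefficients
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)         # unbiased error variance estimate
    est = float(c @ beta)
    se = math.sqrt(sigma2 * float(c @ XtX_inv @ c))
    t = est / se
    # Two-sided p-value via a normal approximation (assumes n - p is large)
    p_value = math.erfc(abs(t) / math.sqrt(2))
    return est, t, p_value

# Synthetic data for illustration only; these are NOT the case-study records.
rng = np.random.default_rng(0)
n = 3000
race = rng.choice(["afam", "cauc", "other"], size=n)
edu = rng.integers(8, 19, size=n).astype(float)

# Columns: intercept, education, Caucasian dummy, Other dummy.
# African American is the baseline category.
X = np.column_stack([np.ones(n), edu, race == "cauc", race == "other"]).astype(float)
true_beta = np.array([100.0, 30.0, 50.0, 0.0])   # assumed effects, illustration only
y = X @ true_beta + rng.normal(0.0, 100.0, size=n)

# Question 1: Caucasian minus African American (coefficient on the Caucasian dummy).
c1 = np.array([0.0, 0.0, 1.0, 0.0])
# Question 2 (one reading): average of Caucasian and Other minus African American.
c2 = np.array([0.0, 0.0, 0.5, 0.5])

for label, c in [("Q1", c1), ("Q2", c2)]:
    est, t, p_value = ols_contrast_test(X, y, c)
    print(f"{label}: estimate={est:.2f}, t={t:.2f}, p={p_value:.4g}")
```

In the actual analysis, the design matrix would hold the columns of the final model, including any transformations. If race enters an interaction, the race difference depends on the other covariates in that interaction, and the contrast vector must include the corresponding interaction coefficients.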