Name: Zun Yang
USF email address: zyang65@dons.usfca.edu
The PDF version of this assignment spec is available under Files on Canvas.
As seen in class, simple linear regression (SLR) places many requirements on the data we feed it. In this lab, your task is to recreate the visual work done in class to determine whether a dataset is appropriate for SLR, and to show that you can apply the technique to a previously unseen dataset.
There are two datasets for this lab, listed in Table 1. Use (only) the Predictors listed.
Dataset | Target | Predictors |
---|---|---|
Toluca dataset (courtesy of Paul Intrevado) | lotSize | workHours |
Credit dataset (from An Introduction to Statistical Learning with Applications in R) | Limit | Income, Rating, Cards, Age, Education |
Table 1: Datasets, Targets and Predictors
Use the starter code here and fill in your name and USF email address in the readme.
For each predictor, your implementation must:
- Plot the predictor against the target
- Plot the residuals against the target
- Determine the coefficients (slope, intercept) of the line fitted to the target
… and based on the above, you must determine whether the predictor is suitable for SLR.
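The three steps above can be sketched as follows. This is a minimal illustration, not the starter code: the helper names `fit_slr` and `plot_diagnostics` are hypothetical, the numbers are made-up toy values rather than rows from the Toluca or Credit datasets, and the fit is assumed to be ordinary least squares. The plotting helper assumes matplotlib is installed and imports it only when called.

```python
from statistics import mean


def fit_slr(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, residuals), where each residual is
    the observed y minus the fitted y.
    """
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return slope, intercept, residuals


def plot_diagnostics(x, y, slope, intercept, residuals, x_name, y_name):
    """Hypothetical helper: predictor-vs-target and residual-vs-target plots."""
    import matplotlib.pyplot as plt  # lazy import; assumes matplotlib is available

    fig, (ax_fit, ax_res) = plt.subplots(1, 2, figsize=(10, 4))

    # Left panel: predictor against target, with the fitted line overlaid.
    ax_fit.scatter(x, y)
    xs = sorted(x)
    ax_fit.plot(xs, [intercept + slope * xi for xi in xs], color="red")
    ax_fit.set_xlabel(x_name)
    ax_fit.set_ylabel(y_name)
    ax_fit.set_title(f"{y_name} vs. {x_name}")

    # Right panel: residuals against the target, with a zero reference line.
    ax_res.scatter(y, residuals)
    ax_res.axhline(0, color="gray", linestyle="--")
    ax_res.set_xlabel(y_name)
    ax_res.set_ylabel("residual")
    ax_res.set_title(f"Residuals vs. {y_name}")

    fig.tight_layout()
    return fig


# Toy values for illustration only (not taken from either dataset).
predictor = [1.0, 2.0, 3.0, 4.0, 5.0]
target = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept, residuals = fit_slr(predictor, target)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

To view the plots, one could call `plot_diagnostics(predictor, target, slope, intercept, residuals, "predictor", "target")` and then `matplotlib.pyplot.show()`. A residual panel with no visible trend or funnel shape is one informal sign that the predictor may be suitable for SLR.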
Grades for this assignment will be determined by the grader as follows:
- 100% = Code functions, is well-documented and clearly shows the relationships of all predictors to targets and to residuals.
- 75% = Code functions but is not well-documented or does not clearly show the relationships of all predictors to targets and to residuals.
- 50% = Code functions but is not well-documented -AND- does not clearly show the relationships of all predictors to targets and to residuals.
- 0% = No submission / code does not function.