Activity 04 - Linear Regression Considerations

This activity is intended to be completed in one week: outside-of-class preparation work plus two 75-minute class meetings. On our Blackboard course site you were provided with items to read, watch, and do prior to attempting this activity. Do not proceed in this activity until you have, at a minimum:

  1. (Day 1 portion) Read ISL Sections 3.3.1 & 3.3.2.
  2. (Day 2 portion) Read ISL Section 3.3.3.

In this repository/directory, you should see the following items:

  • README-img - a folder containing images that I am embedding within this README.md file and other files. You do not need to do anything with this.
  • .gitignore - a file that is used to specify what Git can ignore when pushing to GitHub. You do not need to do anything with this.
  • README.md - the document you are currently reading.
  • day01-fitting - a folder that contains items for you to complete during the first 75-minute class meeting.

We will explore most of these items over this week. Before doing that, you will first make your own copy of this repository.

Check in

Do you want an interactive way to check your understanding outside of class? Remember that these tutorials are a good way to check your foundational understanding; they were created by Benjamin Baumer (associate professor at Smith College), in collaboration with the OpenIntro team and others. The following tutorials will provide you with an applied approach to our topics (reorganized to better correspond with our readings):

Days 1 & 2:

Note that this tutorial covers materials for this week and the past week.

Task 1: Forking & cloning

Forking

Read these directions first, then work through them. In this GitHub repo (i.e., my repo):

  1. Click on the Fork icon near the upper-right-hand corner. You will be taken to a Create a new fork screen.
  2. Verify that your GitHub username is selected under Owner and that the Repository name is activity04-regression-considerations with a green check mark (this verifies that you do not already have a GitHub repository with this name).
  3. You may provide a Description if you would like. This is a way to provide some additional, more descriptive, meta information related to the things you did. I like to provide a brief description of what happened.
  4. Verify that Copy the main branch only is selected.
  5. Click on the green Create fork button at the bottom of this page.

You should be taken to a copy of this repo that is in your GitHub account. That is, your page title should be username/activity04-regression-considerations, where username is replaced with your GitHub username. Directly below this, you will see the following message:

forked from gvsu-sta631/activity04-regression-considerations

You will complete the rest of this activity in your forked copy of the activity04-regression-considerations repo.

Cloning

Read these directions first, then work through them. Note that you will be switching between RStudio and your GitHub repo (that you previously forked).

  1. In RStudio, click on the RStudio Project icon (the icon below the Edit drop-down menu).
  2. Click on Version Control on the New Project Wizard pop-up.
  3. Click on Git and you should be on a “Clone Git Repository” page.
  4. Back in your activity04-regression-considerations GitHub repo, click on the green Code button near the top of the page.
  5. Verify that HTTPS is underlined in orange/red on the drop-down menu, then copy the URL provided.
  6. Back in RStudio, paste the URL in the “Repository URL” text field.
  7. The “Project directory name” text field should have automatically populated with activity04-regression-considerations. If yours did not (this is usually an issue on Macs),
    • Click back into the “Repository URL” text field.
    • Highlight any bit of this text (it does not seem to matter what or how much).
    • Press Ctrl/Cmd and the “Project directory name” should now have automatically populated with activity04-regression-considerations.
  8. In the “Create project as subdirectory of” field, click on Browse…. Create a New Folder called “STA 631”, then within this folder, create a New Folder called “Activities”, then click Choose. If you already have this, you can simply browse to “STA 631/Activities”, then click Choose. Note that I am forcing you to use my opinionated file system management style.
  9. Click on Create Project.

Your screen should refresh and the Files pane should show that you are in your activity04-regression-considerations folder, which has the same files and folders as your GitHub repo. If you are asked for your GitHub credentials, provide your GitHub username and your PAT (not your password).
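If you prefer a terminal, the cloning steps can be sketched as a short shell script. This is a minimal sketch, not part of the directions for this activity: `GITHUB_USER`, `PARENT_DIR`, and `RUN_CLONE` are hypothetical variable names chosen for illustration, `username` is a placeholder for your GitHub username, and the “STA 631/Activities” folder is assumed to sit in your home directory. By default the script only prints the URL and folder it would use; set `RUN_CLONE=1` to actually clone.

```shell
#!/bin/sh
# Sketch: clone your fork from the command line instead of the RStudio wizard.
# GITHUB_USER is a placeholder; replace "username" with your GitHub username.
GITHUB_USER="${GITHUB_USER:-username}"
REPO="activity04-regression-considerations"
# Folder layout from step 8 (assumed here to live in your home directory).
PARENT_DIR="${PARENT_DIR:-$HOME/STA 631/Activities}"

# The HTTPS URL you would otherwise copy from the green Code button.
REPO_URL="https://github.com/${GITHUB_USER}/${REPO}.git"
TARGET_DIR="${PARENT_DIR}/${REPO}"

echo "Repository URL:    ${REPO_URL}"
echo "Project directory: ${TARGET_DIR}"

# Only create folders and clone when explicitly requested.
if [ "${RUN_CLONE:-0}" = "1" ]; then
  mkdir -p "${PARENT_DIR}"
  git clone "${REPO_URL}" "${TARGET_DIR}"
fi
```

If you clone this way, you would still need to open the folder in RStudio as a project (for example, through the New Project wizard's Existing Directory option). Git may still ask for your GitHub username and PAT.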

Check in

Take a moment to reflect on what is possibly your second time doing this forking process.

  • This is now your third time doing this complete process in one “swoop”. How is it going?
  • What is easier?
  • What do you still need help remembering how to do?

We will use a different dataset for this activity. If you took STA 518 with me, you already have experience with this dataset - Professor evaluations and beauty. From the OpenIntro site:

The data are gathered from end of semester student evaluations for 463 courses taught by a sample of 94 professors from the University of Texas at Austin. In addition, six students rate the professors’ physical appearance. The result is a data frame where each row contains a different course and each column has information on the course and the professor who taught that course.

Read these directions first, then work through them.

  1. In your activity04-regression-considerations repo folder/directory, locate and click into the day01-fitting subfolder.
  2. In the day01-fitting subfolder, you will be greeted by a new README.md file. Do your best to complete the tasks/directions provided in this subfolder by 11:59 pm (EST) on Tue, Jan 31.
  3. Ask questions in class as you are working. If you need to finish this up outside of our class meetings, remember that you can use our Teams workspace (linked on Blackboard), and post questions/issues in the Muddy channel. If someone else already posted what you thought was muddy, add any clarification to their post and give them a “+ 1” 👍. Remember that this space is for conversations as well as posting questions. Read through your peers’ muddy posts and do your best to provide help.

The rest of this README document contains tasks/directions for the second class meeting of this week.

Attribution

This document is based on labs from OpenIntro.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.