This is the course website for MSAN 601: "Linear Regression Analysis" at the University of San Francisco. Assignments, lecture notes, and open-source code will all be available on this website.

MSAN 601 - Linear Regression Analysis

James D. Wilson

Email: jdwilson4@usfca.edu

Timeline: Wednesday, August 23rd - Wednesday, October 11th

Class Time: M, W: 10:00 - 11:50 AM; 1:15 - 3:05 PM in Howard Room 527

Office Hours: M, W: 3:30 - 4:30 PM in Howard 5th floor Agora

Grader: Anshika Srivastava (asrivastava3@dons.usfca.edu)

Textbooks

  • Applied Linear Regression Models, 4th Edition, by Kutner, Nachtsheim, and Neter (Required)
  • An Introduction to Statistical Learning (online)
  • The Elements of Statistical Learning (online)
  • Linear Models with R by Julian Faraway
  • Statistical Inference by Casella and Berger

Course Learning Outcomes

By the end of this course, you will be able to

  • Formulate and apply classical simple and multiple linear regression models
  • Formulate and test hypotheses and use models for both prediction and explanation
  • Use R to load and manipulate data, fit regression models, and generate various outputs like ANOVA tables, confidence intervals for parameters, and diagnostic assessments
  • Test whether the residuals of a fitted model conform to the assumptions that underlie classical regression
  • Identify and manage outliers and influential observations
  • Assess and address multicollinearity, heteroscedasticity, autocorrelation, non-normality, and model misspecification
  • Communicate the results of complete and well-reasoned regression analysis

Course Overview

This course provides the basic mathematical and computational techniques for making informed, data-driven decisions using regression models. We will implement the models using the R programming language. We will discuss the following topics:

  • Distributional Theory: the Normal, t, Chi-Squared, and F distributions
  • Statistical Inference: estimation, hypothesis tests, and confidence intervals
  • Simple and Multiple Linear Regression
  • Model Building and Variable Selection
  • Outlier Detection
  • Model Diagnostics: outliers, multicollinearity, non-normality, autocorrelation
  • Analysis of Variance (ANOVA)
  • Logistic Regression
  • Shrinkage Methods: the Lasso and Ridge Regression
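As a small illustration of the "Simple and Multiple Linear Regression" topic above, the sketch below fits a simple linear regression by closed-form least squares and computes the residuals that later diagnostics would examine. The course itself uses R; Python is used here only for illustration. The function names `fit_simple_ols` and `residuals` and the data values are hypothetical and are not taken from the course materials.

```python
# Illustrative sketch only: the course uses R, and the names and data
# below are hypothetical, not part of the course materials.
from statistics import mean


def fit_simple_ols(x, y):
    """Return (intercept, slope) that minimize the sum of squared residuals."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def residuals(x, y, intercept, slope):
    """Observed response minus fitted value for each observation."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Made-up data for demonstration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_simple_ols(x, y)
res = residuals(x, y, b0, b1)
print(f"intercept={b0:.3f}, slope={b1:.3f}")
```

With an intercept in the model, the least-squares residuals sum to zero (up to floating-point error), which makes a quick sanity check on any fit.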

Assessment

The focus of this course will be to provide you with the basic techniques available for making informed, data-driven decisions using the R programming language. This is not a statistics course, but it will give you the intuition to form hypotheses about complex questions through visualization, wrangling, manipulation, and exploration of data. The course will be graded based on the following components:

  • Assignments (30%): You will be assigned computational and theoretical homework assignments to be completed and turned in on Canvas.
  • Quizzes (20%): Each week you will be given a short quiz that tests the main lessons taught in class the previous week. Quizzes are given on Mondays at 9:00 AM.
  • Final Exam (30%): The final exam will be a comprehensive exam covering the main components of regression analysis.
  • Final Project (20%): The final project will be a computational case study that brings together the techniques learned throughout the semester. The description for this project will be provided towards the midpoint of the semester.

Quizzes

Homework

  • Homework 1. Due Thursday, September 7th at 9:00 AM on Canvas
  • Homework 2. Due Thursday, September 21st at 9:00 AM on Canvas
  • Homework 3. Due Wednesday, October 11th at 5:00 PM on Canvas

Final Case Study

Schedule

Introduction and Motivation

| Topic | Reading | Practice | In-Class Code |
|---|---|---|---|
| Intro and A Brief History of Data Science | Ch. 1 of Doing Data Science | Read this | |
| Overview of Machine Learning | Ch. 1 of ISL | | |
| Model Building from the Statistical Learning Perspective | Ch. 2 of ISL | | |

Model Fitting and Inference

| Topic | Reading | Practice | In-Class Code |
|---|---|---|---|
| Simple Linear Regression: Model and Estimation | Ch. 3 of ISL | Ch. 3.6.1 - 3.6.3 of ISL | Intro to Regression in R |
| Basics of Statistical Inference | | | |
| Tests, Confidence Intervals, and Prediction Intervals | Ch. 2 and 3 of Linear Models with R | | |
| Shrinkage Methods - Ridge and Lasso | Ch. 6 of ISL | | Penalized Regression in R |

Model Diagnostics

| Topic | Reading | Practice | In-Class Code |
|---|---|---|---|
| Influential Points and Outliers | Ch. 4.2 of Linear Models with R | | |
| Tests of Normality and Equal Variance | Ch. 4.1, 4.3 of Linear Models with R | | |
| Collinearity | Ch. 5.3 of Linear Models with R | | |
| Transformations | Ch. 7 of Linear Models with R | | |

Generalized Linear Models

| Topic | Reading | Practice | In-Class Code |
|---|---|---|---|
| Classification and Logistic Regression | Ch. 4.3 of ISL | | |

Additional Resources

Important Dates

  • Wednesday, August 23rd - First day of class
  • Monday, September 4th - Labor Day Holiday, no class
  • Wednesday, October 11th - Last day of class