
Regression Techniques in Machine Learning, covering the Regression Assumptions plus Simple and Multiple Linear Regression. Both theory and Python code are included.


Simple-Multiple-Regression-in-Python

01 LR Introduction (Theory)

  • Linear relationship between Input and Output

02 OLS, Simple & Multiple Regression (Theory)

  • Simple : One Independent Variable
  • Multiple : More than One Independent Variable
  • OLS :
    • Ordinary Least Squares
    • Sum of all [(Actual - Predicted)^2] = Total Squared Error (sketched below)
  • Steps to build a Regression Model
    1. Select all variables
    2. Stepwise Regression - Backward & Forward
    3. Model Score comparison
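
A minimal sketch of the OLS fit, using statsmodels on hypothetical toy data (the variable names and numbers are illustrative, not taken from the notebooks):

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical toy data: y depends linearly on one input x
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 50)
y = 3.0 + 2.0 * x + rng.normal(0, 1, 50)

# OLS picks the coefficients that minimize sum((Actual - Predicted)^2)
X = sm.add_constant(x)            # adds the intercept column
model = sm.OLS(y, X).fit()
print(model.params)               # [intercept, slope], close to [3, 2]
```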

03 Regression Assumptions (Theory)

  1. Linearity : X is linearly related to Y
  2. Constant Error Variance : Homoscedasticity
  3. Independent Error Terms : No Autocorrelation
  4. Normal Errors : Errors are Normally distributed
  5. No Multicollinearity : X variables are independent of each other
  6. Exogeneity : No Omitted Variable Bias

04 Residual Plot (Theory)

  • Plot of the Errors (Residuals) against the Fitted values, as sketched below
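
A minimal residual-plot sketch with matplotlib, on hypothetical data; a healthy plot shows points scattered randomly around zero:

```python
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = sm.add_constant(rng.uniform(0, 10, 100))
y = 3 + 2 * X[:, 1] + rng.normal(0, 1, 100)
results = sm.OLS(y, X).fit()

# Residual = Actual - Predicted, plotted against the fitted values
plt.scatter(results.fittedvalues, results.resid)
plt.axhline(0, color="red", linestyle="--")
plt.xlabel("Fitted values")
plt.ylabel("Residuals")
plt.show()
```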

05 Homoscedasticity & Heteroscedasticity (Theory)

  • Homoscedasticity : Same Variance
  • Heteroscedasticity : Different Variance
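
One standard check is the Breusch-Pagan test from statsmodels; a minimal sketch on hypothetical data:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(1)
X = sm.add_constant(rng.uniform(0, 10, 100))
y = 1 + 0.5 * X[:, 1] + rng.normal(0, 1, 100)
results = sm.OLS(y, X).fit()

# Null hypothesis: constant variance; a small p-value suggests heteroscedasticity
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(results.resid, X)
print(f"Breusch-Pagan p-value: {lm_pvalue:.3f}")
```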

06 Covariance & Correlation (Theory)

  • Covariance : Direction of a relationship between variables
  • Correlation : Strength & Direction of a relationship between variables
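
A quick NumPy illustration with made-up numbers:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Covariance: the sign gives the direction, but the scale depends on the units
print(np.cov(x, y)[0, 1])
# Correlation: direction AND strength, always between -1 and 1
print(np.corrcoef(x, y)[0, 1])
```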

07 Correlation & Causation (Theory)

08 Collinearity (Theory)

  • Why is Collinearity a Problem?
  • How to check for Collinearity
  • Multicollinearity
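
A common first check is the pairwise correlation matrix of the predictors; a sketch with hypothetical columns:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
x1 = rng.normal(size=100)
df = pd.DataFrame({
    "x1": x1,
    "x2": 0.9 * x1 + rng.normal(0, 0.1, 100),   # nearly a copy of x1 -> collinear
    "x3": rng.normal(size=100),                 # unrelated predictor
})

# Off-diagonal values near +/-1 signal collinearity between predictors
print(df.corr().round(2))
```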

09 Auto Correlation (Theory)

  • Similarity between observations as a function of time lag between them
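
Autocorrelation of residuals is commonly checked with the Durbin-Watson statistic; a minimal sketch on hypothetical data:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(3)
X = sm.add_constant(np.arange(100, dtype=float))
y = 1 + 0.3 * X[:, 1] + rng.normal(0, 1, 100)
resid = sm.OLS(y, X).fit().resid

# ~2 means no autocorrelation; values toward 0 (positive) or 4 (negative) are suspect
print(durbin_watson(resid))
```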

10 VIF (Theory)

  • Detects multicollinearity in Regression
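
A minimal VIF sketch using statsmodels, on hypothetical predictors (a VIF above roughly 5-10 is a common rule of thumb for trouble):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(4)
x1 = rng.normal(size=100)
X = pd.DataFrame({
    "x1": x1,
    "x2": x1 + rng.normal(0, 0.1, 100),   # highly collinear with x1 -> large VIF
    "x3": rng.normal(size=100),
})
X = sm.add_constant(X)

for i in range(1, X.shape[1]):            # skip the constant column
    print(X.columns[i], variance_inflation_factor(X.values, i))
```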

11 Regression Assumptions in Python (Code)

  • Step-by-step checks of the Regression Assumptions
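
The heteroscedasticity, autocorrelation, and VIF sketches above cover three of the assumptions; a minimal sketch of the normality check (Shapiro-Wilk test plus a Q-Q plot), again on hypothetical data:

```python
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(5)
X = sm.add_constant(rng.uniform(0, 10, 100))
y = 2 + 1.5 * X[:, 1] + rng.normal(0, 1, 100)
resid = sm.OLS(y, X).fit().resid

# Large Shapiro-Wilk p-value -> no evidence against normally distributed errors
print(stats.shapiro(resid))

# Q-Q plot: points close to the line support the normality assumption
sm.qqplot(resid, line="s")
plt.show()
```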

12 Metrics in Regression (Theory)

  • Assess Model Performance (all computed in the sketch after this list)
    1. Mean Absolute Error (MAE)
    2. Mean Square Error (MSE)
    3. Root Mean Square Error (RMSE)
    4. Mean Absolute Percentage Error (MAPE)
    5. Mean Percentage Error (MPE)
    6. R Square
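
A sketch of all six metrics with made-up actual and predicted values (MAPE and MPE written out by hand for clarity):

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([3.0, 5.0, 7.5, 10.0])   # hypothetical actual values
y_pred = np.array([2.8, 5.4, 7.0, 10.5])   # hypothetical predictions

mae  = mean_absolute_error(y_true, y_pred)
mse  = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)
mape = np.mean(np.abs((y_true - y_pred) / y_true)) * 100   # always positive
mpe  = np.mean((y_true - y_pred) / y_true) * 100           # keeps the sign
r2   = r2_score(y_true, y_pred)
print(mae, mse, rmse, mape, mpe, r2)
```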

13 Sum of Square & Adjusted R Square (Theory)

  • Total Variation = Explained Variation + Unexplained Variation
  • SST = SSR + SSE
  • SST = Total Sum of Squares = Sum of (Actual - Mean)^2
  • SSR = Regression Sum of Squares = Sum of (Predicted - Mean)^2 = explained variation
  • SSE = Sum of Squared Errors = Sum of (Actual - Predicted)^2 = unexplained variation
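
The identity can be verified numerically; a sketch fitting OLS on toy data (the decomposition holds exactly for an OLS fit with an intercept):

```python
import numpy as np
import statsmodels.api as sm

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])
y_pred = sm.OLS(y, sm.add_constant(x)).fit().fittedvalues

sst = np.sum((y - y.mean()) ** 2)        # total variation
ssr = np.sum((y_pred - y.mean()) ** 2)   # explained variation
sse = np.sum((y - y_pred) ** 2)          # unexplained variation
print(sst, ssr + sse)                    # the two numbers match
```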

14 R, R Square & Adjusted R Square (Theory)

  • R = the Correlation coefficient
  • R Square = SSR / SST
  • R Square is the share of variation explained by the Model
  • Adjusted R Square corrects R Square for the number of variables (see the sketch below)
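
A sketch of the adjusted R Square formula with hypothetical numbers (n observations, p predictors):

```python
# Hypothetical values: 50 observations, 3 predictors, R Square of 0.82
n, p, r_squared = 50, 3, 0.82

# Adjusted R Square only rises if a new variable improves the fit enough
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - p - 1)
print(adj_r_squared)   # ~0.808, slightly below the raw R Square
```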

15 Hypothesis Testing (Theory)

  • Evaluates two or more mutually exclusive statements
  • The Null Hypothesis is always neutral (no relationship between variables)
  • The Alternate Hypothesis claims an effect (there is a relationship between variables)

16 P Values (Theory)

  • Probability of seeing data at least this extreme if the Null Hypothesis is true
  • Statistical packages report a P Value for each estimated coefficient
  • A small P Value is evidence against the Null Hypothesis, in favour of the Alternate

17 Level of Significance (Theory)

  • Level of Significance, denoted alpha : the Probability of rejecting the Null Hypothesis when it is actually true
  • Confidence Level, denoted (1 - alpha) : the Probability of not rejecting a true Null Hypothesis
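
A sketch of the decision rule, comparing statsmodels coefficient P Values against alpha = 0.05 on hypothetical data:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
X = sm.add_constant(rng.uniform(0, 10, 60))
y = 1 + 0.8 * X[:, 1] + rng.normal(0, 2, 60)
results = sm.OLS(y, X).fit()

alpha = 0.05   # level of significance
for name, p in zip(["intercept", "slope"], results.pvalues):
    # p < alpha -> reject the Null Hypothesis that the coefficient is zero
    print(name, round(p, 4), "reject H0" if p < alpha else "fail to reject H0")
```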

18 Normal Distribution (Theory)

19 Confidence Interval & CLT (Theory)

20 Standard Error (Theory)

  • Measure of Uncertainty in the Sample Mean
  • Population Mean != Sample Mean
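
A sketch of the standard error of the mean on a made-up sample:

```python
import numpy as np

sample = np.array([4.8, 5.1, 5.5, 4.9, 5.3, 5.0])   # hypothetical sample

# SE of the mean = sample standard deviation / sqrt(n); shrinks as n grows
se = sample.std(ddof=1) / np.sqrt(len(sample))
print(se)
```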

21 AIC & BIC (Theory)

  • Akaike Information Criterion
  • Bayesian Information Criterion
  • Both penalize model complexity (extra parameters)
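
Both criteria are reported directly by statsmodels; a minimal sketch (lower values are better when comparing models on the same data):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
X = sm.add_constant(rng.uniform(0, 10, 80))
y = 2 + 1.2 * X[:, 1] + rng.normal(0, 1, 80)
results = sm.OLS(y, X).fit()

# BIC penalizes extra parameters more heavily than AIC
print(results.aic, results.bic)
```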

22 Regression Output Explained Part 1 (Theory)

  • Overall output is explained in depth

23 Regression Output Explained Part 2 (Theory)

  • Python statsmodels output explained in depth
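
The output in question is the statsmodels summary table; a minimal sketch that prints it for a hypothetical two-variable model:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
X = sm.add_constant(rng.uniform(0, 10, (100, 2)))
y = 1 + 2 * X[:, 1] - 0.5 * X[:, 2] + rng.normal(0, 1, 100)

# The table includes coefficients, standard errors, t-stats, P values,
# R Square, adjusted R Square, AIC/BIC, and the Durbin-Watson statistic
print(sm.OLS(y, X).fit().summary())
```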

24 Simple & Multiple Regression in Python (Code)

  • Step-by-step code for Simple & Multiple Regression
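
A minimal scikit-learn sketch of both cases, on hypothetical data (the notebook's own code may differ):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(9)

# Simple regression: one independent variable
X1 = rng.uniform(0, 10, (50, 1))
y1 = 4 + 3 * X1[:, 0] + rng.normal(0, 1, 50)
simple = LinearRegression().fit(X1, y1)
print(simple.intercept_, simple.coef_)    # close to 4 and [3]

# Multiple regression: several independent variables
X2 = rng.uniform(0, 10, (50, 3))
y2 = 1 + 2 * X2[:, 0] - X2[:, 1] + 0.5 * X2[:, 2] + rng.normal(0, 1, 50)
multiple = LinearRegression().fit(X2, y2)
print(multiple.intercept_, multiple.coef_)
```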

25 Interview Questions: Simple & Multiple (Theory)