Project-CaliforniaHousingDataset

This repository contains models built on the California Housing Dataset. The project builds Linear Regression models trained with Stochastic Gradient Descent under specific penalty terms and compares their accuracy with that of an ordinary Linear Regression model.

Primary language: Jupyter Notebook

b. Objectives:

  1. Read the data.
  2. Exploratory Data Analysis.
  3. Feature selection.
  4. Using K-fold cross-validation with K = 5, build 3 linear regression models and compare their performance:
  • First model: Linear Regression model trained with Stochastic Gradient Descent and an 'Elastic Net' penalty.
  • Second model: Linear Regression model trained with Stochastic Gradient Descent and a 'Ridge Regression' (L2) penalty.
  • Third model: Ordinary Linear Regression model.
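
The comparison in objective 4 could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the notebook's actual code: the helper name `compare_models`, the use of R² as the score, the feature standardization step, and the default `SGDRegressor` hyperparameters are all choices made here for illustration.

```python
from sklearn.linear_model import LinearRegression, SGDRegressor
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def compare_models(X, y, n_splits=5, random_state=0):
    """Return the mean K-fold R^2 score for each of the three models.

    Hypothetical helper: the name, the R^2 metric, and the scaling step
    are assumptions, not taken from the original notebook.
    """
    models = {
        # SGD is sensitive to feature scale, so features are standardized first.
        "sgd_elasticnet": make_pipeline(
            StandardScaler(),
            SGDRegressor(penalty="elasticnet", random_state=random_state),
        ),
        # penalty="l2" is the ridge-style penalty in SGDRegressor.
        "sgd_ridge": make_pipeline(
            StandardScaler(),
            SGDRegressor(penalty="l2", random_state=random_state),
        ),
        # Ordinary least squares, no penalty.
        "ols": LinearRegression(),
    }
    # The same shuffled folds are reused for every model so scores are comparable.
    cv = KFold(n_splits=n_splits, shuffle=True, random_state=random_state)
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="r2").mean()
        for name, model in models.items()
    }
```

With the dataset loaded through `fetch_california_housing(return_X_y=True)` from `sklearn.datasets`, the resulting `X` and `y` can be passed straight to `compare_models`. Placing the scaler inside the pipeline means it is refit on each training fold, which avoids leaking statistics from the held-out fold.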

c. Data: We use the California Housing Dataset available through sklearn.datasets, which was originally obtained from the StatLib repository. The variables are described as follows: