
Supplementary material for the Medium article Beyond linear regression: Leveraging linear regression for feature selection of continuous/categorical variables.

Primary language: Jupyter Notebook. License: MIT.

Beyond linear regression

Leveraging linear regression for feature selection of continuous/categorical variables

Overview

This repository contains supplementary material for the Medium article Beyond linear regression: Leveraging linear regression for feature selection of continuous/categorical variables.

It applies the feature selection technique introduced in the article to the Automobile Data Set. The objective is to find the top $K$ features that best explain the price of a car.
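To make the "top $K$ features" objective concrete, here is a minimal sketch that ranks features by the magnitude of their standardized least-squares coefficients. It is not the technique from the article or the code in the notebooks; the function `top_k_features`, the synthetic data, and the column names are all assumptions made up for illustration.

```python
import numpy as np


def top_k_features(X, y, feature_names, k):
    """Return the k features with the largest |standardized OLS coefficient|."""
    # Standardize the columns so coefficient magnitudes are comparable.
    X_std = (X - X.mean(axis=0)) / X.std(axis=0)
    # Centering the target removes the need for an intercept term.
    y_centered = y - y.mean()
    # Ordinary least-squares fit.
    coef, *_ = np.linalg.lstsq(X_std, y_centered, rcond=None)
    # Sort feature indices by decreasing absolute coefficient.
    order = np.argsort(-np.abs(coef))[:k]
    return [feature_names[i] for i in order]


# Synthetic data (hypothetical, NOT the Automobile Data Set):
# the target depends only on the first and third columns.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 5.0 * X[:, 0] - 3.0 * X[:, 2] + rng.normal(scale=0.1, size=200)

# Placeholder column names chosen for readability.
names = ["horsepower", "height", "curb_weight", "stroke"]
selected = top_k_features(X, y, names, k=2)
print(selected)
```

This plain ranking only handles continuous features and ignores correlations between them, so treat it as a baseline for intuition rather than a substitute for the approach in the notebooks.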

Project architecture

I adopted the cookiecutter Simple DS project template to structure this repository.

  • data holds the raw and processed data.
  • notebooks contains the notebooks for preprocessing, exploring the data, and performing feature selection.
  • py_scripts is a Python package containing the utilities used to produce the notebooks.

Get started

  1. Clone the repository to your local machine and cd into it:
git clone https://github.com/Badr-MOUFAD/supp-material-med-article
cd supp-material-med-article
  2. Create the conda environment and install py_scripts:
conda env create -f environment.yml
pip install -e .
  3. Run the notebooks in the notebooks folder in the specified order.

Useful links