Sklearn_Pipeline

Machine learning (ML) pipelines consist of several steps to train a model.

Primary language: Jupyter Notebook. License: MIT.

Machine Learning Pipeline

A machine learning pipeline typically consists of the following stages:

  • Data collection
  • Data cleaning
  • Feature extraction (labelling and dimensionality reduction)
  • Model validation
  • Visualisation

Data collection and cleaning are the primary tasks of any machine learning pipeline.
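
The stages listed above can be sketched with scikit-learn's `Pipeline` class. This is a minimal illustration, not the notebook's actual code: the synthetic data, the step names (`impute`, `scale`, `reduce`, `model`), and the choice of estimators are all assumptions made for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption for illustration, not the House Prices set).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([3.0, -2.0, 0.5, 0.0, 1.0]) + rng.normal(scale=0.1, size=200)

# Blank out roughly 5% of the values so the cleaning step has work to do.
X[rng.random(X.shape) < 0.05] = np.nan

# Each step maps to a stage from the list: cleaning, feature extraction, model.
pipeline = Pipeline(steps=[
    ("impute", SimpleImputer(strategy="median")),  # data cleaning
    ("scale", StandardScaler()),                   # put features on one scale
    ("reduce", PCA(n_components=4)),               # dimensionality reduction
    ("model", LinearRegression()),                 # estimator to validate
])

# Model validation: 5-fold cross-validation refits every step on each fold,
# so imputation and scaling statistics never leak from the held-out data.
scores = cross_val_score(pipeline, X, y, cv=5, scoring="r2")
print("Mean R^2 across folds:", scores.mean())
```

Wrapping the preprocessing and the estimator in one object means a single `fit` or `predict` call applies the steps in order, which is the main reason to use a pipeline rather than transforming the data by hand.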

Data

House Prices: Advanced Regression Techniques

File descriptions

  • train.csv - the training set

  • test.csv - the test set

  • data_description.txt - full description of each column, originally prepared by Dean De Cock but lightly edited to match the column names used here

  • sample_submission.csv - a benchmark submission from a linear regression on year and month of sale, lot square footage, and number of bedrooms

Data fields

Here's a brief version of what you'll find in the data description file.