feature-engine-examples

Jupyter notebooks with demos of Feature-engine's functionality

License: BSD-3-Clause (https://github.com/feature-engine/feature_engine/blob/master/LICENSE.md)
Sponsorship: https://www.trainindata.com/

Feature-engine is a Python library with multiple transformers to engineer and select features for use in machine learning models. Feature-engine's transformers follow scikit-learn's functionality with fit() and transform() methods to first learn the transforming parameters from data and then transform the data.
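To make the fit() and transform() contract concrete, here is a minimal sketch in plain Python. ToyMedianImputer and its imputer_dict_ attribute are hypothetical names used only for illustration; this is not Feature-engine's implementation, which works on pandas DataFrames. It only shows the two-step idea: parameters are learned once from the training data and then reused on any other data.

```python
from statistics import median


class ToyMedianImputer:
    """Toy stand-in (NOT part of Feature-engine) for the fit/transform pattern."""

    def __init__(self, variables):
        # Names of the variables whose missing values should be filled.
        self.variables = variables

    def fit(self, rows):
        # Learn one median per variable, ignoring missing values (None).
        self.imputer_dict_ = {
            var: median(r[var] for r in rows if r[var] is not None)
            for var in self.variables
        }
        return self

    def transform(self, rows):
        # Fill missing values with the learned medians.
        # New dicts are returned; the input rows are left unmodified.
        return [
            {
                key: (self.imputer_dict_[key]
                      if key in self.imputer_dict_ and value is None
                      else value)
                for key, value in row.items()
            }
            for row in rows
        ]


train = [{"age": 20}, {"age": 30}, {"age": None}, {"age": 50}]
test = [{"age": None}, {"age": 41}]

# fit() learns the parameters from the training data only ...
imputer = ToyMedianImputer(variables=["age"]).fit(train)

# ... and transform() applies those same parameters to any data set.
train_t = imputer.transform(train)
test_t = imputer.transform(test)
```

Because the median is learned in fit() and stored on the object, the test set is imputed with the training median rather than with a statistic computed from the test data itself.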

In this repo, you will find many examples of how to use Feature-engine's transformers on various datasets. The notebooks are organised into the following folders, each covering the transformers listed under it:

creation

  • MathematicalCombination
  • CombineWithReferenceFeature
  • CyclicalTransformer - notebook wanted, please contribute

discretisation

  • EqualFrequencyDiscretiser
  • EqualFrequencyDiscretiser plus WoEEncoder
  • EqualWidthDiscretiser
  • EqualWidthDiscretiser plus OrdinalEncoder
  • DecisionTreeDiscretiser
  • ArbitraryDiscretiser
  • ArbitraryDiscretiser plus MeanEncoder

encoding

  • OneHotEncoder
  • OrdinalEncoder
  • CountFrequencyEncoder
  • MeanEncoder
  • WoEEncoder
  • PRatioEncoder
  • RareLabelEncoder
  • DecisionTreeEncoder

imputation

  • MeanMedianImputer
  • RandomSampleImputer
  • EndTailImputer
  • AddMissingIndicator
  • CategoricalImputer
  • ArbitraryNumberImputer
  • DropMissingData - notebook wanted, please contribute

outliers

  • Winsorizer
  • ArbitraryOutlierCapper
  • OutlierTrimmer

pipelines

  • create new features - wine data
  • regression pipeline - house prices data
  • more notebooks wanted, please contribute

transformation

  • LogTransformer
  • LogCpTransformer
  • ReciprocalTransformer
  • PowerTransformer
  • BoxCoxTransformer
  • YeoJohnsonTransformer

wrappers

  • SklearnTransformerWrapper plus Scikit-learn's OneHotEncoder
  • SklearnTransformerWrapper plus Scikit-learn's feature selection classes
  • SklearnTransformerWrapper plus Scikit-learn's KBinsDiscretizer
  • SklearnTransformerWrapper plus Scikit-learn's Scalers
  • SklearnTransformerWrapper plus Scikit-learn's SimpleImputer

selection

  • notebooks wanted, please contribute

Contributing

We welcome notebooks from users of the package. If you would like to create one of the missing notebooks, or add a notebook of your own, make a pull request with the code. The data set used in the notebook must be free to share.

How to contribute:

Local Setup Steps

  • Fork the repo
  • Clone your fork to your local computer: git clone https://github.com/<YOURUSERNAME>/feature-engine-examples.git
  • cd into the repo: cd feature-engine-examples
  • If you haven't done so yet, install feature-engine: pip install feature_engine
  • Create a feature branch with a meaningful name for your notebook: git checkout -b mynotebookbranch
  • Develop your notebook
  • Commit the changes and push them to your fork: git add ., git commit -m "a meaningful commit message", git push origin mynotebookbranch:mynotebookbranch
  • Go to your fork on GitHub and open a PR to this repo
  • Done

Thank you!!

Feature-engine features in the following resources:

Blogs about Feature-engine:

Documentation

In Spanish: