Jupyter notebooks with demos of Feature-engine's functionality

Feature-engine is a Python library with multiple transformers to engineer and select features for use in machine learning models. Feature-engine's transformers follow the scikit-learn API: fit() learns the transformation parameters from the data, and transform() then applies them to the data.
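
To make the fit() / transform() split concrete, here is a minimal pure-Python sketch of the pattern. ToyMedianImputer is a hypothetical class written only for this illustration. It is not part of Feature-engine and does not show how the library is implemented. Real Feature-engine transformers work on pandas DataFrames, while this toy uses a list of dicts so it runs with the standard library alone.

```python
from statistics import median


class ToyMedianImputer:
    """Hypothetical stand-in that mimics the fit/transform pattern.

    Not Feature-engine code: it only illustrates learning parameters
    in fit() and applying them in transform().
    """

    def __init__(self, variables):
        self.variables = variables

    def fit(self, rows):
        # Learn one median per variable, ignoring missing values (None).
        self.imputer_dict_ = {
            var: median(row[var] for row in rows if row[var] is not None)
            for var in self.variables
        }
        return self

    def transform(self, rows):
        # Return new rows; missing values are replaced by the learned medians.
        return [
            {
                key: self.imputer_dict_[key]
                if key in self.imputer_dict_ and value is None
                else value
                for key, value in row.items()
            }
            for row in rows
        ]


train = [{"age": 20}, {"age": None}, {"age": 40}, {"age": 30}]
test = [{"age": None}, {"age": 55}]

# Parameters are learned from the training data only...
imputer = ToyMedianImputer(variables=["age"]).fit(train)
# ...and then reused to transform any other data.
test_imputed = imputer.transform(test)
```

The trailing underscore on imputer_dict_ follows the scikit-learn convention for attributes learned during fit(). Because the median comes from the training rows only, the same learned value is reused on new data, which is the point of keeping the two methods separate.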

In this repo, you will find many examples of how to use Feature-engine's transformers on various datasets. The notebooks are organized into the following folders and include examples for the following transformers:

creation

MathematicalCombination
CombineWithReferenceFeature
CyclicalTransformer - notebook wanted, please contribute

discretisation

EqualFrequencyDiscretiser
EqualFrequencyDiscretiser plus WoEEncoder
EqualWidthDiscretiser
EqualWidthDiscretiser plus OrdinalEncoder
DecisionTreeDiscretiser
ArbitraryDiscretiser
ArbitraryDiscretiser plus MeanEncoder

encoding

OneHotEncoder
OrdinalEncoder
CountFrequencyEncoder
MeanEncoder
WoEEncoder
PRatioEncoder
RareLabelEncoder
DecisionTreeEncoder

imputation

MeanMedianImputer
RandomSampleImputer
EndTailImputer
AddMissingIndicator
CategoricalImputer
ArbitraryNumberImputer
DropMissingData - notebook wanted, please contribute

outliers

Winsorizer
ArbitraryOutlierCapper
OutlierTrimmer

pipelines

create new features - wine data
regression pipeline - house prices data
more notebooks wanted, please contribute

transformation

LogTransformer
LogCpTransformer
ReciprocalTransformer
PowerTransformer
BoxCoxTransformer
YeoJohnsonTransformer

wrappers

SklearnTransformerWrapper plus Scikit-learn's OneHotEncoder
SklearnTransformerWrapper plus Scikit-learn's feature selection classes
SklearnTransformerWrapper plus Scikit-learn's KBinsDiscretizer
SklearnTransformerWrapper plus Scikit-learn's Scalers
SklearnTransformerWrapper plus Scikit-learn's SimpleImputer

selection

notebooks wanted, please contribute

Contributing

We welcome notebooks from users of the package. To create one of the missing notebooks, or to add a notebook of your own, make a pull request with the code. The dataset used must be free to share.

How to contribute:

Local Setup Steps

Fork the repo
Clone your fork to your local computer: git clone https://github.com/<YOURUSERNAME>/feature-engine-examples.git
cd into the repo: cd feature-engine-examples
If you haven't done so yet, install Feature-engine: pip install feature_engine
Create a feature branch with a meaningful name for your notebook: git checkout -b mynotebookbranch
Develop your notebook
Commit your changes and push the branch to your fork: git add ., git commit -m "a meaningful commit message", git push origin mynotebookbranch:mynotebookbranch
Go to your fork on GitHub and make a PR to this repo
Done
