/Entity_Embeddings

A tutorial on learning entity embeddings for categorical variables with neural networks, for use in tree-based algorithms.

Primary language: Jupyter Notebook

Random forest + Entity embeddings

Summary

A series of notebooks covering entity embeddings for categorical variables. Specifically, the notebooks show how to learn categorical embeddings by training a feed-forward DNN on high-performance, GPU-enabled hardware, and then reuse those embeddings to boost the performance of other algorithms, such as tree-based ensembles (a random forest here), which can be trained on more lightweight infrastructure.
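To make the idea concrete, here is a minimal sketch of the workflow, not the code from the notebooks. It assumes a single categorical column and a regression target, and it swaps the GPU-trained DNN for a tiny one-hidden-layer network written in plain NumPy with full-batch gradient descent. The function name `learn_embeddings`, the synthetic data, and all hyperparameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def learn_embeddings(cat_ids, y, n_categories, emb_dim=2, hidden=8,
                     lr=0.05, epochs=500, seed=0):
    """Learn one embedding vector per category with a tiny feed-forward net.

    Hypothetical helper: embedding lookup -> tanh hidden layer -> linear output,
    trained by full-batch gradient descent on half the mean squared error.
    """
    rng = np.random.default_rng(seed)
    E = rng.normal(0.0, 0.1, (n_categories, emb_dim))   # embedding table
    W1 = rng.normal(0.0, 0.5, (emb_dim, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.5, hidden)
    b2 = 0.0
    n = len(y)
    losses = []
    for _ in range(epochs):
        # Forward pass
        x = E[cat_ids]                      # (n, emb_dim)
        h = np.tanh(x @ W1 + b1)            # (n, hidden)
        pred = h @ w2 + b2                  # (n,)
        losses.append(0.5 * np.mean((pred - y) ** 2))

        # Backward pass
        err = (pred - y) / n                # d(loss)/d(pred)
        grad_w2 = h.T @ err
        grad_b2 = err.sum()
        dh = np.outer(err, w2) * (1.0 - h ** 2)
        grad_W1 = x.T @ dh
        grad_b1 = dh.sum(axis=0)
        dx = dh @ W1.T
        grad_E = np.zeros_like(E)
        np.add.at(grad_E, cat_ids, dx)      # accumulate gradients per category

        # Gradient descent step
        E -= lr * grad_E
        W1 -= lr * grad_W1
        b1 -= lr * grad_b1
        w2 -= lr * grad_w2
        b2 -= lr * grad_b2
    return E, losses


# Synthetic data (assumption): 4 categories, each with its own target mean.
rng = np.random.default_rng(0)
cat_ids = rng.integers(0, 4, size=200)
category_means = np.array([-1.0, -0.5, 0.5, 1.0])
y = category_means[cat_ids] + rng.normal(0.0, 0.1, size=200)

embeddings, losses = learn_embeddings(cat_ids, y, n_categories=4)

# Replace each category id with its learned vector to get dense features.
features = embeddings[cat_ids]              # shape (200, 2)
```

The resulting `features` matrix can then be passed, alongside any numeric columns, to a tree-based model such as scikit-learn's `RandomForestRegressor`, which needs no GPU to train. In practice the embedding table would more likely be read from the trained layer of a deep learning framework than learned by hand as above.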

NB

The embeddings were learnt previously in this notebook, as part of the ASHRAE competition on Kaggle.