Supplementary code for the paper "Neural Oblivious Decision Ensembles for Deep Learning on Tabular Data".
It learns deep ensembles of oblivious differentiable decision trees on tabular data.
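
To make "oblivious differentiable decision tree" concrete, here is a minimal pure-Python sketch. It is not the code from this repository: the function names, the sigmoid relaxation, and the plain averaging over trees are all assumptions chosen for illustration, and the paper's actual layer differs in its details. What the sketch does show is the defining property of an oblivious tree: every depth level uses a single (feature, threshold) pair, so a tree of depth d has exactly 2**d leaves, each addressed by the d-bit code of split outcomes.

```python
import math


def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_oblivious_tree(x, feature_ids, thresholds, leaf_values, temperature=1.0):
    """Evaluate one soft oblivious tree on a single feature vector x.

    feature_ids[i] and thresholds[i] define the one split shared by all
    nodes at depth i. leaf_values holds 2**depth numbers.
    (Illustrative stand-in: a sigmoid relaxes the hard test x[f] > b.)
    """
    depth = len(feature_ids)
    assert len(thresholds) == depth
    assert len(leaf_values) == 2 ** depth

    # Probability of taking the "right" branch at each depth level.
    p_right = [
        _sigmoid((x[f] - b) / temperature)
        for f, b in zip(feature_ids, thresholds)
    ]

    # Each leaf's weight is the product of branch probabilities on its path;
    # bit `level` of the leaf index says whether that level went right.
    output = 0.0
    for leaf, value in enumerate(leaf_values):
        weight = 1.0
        for level in range(depth):
            goes_right = (leaf >> level) & 1
            weight *= p_right[level] if goes_right else 1.0 - p_right[level]
        output += weight * value
    return output


def ensemble_predict(x, trees):
    """Average the outputs of several trees (a simplifying assumption)."""
    return sum(soft_oblivious_tree(x, **tree) for tree in trees) / len(trees)
```

Because every operation is smooth in the thresholds and leaf values, a structure like this can be trained by gradient descent; as `temperature` approaches zero it behaves like an ordinary hard oblivious tree.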
- A machine with a CPU (preferably 2+ free cores) and one or more GPUs
  - Running without a GPU is possible but takes 8-10x as long, even on high-end CPUs
  - Our implementation is memory-inefficient and may require a lot of GPU memory to converge
- A popular Linux x64 distribution
  - Tested on Ubuntu 16.04; should work fine on any popular 64-bit Linux and even macOS
  - Windows and 32-bit systems may require heavy wizardry to run
  - When in doubt, use Docker, preferably GPU-enabled (i.e. nvidia-docker)
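
The requirements above can be checked roughly before installing anything. The following is a small sketch, not part of this repository; the helper names `describe_machine`, `requirement_warnings`, and `gpu_available` are hypothetical. Note that `os.cpu_count()` reports all logical cores, not the free ones, so treat the result as an upper bound.

```python
import os
import platform


def describe_machine():
    """Collect the facts the requirements list asks about."""
    return {
        "cpu_cores": os.cpu_count(),    # logical cores; None if unknown
        "system": platform.system(),    # e.g. "Linux", "Darwin", "Windows"
        "machine": platform.machine(),  # e.g. "x86_64"
    }


def requirement_warnings(info):
    """Return human-readable warnings for a describe_machine() result."""
    warnings = []
    if (info["cpu_cores"] or 0) < 2:
        warnings.append("fewer than 2 CPU cores detected")
    if info["system"] == "Windows":
        warnings.append("Windows may need extra work to run")
    if info["machine"].lower() not in ("x86_64", "amd64"):
        warnings.append("architecture is not x64")
    return warnings


def gpu_available():
    """True/False if torch is installed, None if it is not installed yet."""
    try:
        import torch
    except ImportError:
        return None
    return torch.cuda.is_available()


if __name__ == "__main__":
    info = describe_machine()
    print(info)
    for warning in requirement_warnings(info):
        print("warning:", warning)
    print("GPU visible to torch:", gpu_available())
```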
- Clone or download this repo, then `cd` to its root directory.
- Grab or build a working Python environment. Anaconda works fine.
- Install the packages from `requirements.txt`
  - It is critical that you use torch >= 1.1, not 1.0 or earlier
  - You will also need Jupyter or some other way to work with .ipynb files
- Run Jupyter Notebook and open a notebook in `./notebooks/`
  - Before you run the first cell, change the `#` in `%env CUDA_VISIBLE_DEVICES=#` to the index of the GPU that you plan to use.
  - The notebook downloads data from Dropbox. You will need 1-5 GB of disk space, depending on the dataset.
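
Three of the steps above lend themselves to a quick pre-flight check: the torch version, the GPU index, and free disk space. The sketch below is an illustration under assumptions, not code from this repository. The helper names are hypothetical, and the version parser assumes a plain numeric `major.minor` prefix such as `1.1.0` or `1.13.1+cu117`.

```python
import os
import shutil


def version_tuple(version):
    """Turn '1.1.0' or '1.13.1+cu117' into (1, 1) or (1, 13)."""
    core = version.split("+")[0]  # drop a local build tag, if any
    return tuple(int(part) for part in core.split(".")[:2])


def torch_is_new_enough(version, minimum=(1, 1)):
    """Compare numerically, so that '1.10' counts as newer than '1.1'."""
    return version_tuple(version) >= minimum


def select_gpu(index):
    """Plain-Python counterpart of the notebook's %env line.

    Has to run before CUDA is initialized, i.e. before the first GPU call.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)


def free_disk_gb(path="."):
    """Free space on the filesystem that holds `path`, in gigabytes."""
    return shutil.disk_usage(path).free / 1e9


if __name__ == "__main__":
    select_gpu(0)  # assumption: GPU 0 is the one you want
    print("free disk space (GB):", round(free_disk_gb(), 1))
    try:
        import torch
        print("torch >= 1.1:", torch_is_new_enough(torch.__version__))
    except ImportError:
        print("torch is not installed yet")
```

Comparing version tuples rather than raw strings matters here, since a string comparison would rank `1.10` below `1.9`.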
We showcase two typical learning scenarios: classification and regression. Please consult the original paper for training details.