aia-uclouvain/pydl8.5

Recommended procedure for feature discretization

Closed this issue · 1 comments

Hi,

I am beginning to try out the DL8.5 algorithm. Since it only operates on 0/1 features, continuous features probably need to be binned and represented as dense one-hot vectors outside the library. Is this correct? If so, is there a recommended feature binning procedure that is known to work well with the technique?

Thanks

In case it could help someone: yes, continuous features should be binarized outside the library. Any approach that returns a 0/1 2D NumPy matrix is fine. For instance, you could use the OneHotEncoder class from scikit-learn's preprocessing module.
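
As an illustration, here is a minimal sketch of one way to do the binning and one-hot encoding in a single step. It is not a procedure recommended by the library. It assumes scikit-learn's KBinsDiscretizer with quantile bins, and the number of bins (4) and the random toy data are arbitrary choices. Note that OneHotEncoder on its own treats every distinct value as a separate category, so raw continuous columns usually need to be binned first.

```python
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer

# Toy continuous data (assumption for illustration): 100 samples, 3 features.
rng = np.random.default_rng(0)
X_cont = rng.normal(size=(100, 3))

# Bin each feature into 4 quantile bins and one-hot encode the bin index.
# "onehot-dense" returns a dense NumPy array instead of a sparse matrix.
binner = KBinsDiscretizer(n_bins=4, encode="onehot-dense", strategy="quantile")
X_bin = binner.fit_transform(X_cont).astype(np.int32)

# X_bin is a 0/1 2D matrix with one column per (feature, bin) pair.
print(X_bin.shape)
print(binner.bin_edges_[0])  # learned bin edges for the first feature
```

The resulting X_bin can then be passed to the classifier in place of the original features. The same fitted binner should be reused with transform() on test data so the bin edges stay consistent.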