Recommended procedure for feature discretization

Question

Closed this issue 2 years ago · 1 comments

Hi,

I am starting to try out the DL8.5 algorithm. Since it only operates on 0/1 features, continuous features presumably need to be binned and represented as dense one-hot vectors outside the library. Is this correct? If so, is there a recommended feature binning procedure that is known to work well with the technique?

Thanks

Answer 1 · 2023-01-20T17:04:19.000Z

In case this helps someone: continuous features should be binarized outside the library. Any approach that returns a 2D NumPy array of 0/1 values is fine. For instance, once the values are binned, you could use the OneHotEncoder class from scikit-learn's preprocessing module.
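
For anyone looking for a starting point, here is a minimal sketch of one way to do this with scikit-learn. It is not an official recommendation from the library. It swaps in KBinsDiscretizer with encode="onehot-dense", which bins and one-hot encodes in a single step, in place of a separate binning step followed by OneHotEncoder. The helper name binarize_continuous, the choice of 4 bins, the uniform strategy, and the random demo data are all assumptions made for illustration.

```python
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer


def binarize_continuous(X, n_bins=4):
    """Bin each continuous column, then one-hot encode the bins.

    Hypothetical helper for illustration; n_bins=4 is an arbitrary choice.
    Returns a dense 2D array of 0/1 integers and the fitted discretizer.
    """
    discretizer = KBinsDiscretizer(
        n_bins=n_bins,
        encode="onehot-dense",  # dense output, not a SciPy sparse matrix
        strategy="uniform",     # equal-width bins; "quantile" is another option
    )
    X_bin = discretizer.fit_transform(X)
    return X_bin.astype(np.int8), discretizer


# Demo data (assumption): 100 samples, 3 continuous features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))

X_bin, discretizer = binarize_continuous(X, n_bins=4)

# Each original feature becomes n_bins binary columns, and each row has
# exactly one active bin per original feature.
print(X_bin.shape)
print(np.unique(X_bin))
```

The fitted discretizer should be reused with transform() on test data so that train and test share the same bin edges. The resulting X_bin is the kind of 0/1 array described above. How many bins work well will depend on the dataset, and more bins mean more binary features for the tree search to consider.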