Faster alternative to GapEncoder
jeromedockes commented
Problem Description
For encoding text and high-cardinality categories, we currently have two options: MinHashEncoder, which only works when the downstream learner is based on decision trees, and GapEncoder, which gives high-quality representations but is very slow. It would be good to have something similar to the GapEncoder but faster, perhaps based on an SVD or scikit-learn's NMF.
Feature Description
An encoder that works similarly to GapEncoder but is faster, possibly at the cost of less interpretable topics or slightly reduced prediction performance.
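
A minimal sketch of what this could look like, assuming character n-gram TF-IDF features followed by scikit-learn's TruncatedSVD. The function name `make_fast_string_encoder`, the n-gram range, and the example strings are illustrative assumptions, not an existing API or a settled design:

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline


def make_fast_string_encoder(n_components=3, random_state=0):
    """Hypothetical helper: char n-gram TF-IDF followed by a truncated SVD."""
    return make_pipeline(
        # Character n-grams within word boundaries make the features
        # robust to typos and morphological variants of the same category.
        TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
        # The SVD works directly on the sparse TF-IDF matrix and
        # returns a dense, low-dimensional representation.
        TruncatedSVD(n_components=n_components, random_state=random_state),
    )


# Made-up high-cardinality category values, for illustration only.
categories = [
    "senior software engineer",
    "software engineer",
    "police officer",
    "police sergeant",
    "firefighter",
    "fire fighter II",
]

encoder = make_fast_string_encoder(n_components=3)
embeddings = encoder.fit_transform(categories)
print(embeddings.shape)
```

Replacing `TruncatedSVD` with `sklearn.decomposition.NMF` in the same pipeline would give non-negative components, which might keep some of the topic-like interpretability, probably at a higher fitting cost. Whether either variant gets close to GapEncoder's prediction performance would need benchmarking.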
jeromedockes commented
related: #139