skrub-data/skrub

Add a "DropSimilar" transformer

GaelVaroquaux opened this issue · 0 comments

Problem Description

Dropping very similar columns can decrease computational cost (memory usage and CPU time).

Feature Description

Once #984 lands, we will have the mechanics to detect this similarity. We should add a transformer that exposes it.

This should probably be tackled only after we have merged in the recipe, as it will be much nicer to demonstrate this functionality using the recipe.
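
To make the discussion concrete, here is a rough sketch of what such a transformer could look like. Only the name `DropSimilar` comes from this issue. The `threshold` parameter, the `to_drop_` attribute, and the use of absolute Pearson correlation on numeric columns are assumptions for illustration, standing in for whatever similarity measure #984 ends up providing.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class DropSimilar(BaseEstimator, TransformerMixin):
    """Hypothetical sketch: drop numeric columns that nearly duplicate an earlier one.

    Similarity is measured with absolute Pearson correlation, which is an
    assumption made for this example and not the mechanism from #984.
    """

    def __init__(self, threshold=0.95):
        self.threshold = threshold

    def fit(self, X, y=None):
        # Pairwise absolute correlation between numeric columns only.
        corr = X.select_dtypes("number").corr().abs()
        kept, dropped = [], []
        for col in corr.columns:
            # Drop a column if it is too similar to one we already kept.
            if any(corr.loc[col, k] > self.threshold for k in kept):
                dropped.append(col)
            else:
                kept.append(col)
        self.to_drop_ = dropped
        return self

    def transform(self, X):
        # Non-numeric columns are never in to_drop_, so they pass through.
        return X.drop(columns=self.to_drop_)


df = pd.DataFrame(
    {
        "a": [1.0, 2.0, 3.0, 4.0],
        "b": [2.0, 4.0, 6.0, 8.1],  # almost exactly 2 * "a"
        "c": [4.0, 1.0, 3.0, 2.0],
        "name": ["w", "x", "y", "z"],  # non-numeric, left untouched
    }
)
reduced = DropSimilar(threshold=0.95).fit_transform(df)
print(list(reduced.columns))
```

In this toy example `b` is dropped because it is nearly a multiple of `a`, while `c` and the string column are kept. Keeping the first column of each similar group is an arbitrary choice; the real transformer may want a different rule.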