Add project: miceforest
samFarrellDay opened this issue · 1 comments
Project details:
Missing data imputation is a widely used method for dealing with missing data in machine learning and statistical workflows. miceforest aims to provide extremely accurate imputations using lightgbm, while being as lightweight and fast as possible. This package can:
- Impute multiple datasets, so the user can perform Multiple Imputation by Chained Equations (MICE)
- Plot the imputed correlations, distributions, feature importance, and more
- Train models on 1 dataset, and impute a different dataset (useful for production environments)
- Be GPU accelerated through the lightgbm API.
- Impute data in place, so the dataset never has to be copied. Useful for huge datasets.
- Project Name: miceforest
- Github URL: https://github.com/AnotherSamWilson/miceforest
- Category: Other? New Category? See Additional Context.
- License: MIT
- Package Managers: pypi, conda
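For readers unfamiliar with the chained-equations idea mentioned above, here is a minimal sketch of it. This is not miceforest's API or implementation: it swaps lightgbm for a plain least-squares line, handles only two numeric columns, and produces a single deterministic imputation rather than multiple datasets. The names `fit_line` and `chained_impute` are hypothetical and exist only for this illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = a * x + b."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        # Constant predictor: fall back to predicting the mean of y.
        return 0.0, my
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def chained_impute(rows, iterations=5):
    """Fill None entries in a two-column table by chained regressions.

    Assumption: a toy stand-in for MICE, with a linear model in place
    of lightgbm and no random draws (so no *multiple* imputation).
    """
    missing = [[value is None for value in row] for row in rows]
    filled = [list(row) for row in rows]

    # Step 1: initialise every missing cell with its column mean.
    for col in (0, 1):
        col_mean = mean(r[col] for r in rows if r[col] is not None)
        for i, row in enumerate(filled):
            if missing[i][col]:
                row[col] = col_mean

    # Step 2: cycle over the columns, refitting a model on the rows where
    # the target was observed and re-predicting the cells that were missing.
    for _ in range(iterations):
        for target in (0, 1):
            other = 1 - target
            train = [
                (row[other], row[target])
                for i, row in enumerate(filled)
                if not missing[i][target]
            ]
            a, b = fit_line([x for x, _ in train], [y for _, y in train])
            for i, row in enumerate(filled):
                if missing[i][target]:
                    row[target] = a * row[other] + b
    return filled


# Toy data in which the second column is twice the first.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, None), (None, 8.0), (5.0, 10.0)]
completed = chained_impute(data)
```

Observed values are left untouched and only the cells that were originally `None` are overwritten on each pass. A real MICE run would add randomness to each prediction and repeat the whole procedure to obtain several completed datasets.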
Additional context:
I am wondering if Missing Data Imputation should be its own category. It is very often used in machine learning, especially predictive modeling. There is another missing data imputation project already on here, fancyimpute. What do you think?
Thanks for the suggestion. I added miceforest to the tabular data section. If more projects related to Missing Data Imputation come in, I will consider adding a new category for this.