scikit-learn-contrib/qolmat

[Audit] - MCAR: Explore the PKLM test

adriencrtr opened this issue · 0 comments

Why?

There are three missing-data mechanisms: MCAR (missing completely at random), MAR (missing at random) and MNAR (missing not at random). During imputer selection, we must choose a hole generator. Ideally, that generator should reproduce the kind of holes actually present in the dataset. The aim is to provide a tool that characterizes the missing-data mechanism in the user's data.
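To make the three mechanisms concrete, here is a minimal sketch that creates holes in the second column of a toy two-column dataset. It uses only the standard library; `make_holes` and its parameters are hypothetical names for illustration, not part of Qolmat's API, and the median-threshold rule is just one simple way to make missingness depend on a value.

```python
import random
import statistics


def make_holes(rows, mechanism, rate=0.3, seed=0):
    """Return one boolean per (x, y) row: True means y is hidden in that row.

    Hypothetical helper for illustration; assumes rate <= 0.5.
    """
    rng = random.Random(seed)
    x_med = statistics.median([x for x, _ in rows])
    y_med = statistics.median([y for _, y in rows])
    mask = []
    for x, y in rows:
        if mechanism == "MCAR":
            # Missingness ignores the data entirely.
            p = rate
        elif mechanism == "MAR":
            # Missingness depends only on the always-observed column x.
            p = 2 * rate if x > x_med else 0.0
        elif mechanism == "MNAR":
            # Missingness depends on the value being hidden.
            p = 2 * rate if y > y_med else 0.0
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        mask.append(rng.random() < p)
    return mask


# Toy data: two independent standard normal columns.
data_rng = random.Random(42)
rows = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(1000)]
masks = {m: make_holes(rows, m) for m in ("MCAR", "MAR", "MNAR")}
```

All three masks hide roughly the same share of values, so the overall missing rate alone cannot tell the mechanisms apart; that is why a dedicated test is needed.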

Current situation:

Qolmat has no dedicated tool for this today. Little's test is implemented, but it only handles quantitative features and has some limitations:

  • Low statistical power in high dimensions.
  • Does not detect the HC.

Improvement sought:

Implement a quick version of the PKLM test to confirm the performance published by its authors.
See the PKLM Test by Spohn et al.
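
As a baseline to compare a prototype against, here is a rough sketch of a much simpler permutation check. It is not the PKLM test and not a substitute for it: it only asks whether the mean of a fully observed column differs between rows where another column is missing and rows where it is present. The function name `permutation_pvalue` and its parameters are assumptions made for illustration.

```python
import random
import statistics


def permutation_pvalue(x, missing, n_perm=500, seed=0):
    """Permutation p-value for 'x has the same mean whether or not
    the other column is missing'. Both groups must be non-empty."""

    def stat(flags):
        hidden = [v for v, f in zip(x, flags) if f]
        shown = [v for v, f in zip(x, flags) if not f]
        return abs(statistics.fmean(hidden) - statistics.fmean(shown))

    rng = random.Random(seed)
    observed = stat(missing)
    flags = list(missing)
    exceed = 0
    for _ in range(n_perm):
        # Shuffling the flags breaks any link between x and missingness
        # while keeping the number of missing rows unchanged.
        rng.shuffle(flags)
        if stat(flags) >= observed:
            exceed += 1
    # The +1 terms keep the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)


x = list(range(200))
# Missingness driven by x (not MCAR): large x values lose their partner.
p_dependent = permutation_pvalue(x, [v >= 100 for v in x])
# Missingness unrelated to the size of x: every other row.
p_unrelated = permutation_pvalue(x, [v % 2 == 1 for v in x])
```

A small `p_dependent` suggests the holes are not MCAR, while a large `p_unrelated` gives no evidence against MCAR. Because this check only compares means of a single column, it shares the kind of blind spots listed above for Little's test.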