ncn-foreigners/nonprobsvy

TODO


Issues to consider for the current version of the package.

To develop (SHORT TERM):

  • add pop_totals / pop_means to the variable selection method
  • add the remaining estimation methods (gee / mle) after variable selection
  • add the bias minimization method to the basic usage of the package
  • speed up the variable selection algorithm by rewriting it in C++
  • structural changes, such as a uniform model.frame call for all models and a single function for models with and without variable selection
  • ability to call functions for several outcome variables, e.g. y1 + y2 + ... + yk ~ x1 + x2 + x3 + ... + xn
  • add trace/verbose tracking for variable selection and bootstrap algorithms
  • move the bias correction definition to another control function, or consider another way to define it
  • rename the functions in OutcomeMethods
  • fix BIC.nonprobsvy in summary
  • change how gee with h functions is defined in control_selection
  • add an error message for duplicated outcome variables in a formula
  • add an error message for malformed formulas
  • add propensity score adjustment using an xgboost model
  • add svrep (bootstrap weighting) support to the package
  • add div to variable selection models

To develop (LONG TERM):

  • variance for the DR estimator when MI estimation uses the NN algorithm
  • method to estimate means/medians/totals in subsets/groups (called on the nonprobsvy object)
  • variance for the MI estimator when MI uses PMM imputation

To fix:

  • weights for the non-probability sample: the estimation algorithm is unstable (overestimated propensity weights, or errors in the maxLik model)
  • variance for the DR and MI estimators (with NN)
Kertoo commented

add propensity score adjustment using xgboost model.

Fitting more complicated machine learning models (such as xgboost) is more involved than just calling lm/glm: one has to avoid overfitting, tune hyperparameters, and apply a plethora of minor tweaks. Maybe it would be better to create a method where the user supplies a fitted ML object instead of calling xgboost internally? This would make tuning the model much less tedious.
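A rough sketch in R of what such an interface could look like. Note this is hypothetical: the `nonprob()` call shape and the `propensity_scores` argument are illustrative assumptions, not the package's actual API. The idea is that the user fits and tunes the ML model entirely on their side, and the package only consumes the resulting propensity scores:

```r
# Hypothetical sketch only -- argument names are illustrative, not the real API.
library(xgboost)

# The user fits and tunes the model themselves on the combined sample,
# where the label indicates membership in the non-probability sample:
X <- model.matrix(~ x1 + x2, data = combined_sample)
fit <- xgboost(
  data = X,
  label = combined_sample$in_nonprob,
  objective = "binary:logistic",
  nrounds = 100
)

# ... and hands only the predicted propensity scores to the package,
# so no fitting/tuning logic has to live inside nonprobsvy:
result <- nonprob(
  outcome = y ~ x1 + x2,
  data = nonprob_sample,
  svydesign = prob_design,
  propensity_scores = predict(fit, newdata = X_nonprob)  # hypothetical argument
)
```

Accepting either a fitted object with a `predict()` method or a plain numeric vector of scores would keep the interface agnostic to the ML library used.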

tasks moved to #47