Implementation Process / Evolution
antoinecarme opened this issue · 12 comments
The goal of this issue is to implement SQL generation for the building blocks of any caret model:
- Base classification models (GLMxx, naive Bayes, decision trees, SVMs, neural nets)
- Regressions (almost the same list as above, except naive Bayes)
- Preprocessing: "center", "scale", "pca"
- Ensembles: Boosting, Bagging, Random Forests, XGBoost.

These lists are expected to cover the models most commonly used in a data scientist's daily work. The initial guess comes from the original caret paper (http://www.jstatsoft.org/article/view/v028i05/v28i05.pdf), page 9.
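To make "SQL generation" concrete, here is a minimal sketch (not this project's actual generator, whose code lives in the notebooks) of how a fitted linear model with "center" + "scale" preprocessing folded in could be translated into a single SQL expression. The table name `input_table` and the helper `model_to_sql` are illustrative assumptions; only base R is used.

```r
# Fit a plain linear model on centered/scaled predictors (base R only,
# standing in for a caret-trained model with preProcess = c("center", "scale")).
scaled  <- scale(mtcars[, c("wt", "hp")])
centers <- attr(scaled, "scaled:center")
scales  <- attr(scaled, "scaled:scale")
train   <- data.frame(mpg = mtcars$mpg, scaled)
fit     <- lm(mpg ~ wt + hp, data = train)

# Emit one SQL SELECT that applies (x - mean) / sd inline, then the
# linear combination with the fitted coefficients.
model_to_sql <- function(fit, centers, scales, table = "input_table") {
  coefs <- coef(fit)
  preds <- names(coefs)[-1]  # predictor column names
  parts <- sprintf("%.6f * ((%s - %.6f) / %.6f)",
                   coefs[preds], preds, centers[preds], scales[preds])
  sprintf("SELECT %.6f + %s AS prediction FROM %s",
          coefs[1], paste(parts, collapse = " + "), table)
}

sql <- model_to_sql(fit, centers, scales)
cat(sql, "\n")
```

Tree-based and ensemble models would instead map to nested `CASE WHEN` expressions (one per split, summed or averaged across trees), which is why each model family gets its own sub-issue below.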
Deliverables:
- Create a separate GitHub issue for each element of the four lists above.
- Implementation as Jupyter R notebooks.
- Tests following the process defined in issue #1.
- Keep track of the progress of these sub-issues in the comments of this issue.
Closed issues so far:
- Implementation Process - xgboost methods #9
- Implementation Process - rpart method #6
- Implementation Process - glmnet method #4
- naive_bayes method #5
- Implementation Process - nnet method #8
- Implementation Process - Data Preprocessing - center + scale method #10
- Implementation Process - Data Preprocessing - PCA method #11
- Implementation Process - svmRadial method (and other SVMxx) #7
- Implementation Process - Data Preprocessing - ICA method #12
- Implementation Process - Caret Pipeline Models #13
- Implementation Process - rf method #14
- Implementation Process - ctree method #15