Part 2. Regression
stemlock opened this issue · 1 comment
stemlock commented
- Imputation for Categorical Variables: np.unique() in the imputation section could be confusing compared to the previous output, where the NaNs are shown in a DataFrame. Maybe consider converting `cp_imp` back to a pandas DataFrame to show the difference between the two after imputation (a rough sketch of this is included after this list).
- Dummy Encoding: I believe dummy encoding can be done by passing "drop='first'" as an argument to the sklearn OneHotEncoder object, which should remove the need to create a separate DummyEncoding class (see the OneHotEncoder sketch after this list).
- ColumnTransformer: Spelling mistakes in "ColumntTransformer for Combined Preprocessing" opening description -> "ColumntTransformer" should be "ColumnTransformer", "differntially" should be "differentially"
- Transform the `test` Data: Spelling mistake after the data is saved -> "...everything else is just a matter of choosing your mdoel..." should be "model"
- GLM Ridge Regression: Spelling mistake in the opening description -> "Ridge regression takes a hyerparameter..." should be "hyperparameter"
- GLM Ridge Regression: "Leave One Out Cross Validation" (LOOCV) is not explained. A "see more" link might be useful (a minimal LOOCV sketch is also included after this list).
- Non-Linear Models: Might be helpful to include a quick explainer comparing linear vs. non-linear models and their pros/cons; currently they are introduced without explanation (a small comparison sketch follows this list).
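For the imputation point, a minimal sketch of the suggested DataFrame round-trip, assuming the notebook uses sklearn's SimpleImputer (the column names and data below are made up for illustration, not the notebook's actual `cp_imp`):

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Toy categorical data with missing values (illustrative only).
cp = pd.DataFrame({"color": ["red", np.nan, "blue", "red"],
                   "shape": ["round", "square", np.nan, "round"]})

# Impute missing categories with the most frequent value per column.
imputer = SimpleImputer(strategy="most_frequent")
cp_imp = imputer.fit_transform(cp)  # NumPy array, harder to compare by eye

# Wrap the imputed array back into a DataFrame so the before/after
# comparison displays in the same format as the pre-imputation output.
cp_imp_df = pd.DataFrame(cp_imp, columns=cp.columns, index=cp.index)
print(cp_imp_df)
```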
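For the dummy-encoding point, a quick sketch of the "drop='first'" behavior (the column name is illustrative; drop='first' is available in sklearn 0.21+ and get_feature_names_out in 1.0+):

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

df = pd.DataFrame({"color": ["red", "blue", "green", "red"]})

# drop='first' drops the first category of each feature, giving the same
# reference-level encoding a hand-rolled DummyEncoding class would produce.
enc = OneHotEncoder(drop="first")
dummies = enc.fit_transform(df[["color"]]).toarray()

print(enc.get_feature_names_out(["color"]))  # ['color_green' 'color_red']
print(dummies)
```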
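For the LOOCV point, a minimal illustration using sklearn's LeaveOneOut splitter, with Ridge and synthetic data as stand-ins for the notebook's model and dataset (note that RidgeCV also defaults to an efficient leave-one-out scheme when no cv is passed):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import LeaveOneOut, cross_val_score

# Synthetic regression data (stand-in for the notebook's dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=20)

# LOOCV: each of the n samples is held out once as the test set,
# so the model is refit n times on the remaining n - 1 samples.
scores = cross_val_score(Ridge(alpha=1.0), X, y,
                         cv=LeaveOneOut(),
                         scoring="neg_mean_squared_error")
print("LOOCV mean squared error:", -scores.mean())
```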
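For the linear vs. non-linear point, a small comparison sketch on synthetic data (a random forest is just a stand-in for whatever non-linear model the notebook introduces): the linear model underfits a curved relationship but stays interpretable, while the non-linear model captures the curvature at some cost in interpretability.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic data with a non-linear relationship between X and y.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=300)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The linear model can only fit a straight line here, so it underfits;
# the tree ensemble can follow the sine curve.
for name, model in [("Ridge (linear)", Ridge()),
                    ("Random forest (non-linear)",
                     RandomForestRegressor(random_state=0))]:
    model.fit(X_train, y_train)
    print(name, "test R^2:", round(r2_score(y_test, model.predict(X_test)), 3))
```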
pssachdeva commented
Closed by #39