This ongoing machine learning regression project is a practice project for applying what I have learned about AI. It compares models and runs hyperparameter searches relevant to the problem I am solving: finding the best model for estimating the maximum time a plant takes to grow to its maximum height, based on continuous variables.
It uses data that I collected and cleaned beforehand in a separate data pipeline.
The aim is to measure which machine learning models perform best on unseen data.
Target variable: max time to ultimate height.
Features: 'Full Sun', 'Sheltered', 'Generally pest free'
Each model predicts the time a plant takes to grow to its maximum height from the predictors above. The models compared are:
- Linear regression (baseline)
- Decision tree regressor
- K nearest neighbors regressor
- Ridge regression
Model | Hyperparameter | Training Score | Validation Score | R2 Score |
---|---|---|---|---|
Linear Regression (baseline) | - | 0.021 | -0.029 | -0.018 |
Decision Tree Regressor | criterion='mse' | 0.028 | -0.021 | -0.010 |
KNN | n_neighbors=4 | -0.20 | -0.184 | -0.188 |
Ridge Regression | alpha=0.1 | 0.021 | -0.029 | -0.018 |
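An R2 below zero means a model does worse on that data than always predicting the mean of the target, so a hyperparameter search is one way to check whether any setting lifts the scores above that level. The sketch below shows how such a search might look with scikit-learn's `GridSearchCV`. The candidate values, the 5-fold cross-validation and the synthetic data are all assumptions for illustration, not the settings used to produce the table above.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in data (an assumption, not the project's dataset).
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(300, 3)).astype(float)
y = 10 + 2 * X[:, 0] - X[:, 1] + rng.normal(0, 5, size=300)

# Hypothetical candidate values for each model's hyperparameter.
searches = {
    "KNN": (KNeighborsRegressor(), {"n_neighbors": [2, 4, 8, 16]}),
    "Ridge": (Ridge(), {"alpha": [0.01, 0.1, 1.0, 10.0]}),
}

best = {}
for name, (estimator, grid) in searches.items():
    # 5-fold cross-validation, scored with R2 on each held-out fold.
    search = GridSearchCV(estimator, grid, scoring="r2", cv=5)
    search.fit(X, y)
    best[name] = (search.best_params_, search.best_score_)
    print(f"{name}: best params={search.best_params_}, "
          f"mean CV R2={search.best_score_:.3f}")
```

If the best cross-validated score stays near or below zero for every setting, that would suggest the features carry little signal for the target, rather than the models being badly tuned.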