/crime_rate_by_regression_trees

Predicting crime rate using DecisionTreeRegressor(), analyzing the importance of specific features, and reducing the complexity of the model.

Primary language: Jupyter Notebook

Predicting crime rate by regression trees

The main task here was to build a simple decision tree model to predict crime rate using DecisionTreeRegressor().
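As a minimal sketch of what fitting such a model looks like, the example below trains scikit-learn's DecisionTreeRegressor on synthetic data. The feature matrix, the target, and the random seed are assumptions made up for illustration; they are not the project's dataset.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption): the project's real dataset is not used here.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 3))            # three hypothetical numeric features
y = 2.0 * X[:, 0] + rng.normal(0, 1, size=200)   # hypothetical "crime rate" target

# Fit a regression tree with default settings (no depth limit).
model = DecisionTreeRegressor(random_state=0)
model.fit(X, y)

# Predict the target for the first five rows.
predictions = model.predict(X[:5])
print(predictions)
print("tree depth:", model.get_depth())
```

With default settings the tree keeps splitting until its leaves are pure, which is why limiting its complexity is a natural next step.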

Four regression models were built using the original and modified datasets, and the resulting trees were inspected. The complexity of the model was then reduced by varying the regressor's max_depth parameter.
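The effect of max_depth can be sketched as follows. The data is again synthetic, and the depth values and the train/test split are assumptions chosen for illustration, not the settings used in the notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption), not the project's dataset.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(300, 3))
y = 2.0 * X[:, 0] + rng.normal(0, 1, size=300)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Fit one tree per depth limit; None means the tree grows without a limit.
results = {}
for depth in (1, 2, 3, 5, None):
    tree = DecisionTreeRegressor(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    results[depth] = (
        tree.get_depth(),             # depth actually reached
        tree.get_n_leaves(),          # number of leaves, a proxy for complexity
        tree.score(X_test, y_test),   # R^2 on held-out data
    )

for depth, (actual_depth, leaves, r2) in results.items():
    print(f"max_depth={depth}: depth={actual_depth}, leaves={leaves}, test R^2={r2:.3f}")
```

A smaller max_depth yields a tree with fewer leaves that is easier to read; comparing the held-out scores shows how much accuracy, if any, is given up for that simplicity.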

The importance of specific features was also analyzed. An interesting pattern emerged when the region feature was split into four new features, one for each region: it made the algorithm's choice of geographic regions 1 and 4 clearer when making a prediction. The results could probably be improved with a larger dataset.
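One way to split a categorical region column into one feature per region is one-hot encoding with pandas.get_dummies, after which feature_importances_ reports a separate importance for each region. The sketch below is an illustration under assumptions: the column names "region", "income", and "crime_rate" are hypothetical, and the synthetic target is deliberately constructed to depend on regions 1 and 4 so that the effect is visible.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption); column names are hypothetical.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "region": rng.integers(1, 5, size=200),      # region codes 1..4
    "income": rng.uniform(20, 80, size=200),
})
# Target built on purpose to depend on regions 1 and 4 (illustration only).
df["crime_rate"] = (
    50.0
    + np.where(df["region"] == 1, 10.0, 0.0)
    - np.where(df["region"] == 4, 10.0, 0.0)
    + rng.normal(0, 1, size=200)
)

# Replace the single "region" column with one 0/1 column per region.
encoded = pd.get_dummies(df, columns=["region"], prefix="region", dtype=int)

X = encoded.drop(columns="crime_rate")
y = encoded["crime_rate"]
model = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)

# Importance of each feature, largest first; the values sum to 1.
importances = pd.Series(model.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)
print(importances)
```

With a single numeric region column the tree can only split on thresholds such as "region <= 2.5"; separate columns let it single out one region at a time, which makes the importances easier to interpret.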

Finally, the performance of the models was measured by generating and analyzing the learning curves.
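Learning curves can be generated with scikit-learn's learning_curve helper, as in the sketch below. The synthetic data, the 5-fold cross-validation, the max_depth value, and the R^2 scoring are assumptions for illustration; the notebook may have used different settings.

```python
import numpy as np
from sklearn.model_selection import learning_curve
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption), not the project's dataset.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(300, 3))
y = 2.0 * X[:, 0] + rng.normal(0, 1, size=300)

# Train on growing subsets of the data and score each with 5-fold cross-validation.
train_sizes, train_scores, val_scores = learning_curve(
    DecisionTreeRegressor(max_depth=3, random_state=0),
    X,
    y,
    cv=5,
    train_sizes=np.linspace(0.1, 1.0, 5),
    scoring="r2",
)

# Average over the folds to get one training and one validation score per size.
train_mean = train_scores.mean(axis=1)
val_mean = val_scores.mean(axis=1)
for n, tr, va in zip(train_sizes, train_mean, val_mean):
    print(f"n={n}: train R^2={tr:.3f}, validation R^2={va:.3f}")
```

Plotting train_mean and val_mean against train_sizes with matplotlib gives the usual learning-curve picture: a large, persistent gap between the two lines suggests overfitting, while two low lines that have converged suggest underfitting.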

All that said, as my first machine learning model, I think it was a very positive and exciting experience.

Libraries used in this project:

  • numpy
  • pandas
  • matplotlib
  • seaborn
  • sklearn
  • graphviz
  • pydotplus
  • io