This project is an entry for the Kaggle competition on predicting the sale price of bulldozers. The objective is to develop a machine learning model that accurately predicts a bulldozer's sale price from a set of input features and attributes.
The project uses the "Blue Book for Bulldozers" dataset, which contains historical auction results for bulldozers. Each record describes a bulldozer, including its model, age, and usage, together with its sale price.
We used the Random Forest algorithm to build a regression model that predicts the sale price of a bulldozer from its characteristics. We started by exploring and cleaning the dataset, then performed feature engineering to extract meaningful information from the data. We used cross-validation to evaluate the model's performance and tuned its hyperparameters to reduce its prediction error.
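As an illustration of the feature engineering step, the following is a minimal sketch of deriving numeric features from an auction date, which tree-based models such as Random Forest cannot use in raw form. The field names `saledate` and `year_made`, the helper `add_date_features`, and the sample record are assumptions made for this example; the project's own pipeline may use different column names and derive different features.

```python
from datetime import date


def add_date_features(row):
    """Return a copy of `row` with numeric features derived from its sale date.

    Assumes (hypothetically) that `row` is a dict with a `saledate` entry
    holding a datetime.date and a `year_made` entry holding an int.
    """
    sale_date = row["saledate"]
    enriched = dict(row)  # copy so the input record is left untouched
    enriched["sale_year"] = sale_date.year
    enriched["sale_month"] = sale_date.month
    enriched["sale_day_of_week"] = sale_date.weekday()  # Monday == 0
    enriched["sale_day_of_year"] = sale_date.timetuple().tm_yday
    # Age of the machine at the time of sale, in whole calendar years.
    enriched["age_at_sale"] = sale_date.year - row["year_made"]
    return enriched


# Made-up record, for illustration only.
example = {"saledate": date(2011, 6, 15), "year_made": 2004}
features = add_date_features(example)
print(features["sale_year"], features["sale_month"], features["age_at_sale"])
```

Splitting the date into parts lets the model pick up seasonal and long-term price trends, and an explicit age feature saves it from having to learn the subtraction itself.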
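Our final model achieved an RMSLE (Root Mean Squared Logarithmic Error) of around 0.25, which we consider a good result for this type of problem. Because RMSLE is computed on the logarithm of the prices, it measures relative rather than absolute error, so a given percentage miss counts about the same on a cheap machine as on an expensive one. We also analyzed the importance of the different features in our model and found that the age of the bulldozer and the model type were the most important factors in determining its sale price.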
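For reference, RMSLE is the square root of the mean squared difference between log(1 + predicted) and log(1 + actual). The sketch below computes it with the standard library only; the function name `rmsle` and the sample prices are assumptions for illustration and are not taken from the project's code or data.

```python
import math


def rmsle(actual, predicted):
    """Root Mean Squared Logarithmic Error between two equal-length sequences.

    Both sequences must contain non-negative numbers, as sale prices do.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_log_errors = [
        (math.log1p(p) - math.log1p(a)) ** 2  # log1p(x) == log(1 + x)
        for a, p in zip(actual, predicted)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


# Made-up prices, for illustration only.
actual_prices = [20000.0, 35000.0, 60000.0]
predicted_prices = [25000.0, 30000.0, 60000.0]
score = rmsle(actual_prices, predicted_prices)
print(f"RMSLE: {score:.3f}")
```

As a rough guide to reading the score, a log error of 0.25 on a single prediction corresponds to a price that is off by a factor of about e^0.25, or roughly 1.28.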