This project predicts whether a taxi passenger will leave a generous tip (≥ 20%). It demonstrates data preprocessing, feature engineering, and model evaluation in Python. The goal was to develop a machine learning model that helps taxi drivers anticipate which rides are likely to end in a generous tip, which could improve their overall revenue.
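As a rough illustration of the prediction target, the sketch below derives a binary "generous" label from a tip percentage. The column names (`fare_amount`, `tip_amount`), the toy values, and the choice to measure the tip against the fare are all assumptions made for this example; the project's actual data schema and tip-percentage definition may differ.

```python
import pandas as pd

# Hypothetical trip records; column names and values are illustrative only.
trips = pd.DataFrame({
    "fare_amount": [10.0, 25.0, 8.0, 40.0],
    "tip_amount": [2.5, 3.0, 2.0, 0.0],
})

# Tip as a fraction of the fare (an assumed definition of the tip percentage).
trips["tip_percent"] = trips["tip_amount"] / trips["fare_amount"]

# Binary target: 1 when the tip is at least 20% of the fare, otherwise 0.
trips["generous"] = (trips["tip_percent"] >= 0.20).astype(int)

print(trips)
```

Framing the problem as a yes/no label rather than a dollar amount is what makes it a classification task, which is why the results are reported as F1 scores.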
- Used Python for data analysis, preprocessing, and modeling.
- Engineered features to extract information relevant to predicting generous tips.
- Evaluated the model using key metrics like accuracy, precision, recall, and F1 score.
- Handled an imbalanced dataset and addressed ethical considerations in machine learning.
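The four evaluation metrics named above can be computed with `sklearn.metrics`. This is a minimal sketch on made-up labels and predictions, not the project's data, intended only to show how the metrics relate to each other.

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Toy ground truth and predictions (1 = generous tipper); values are invented.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 0]

metrics = {
    "accuracy": accuracy_score(y_true, y_pred),    # share of all predictions that are correct
    "precision": precision_score(y_true, y_pred),  # share of predicted generous tips that are real
    "recall": recall_score(y_true, y_pred),        # share of real generous tips that are found
    "f1": f1_score(y_true, y_pred),                # harmonic mean of precision and recall
}

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

On imbalanced data, accuracy alone can look good for a model that mostly predicts the majority class, so precision, recall, and F1 tend to be more informative.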
- Python
- Pandas
- Scikit-learn
- Matplotlib
- Seaborn
- Achieved an F1 score of 0.7136 on the validation set with the Random Forest model.
- Tested the Random Forest model on a separate held-out test set, achieving an F1 score of 0.7235.
- Compared performance with an XGBoost model, which achieved an F1 score of 0.6955 on the validation set.
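The workflow behind these numbers (compare candidates on a validation set, then score the chosen model once on a test set) might look roughly like the sketch below. It runs on synthetic data from `make_classification`, and it uses scikit-learn's `GradientBoostingClassifier` as a stand-in for XGBoost so that the example needs only scikit-learn. The split ratios and hyperparameters are assumptions, and the scores it prints are unrelated to the results reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data standing in for the taxi dataset.
X, y = make_classification(n_samples=600, n_features=10, random_state=0)

# Assumed split: 60% train, 20% validation, 20% test, stratified by class.
X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=0
)
X_val, X_test, y_val, y_test = train_test_split(
    X_hold, y_hold, test_size=0.5, stratify=y_hold, random_state=0
)

# Candidate models; gradient boosting here is a stand-in for XGBoost.
candidates = {
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Fit each candidate on the training set and score it on the validation set.
val_f1 = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    val_f1[name] = f1_score(y_val, model.predict(X_val))

# Pick the best validation performer and evaluate it once on the test set.
best_name = max(val_f1, key=val_f1.get)
test_f1 = f1_score(y_test, candidates[best_name].predict(X_test))

print(val_f1)
print(best_name, test_f1)
```

Keeping the test set out of model selection means the final F1 score is an estimate of performance on data the chosen model was never tuned against.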
- Collect more data to improve model performance.
- Experiment with different machine learning algorithms and hyperparameter tuning strategies.
- Explore additional features that could improve predictive performance.