Developed a machine learning model to predict whether taxi passengers 🚖 will give a generous tip (≥ 20%) 💸 based on trip and payment data. Utilized Python 🐍, Pandas 🐼, NumPy 🔢, Scikit-learn 🤖, and XGBoost 🚀 for data preprocessing, feature engineering, modeling, and evaluation.

Primary language: Jupyter Notebook

Automatidata-Taxi-Tip-Prediction-Model

Overview

This project predicts whether a taxi passenger will give a generous tip (≥ 20%). It covers data preprocessing, feature engineering, and model evaluation in Python. The goal was a binary classification model that helps taxi drivers anticipate which trips are likely to end in a generous tip, which could improve their overall revenue.
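As an illustration of the ≥ 20% target, here is a minimal sketch of how a trip could be labeled. The field names `tip_amount` and `total_amount` are assumptions, as is the choice to measure the tip against the pre-tip fare (`total_amount - tip_amount`); the notebook may define the percentage differently.

```python
def tip_fraction(tip_amount: float, total_amount: float) -> float:
    """Tip as a fraction of the pre-tip fare (assumed definition)."""
    pre_tip = total_amount - tip_amount
    if pre_tip <= 0:
        return 0.0  # no meaningful base to compute a percentage against
    return tip_amount / pre_tip


def is_generous(tip_amount: float, total_amount: float, threshold: float = 0.20) -> int:
    """Return 1 if the tip is at least `threshold` of the pre-tip fare, else 0."""
    return int(tip_fraction(tip_amount, total_amount) >= threshold)


# Hypothetical trips as (tip_amount, total_amount) pairs
trips = [(2.0, 12.0), (1.0, 11.0), (0.0, 8.5)]
labels = [is_generous(tip, total) for tip, total in trips]
print(labels)
```

Framing the problem as a 0/1 label rather than a dollar amount is what makes classification metrics such as F1 applicable.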

Key Features

  1. Utilized Python for data analysis, preprocessing, and modeling.
  2. Conducted feature engineering to extract relevant information for predicting generous tips.
  3. Evaluated the model using key metrics like accuracy, precision, recall, and F1 score.
  4. Handled class imbalance in the dataset and accounted for ethical considerations in how the model's predictions might be used.
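The four metrics listed above can all be derived from the confusion-matrix counts. The sketch below computes them by hand on made-up labels to show how they relate; the helper name `classification_metrics` is hypothetical, and in practice Scikit-learn's `accuracy_score`, `precision_score`, `recall_score`, and `f1_score` do the same job.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # generous, predicted generous
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # not generous, predicted generous
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # generous, predicted not generous
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # not generous, predicted not generous

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels for illustration only (1 = generous tip)
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

On imbalanced data, accuracy can look good even when the minority class is predicted poorly, which is one reason to report F1 alongside it.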

Technologies Used

  1. Python
  2. Pandas
  3. NumPy
  4. Scikit-learn
  5. XGBoost
  6. Matplotlib
  7. Seaborn

Results

  1. Achieved an F1 score of 0.7136 on the validation set with the Random Forest model.
  2. Tested the model on a separate test set, achieving an F1 score of 0.7235.
  3. Compared performance with an XGBoost model, which achieved an F1 score of 0.6955 on the validation set.

Future Improvements

  1. Collect more data to improve model performance.
  2. Experiment with different machine learning algorithms and hyperparameter tuning strategies.
  3. Explore additional features that could enhance prediction accuracy.
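One possible starting point for the hyperparameter tuning mentioned above is a cross-validated grid search scored on F1. This is a sketch under stated assumptions, not the notebook's actual configuration: the data is synthetic (generated with `make_classification`), and the parameter grid and `class_weight="balanced"` setting are illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic, mildly imbalanced stand-in for the taxi data (assumption)
X, y = make_classification(
    n_samples=400, n_features=8, weights=[0.7, 0.3], random_state=42
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Small illustrative grid; a real search would likely cover more values
param_grid = {"n_estimators": [25, 50], "max_depth": [3, None]}

search = GridSearchCV(
    RandomForestClassifier(class_weight="balanced", random_state=42),
    param_grid,
    scoring="f1",  # select the combination with the best cross-validated F1
    cv=3,
)
search.fit(X_train, y_train)

# Score the refit best estimator on the held-out validation split
val_f1 = f1_score(y_val, search.predict(X_val))
print(search.best_params_, round(val_f1, 4))
```

Swapping in a different estimator, for example `xgboost.XGBClassifier`, should only require changing the model and the grid keys, which makes side-by-side comparisons on the same validation split straightforward.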