pip install -r requirements.txt
Hotel Cancellation Prediction with Machine Learning
- Actionable insights for hotels
- Predict cancellations to:
  - Offer discounts and prevent revenue loss
  - Adjust staffing and optimize costs
- Most reservations are made in July and August, while the fewest are made at the start and end of the year.
- City Hotels have more monthly bookings and overall bookings than Resort Hotels.
- Both hotels have the fewest guests during the winter.
- The number of guests from Portugal is significantly higher than from any other country.
- Portugal, Great Britain and France account for 50% of the guests.
- Prices at the Resort Hotel are much higher during the summer, while City Hotel prices vary less.
- City hotels generate higher revenues compared to resort hotels across all room types.
- Room type A is the most profitable for both city and resort hotels.
- Cancellations cause a large loss of revenue over the years for both city and resort hotels, and the loss is significantly larger for city hotels.
- Get the percentage of missing values in each column.
- Drop the `agent` and `company` columns.
- Drop the rows
- The `reservation_status` and `reservation_status_date` columns should be dropped because they record when the booking was canceled or when the customer checked out, which is known only after the outcome and would leak the target.
- Encoding Categorical Columns
- Discretizing Numerical Columns
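
The cleaning and feature-engineering steps above can be sketched with pandas. This is a minimal illustration on a toy frame, not the project's actual code: only `agent`, `company`, `reservation_status`, and `reservation_status_date` come from the text. The other column names (`hotel`, `lead_time`, `country`, `is_canceled`), the row-dropping rule, the one-hot encoding, and the bin edges are all assumptions.

```python
import pandas as pd

# Toy data. Columns other than agent, company, reservation_status and
# reservation_status_date are assumed for illustration.
df = pd.DataFrame({
    "hotel": ["City Hotel", "Resort Hotel", "City Hotel", "Resort Hotel"],
    "lead_time": [10, 120, 45, 300],
    "country": ["PRT", "GBR", None, "FRA"],
    "agent": [9.0, None, None, 240.0],
    "company": [None, None, None, 40.0],
    "reservation_status": ["Check-Out", "Canceled", "Check-Out", "Canceled"],
    "reservation_status_date": ["2017-07-01", "2017-05-02", "2017-08-03", "2016-12-24"],
    "is_canceled": [0, 1, 0, 1],
})

# Percentage of missing values in each column.
missing_pct = df.isna().mean() * 100

# Drop the sparse columns and the columns that reveal the outcome.
df = df.drop(columns=["agent", "company",
                      "reservation_status", "reservation_status_date"])

# Assumed rule: drop any row that still has a missing value.
df = df.dropna().reset_index(drop=True)

# Assumed encoding: one-hot encode the categorical columns.
encoded = pd.get_dummies(df, columns=["hotel", "country"])

# Assumed bin edges: discretize a numerical column into labeled ranges.
encoded["lead_time_bin"] = pd.cut(
    df["lead_time"], bins=[0, 30, 180, 1000],
    labels=["short", "medium", "long"],
)

print(missing_pct.round(1))
print(encoded.head())
```

One-hot encoding is only one option here; ordinal or target encoding may suit high-cardinality columns such as the guest's country better.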
- Assumes continuous target variables and may not provide meaningful insights when applied to binary outcomes.
- Inappropriate for capturing the relationship between categorical predictors and binary targets, leading to ineffective feature selection.
Output: [(c0, 1), (c1, 1), (c0, 1), (c0, 1), (c1, 1)]
Output: [(c0, 3), (c1, 2)]
Output: [(c0, (f1, v1, 5)), (c0, (f1, v2, 10)), (c1, (f1, v1, 2))]
Output: [(c0, (f1, v1, 50)), (c0, (f1, v2, 45)), (c1, (f1, v1, 36))]
Output: [(c0, (f1, v1, 50, 100)), (c0, (f1, v2, 45, 100)), (c1, (f1, v1, 36, 80))]
Output: [(c0, (f1, v1, 50/100)), (c0, (f1, v2, 45/100)), (c1, (f1, v1, 36/80))]
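
The outputs above appear to trace a key-value counting pipeline: emit `(class, 1)` per record, sum per class, count each `(class, feature, value)` combination, then divide that count by the class total. The following single-machine sketch reproduces that flow under this reading. The function name `conditional_frequencies` and the record layout are hypothetical and are not taken from the project.

```python
from collections import Counter

def conditional_frequencies(records):
    """Return per-class relative frequencies of each (feature, value) pair.

    `records` is an iterable of (class_label, {feature: value}) pairs.
    This layout is an assumption made for the example.
    """
    class_counts = Counter()    # like [(c0, 3), (c1, 2)]
    triple_counts = Counter()   # like [(c0, (f1, v1, 50)), ...]
    for label, features in records:
        class_counts[label] += 1
        for feature, value in features.items():
            triple_counts[(label, feature, value)] += 1

    # Join each count with its class total and divide, like 50/100.
    freqs = {
        key: count / class_counts[key[0]]
        for key, count in triple_counts.items()
    }
    return freqs, dict(class_counts)

records = [
    ("c0", {"f1": "v1"}),
    ("c1", {"f1": "v1"}),
    ("c0", {"f1": "v2"}),
    ("c0", {"f1": "v1"}),
    ("c1", {"f1": "v1"}),
]
freqs, class_counts = conditional_frequencies(records)
print(class_counts)
print(freqs)
```

With these five toy records the class totals match the second output line, `c0` with 3 and `c1` with 2.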
- Ahmed Emad
- Hla Hany
- Yomna Osama
- Youssef Mohamed