

Since its founding, Airbnb has become a popular option for short-term stays. Airbnb listings are widely perceived as cheaper and more reasonably priced than hotels. However, many listings ask for a price well above the market average. What factors enable an Airbnb listing to be much more expensive than a branded hotel room? Through this machine learning project, we hope to identify the key contributors to the price of an Airbnb accommodation. Ultimately, we expect to build a model that can help hosts set the most profitable price and help travelers find the best place to stay within their budgets.

Dataset Description

The dataset we plan to use describes the characteristics of Airbnb listings across several European cities. The data is split across 20 .csv files covering 10 cities, with two files per city: one for weekday listings and one for weekend listings. Features include “cleanliness_rating,” “guest_satisfaction_overall,” “bedrooms,” and “metro_dist”; nearly all are attributes that can reasonably be expected to contribute to the price of a listing.
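Because the data arrives as separate per-city weekday and weekend files, a merged table needs the city and the day type recorded as explicit columns. The following is a minimal sketch of one way to do that with pandas. The file naming pattern "<city>_<day type>.csv", the function name load_listings, and the column names "city" and "day_type" are assumptions made for illustration and are not taken from the dataset itself.

```python
from pathlib import Path

import pandas as pd


def load_listings(data_dir):
    """Merge per-city weekday/weekend CSV files into one DataFrame.

    Assumption (hypothetical naming): each file is called
    '<city>_<day_type>.csv', e.g. 'paris_weekdays.csv'.
    """
    frames = []
    for path in sorted(Path(data_dir).glob("*_*.csv")):
        # Split on the last underscore so multi-word city names survive.
        city, day_type = path.stem.rsplit("_", 1)
        frame = pd.read_csv(path)
        # Record where each row came from so the merged table keeps
        # the city and weekday/weekend information.
        frame["city"] = city
        frame["day_type"] = day_type
        frames.append(frame)
    if not frames:
        raise FileNotFoundError(f"no '<city>_<day_type>.csv' files in {data_dir}")
    return pd.concat(frames, ignore_index=True)
```

Calling load_listings on the directory that holds the 20 files would return a single table with one row per listing, which can then be filtered by "city" or "day_type" to recover the separate datasets.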


In this project, we implemented prediction models using three different approaches: an ensemble learning algorithm (AdaBoost), a neural network model (MLP), and a linear regression model (elastic net). We trained each model on either the merged dataset or the separate datasets, and all of them performed relatively well, with the best R^2 value being around 0.6. We also found that “city”, “room type”, and “person capacity” are among the most important features. Our results are in line with the empirical evidence and can serve as a useful reference for future research on Airbnb and hotel pricing in Europe.
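To illustrate how the three model families can be compared on the same held-out R^2 metric, here is a hedged sketch using scikit-learn. It is not the project's actual pipeline: the hyperparameters, the 80/20 split, the function name compare_models, and the synthetic stand-in data are all assumptions chosen only to keep the example self-contained.

```python
import numpy as np
from sklearn.ensemble import AdaBoostRegressor
from sklearn.linear_model import ElasticNet
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def compare_models(X, y, seed=0):
    """Fit the three model families and return held-out R^2 per model."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    # Hyperparameters below are illustrative assumptions, not tuned values.
    models = {
        "adaboost": AdaBoostRegressor(random_state=seed),
        # The MLP and elastic net are scale-sensitive, so standardize first.
        "mlp": make_pipeline(
            StandardScaler(),
            MLPRegressor(hidden_layer_sizes=(32,), max_iter=500, random_state=seed),
        ),
        "elastic_net": make_pipeline(
            StandardScaler(), ElasticNet(alpha=0.1, l1_ratio=0.5)
        ),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = r2_score(y_test, model.predict(X_test))
    return scores


# Synthetic stand-in data (an assumption; the real features would come
# from the Airbnb listings after encoding categorical columns).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = X @ np.array([3.0, -2.0, 1.0, 0.5]) + rng.normal(scale=0.5, size=300)
scores = compare_models(X, y)
```

On the real data, categorical features such as the city and room type would need to be encoded numerically (for example with one-hot encoding) before being passed in as X.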