/airbnb-in-berlin-2020

A data mining project that analyses Airbnb's Berlin data for the year 2020 using the KDD process

Primary language: Jupyter Notebook

With a presence in more than 80,000 cities across the globe, Airbnb and its app play a significant role in tourism by bringing the latest technology to travel accommodation.

This project analyses the open data set published by insideairbnb.com to understand the aspects of Airbnb that have not only affected the business of popular hotels but also caused a stir among local property dealers and brokers.

Among European cities, Berlin holds a special position economically, historically and as a tourist destination. This makes it a prime focus for Airbnb. Therefore, I have focused on the Berlin data for the year 2020.

The knowledge discovery in databases (KDD) approach to data mining has been used in this project.

Research questions:

A. Is there seasonality in the prices of properties listed on Airbnb in Berlin?
B. Which areas of Berlin are popular among tourists?
C. An analysis of reviews using text mining
D. Which amenities are most commonly available in Berlin properties?
E. Can we predict the price of properties in Berlin by analysing other column values?
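
As a rough illustration of question A, the sketch below averages nightly prices per month, which is one simple way to look for seasonality. It is not the notebook's actual code. The column names (`listing_id`, `date`, `price`) and the `$`-prefixed price strings are assumptions about the Inside Airbnb calendar file, and the rows are made-up sample values, not real data.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Made-up sample rows in an assumed calendar-style layout (not real data).
sample_csv = """listing_id,date,price
1,2020-01-15,$50.00
2,2020-01-20,$40.00
1,2020-07-15,$70.00
2,2020-07-20,$80.00
"""

# Collect prices per calendar month (YYYY-MM).
prices_by_month = defaultdict(list)
for row in csv.DictReader(io.StringIO(sample_csv)):
    month = row["date"][:7]
    # Strip the currency symbol and thousands separators before converting.
    price = float(row["price"].replace("$", "").replace(",", ""))
    prices_by_month[month].append(price)

# Mean price per month, in chronological order.
monthly_mean = {m: mean(p) for m, p in sorted(prices_by_month.items())}

for month, avg in monthly_mean.items():
    print(f"{month}: {avg:.2f}")
```

Plotting these monthly means over the year would show whether prices rise in particular months.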

Metrics used:

R² score (coefficient of determination)
Mean absolute error (MAE)
Root mean squared error (RMSE)
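
To make the three metrics concrete, here is a minimal sketch that computes them from their standard definitions in plain Python. The function name `regression_metrics` and the sample prices are hypothetical and only serve as an illustration. The project itself may well rely on library implementations instead.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return R², MAE and RMSE for two equally long lists of numbers."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "r2": 1 - ss_res / ss_tot,  # share of variance explained
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
        "rmse": math.sqrt(ss_res / n),  # penalises large errors more than MAE
    }

# Hypothetical nightly prices and predictions, for illustration only.
actual = [50.0, 80.0, 120.0, 65.0]
predicted = [55.0, 75.0, 110.0, 70.0]
metrics = regression_metrics(actual, predicted)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

MAE and RMSE are expressed in the same unit as the price, while R² is unitless, which makes it easier to compare across models.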

Regression models used:

Linear regression
KNN
Decision tree
Random forest
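
The following sketch shows how the four model types listed above could be compared on the same train/test split with scikit-learn. It is an assumption-laden illustration, not the notebook's code: the features (`accommodates`, `bedrooms`, `reviews`), the synthetic price formula and all hyperparameters are invented here so that the example runs without the real data set.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for listing features (NOT real Inside Airbnb data).
rng = np.random.default_rng(0)
n = 400
accommodates = rng.integers(1, 7, size=n)
bedrooms = rng.integers(1, 4, size=n)
reviews = rng.integers(0, 200, size=n)
X = np.column_stack([accommodates, bedrooms, reviews])
# Invented price formula with noise, used only to have a target to predict.
y = 20 + 12 * accommodates + 15 * bedrooms + rng.normal(0, 5, size=n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hyperparameters are illustrative defaults, not tuned values.
models = {
    "Linear regression": LinearRegression(),
    "KNN": KNeighborsRegressor(n_neighbors=5),
    "Decision tree": DecisionTreeRegressor(random_state=0),
    "Random forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Fit each model and score it on the held-out test set.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "r2": r2_score(y_test, pred),
        "mae": mean_absolute_error(y_test, pred),
        "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
    }

for name, scores in results.items():
    print(f"{name}: R2={scores['r2']:.3f} "
          f"MAE={scores['mae']:.2f} RMSE={scores['rmse']:.2f}")
```

Scoring every model on the same held-out split keeps the comparison fair. On real listings the ranking may differ considerably from what this synthetic example shows.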