/host-characteristics-on-airbnb-prices

This research examines the effect of different host characteristics on Airbnb prices. We compare cities in countries with high and low inflation in recent years to see whether the effect of host characteristics differs between them. We also estimate the overall effect of host characteristics on prices.


The effect of host characteristics on Airbnb prices


Are host characteristics influencing the price?

Table of contents

1. Our project

2. Method

3. Results and interpretation

4. Repository overview

5. Running instructions

6. More resources

7. About

1. Our project

Short project description and research motivation

The Airbnb platform keeps growing and offers a large supply of accommodations. As a result, people look more critically at the features of a listing, such as the number of rooms, the price and the host characteristics. This research investigates the effect of host characteristics on Airbnb prices.

Given recent high inflation rates, it is interesting to compare the effect of host characteristics on price between cities with low inflation and cities with high inflation. Inflation is currently rising in many parts of the world, driven by economies recovering from the COVID recession of 2020 and by rising gas prices. This research uses two subsets of five cities each: a high inflation subset consisting of Rio de Janeiro, Mexico City, Boston, Cape Town and Santiago, and a low inflation subset consisting of Tokyo, Geneva, Beijing, Bangkok and Athens. These cities were selected based on inflation data from the past few years from The Global Economy (https://www.theglobaleconomy.com/rankings/inflation/). To what extent does inflation moderate the effect of host characteristics on Airbnb prices in these cities?

Research question

What is the effect of different host characteristics on Airbnb prices, moderated by high or low inflation?

Four subquestions:

  1. What is the effect of different host characteristics on Airbnb prices in cities with low inflation?
  2. What is the effect of different host characteristics on Airbnb prices in cities with high inflation?
  3. What is the difference between the effect of different host characteristics on Airbnb prices in cities with high inflation and cities with low inflation?
  4. What is the general effect of different host characteristics on Airbnb prices?

Conceptual model

[Figure: conceptual model]

2. Method

Datasets

We compare the effect of host characteristics on prices between cities with high inflation and cities with low inflation. To select the cities, we used an overview of inflation by country over the past years from The Global Economy (https://www.theglobaleconomy.com/rankings/inflation/). We matched these countries against the cities for which datasets were available and selected the following cities:

Cities with low inflation:

  1. Tokyo, Japan
  2. Geneva, Switzerland
  3. Beijing, China
  4. Bangkok, Thailand
  5. Athens, Greece

Cities with high inflation:

  1. Rio de Janeiro, Brazil
  2. Mexico City, Mexico
  3. Boston, United States
  4. Cape Town, South Africa
  5. Santiago, Chile

We combined these separate datasets into three larger datasets: one with all information about the low inflation cities, one with all information about the high inflation cities, and one general dataset with all information about all cities. The separate low inflation and high inflation datasets are used to compare the effect of host characteristics on Airbnb prices between the two groups. The general dataset is used to estimate the overall effect of host characteristics on Airbnb prices. These datasets are cleaned before they are used in the analysis.
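As an illustration of this combining step, here is a minimal base R sketch. The two small data frames are hypothetical stand-ins for per-city listing files, and the inflation_group column is an assumed name, not necessarily what merge_data.R uses.

```r
# Toy stand-ins for two per-city listing files (hypothetical values).
tokyo  <- data.frame(city = "Tokyo",  price_in_dollars = c(80, 120))
boston <- data.frame(city = "Boston", price_in_dollars = c(150, 210, 95))

# Label each subset with its inflation group (assumed column name).
low_inflation  <- transform(tokyo,  inflation_group = "low")
high_inflation <- transform(boston, inflation_group = "high")

# The general dataset stacks both subsets row-wise.
all_cities <- rbind(low_inflation, high_inflation)

# Number of listings per inflation group.
table(all_cities$inflation_group)
```

Keeping the group label as a column means the low inflation and high inflation subsets can always be recovered from the general dataset with a simple filter.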

Variables

In total, the datasets contain 75 variables. For this research, only the variables about host characteristics and Airbnb prices are relevant. The following variables are used in our analysis:

Variable name                      Variable explanation
price_in_dollars (Y)               Price of the Airbnb in dollars
host_years (X1)                    How many years the host has been active now
host_response_time_recoded (X2)    How fast the host responds rated from 1 to 4
host_response_rate_recoded (X3)    How often the host responds rated from 0 to 1
host_is_superhost (X4)             Dummy whether the host is a superhost
host_has_profile_pic (X5)          Dummy whether the host has a profile pic
host_identity_verified (X6)        Dummy whether the identity of the host is verified
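To make the recoded variables concrete, here is a hedged base R sketch of how such a recoding could look. It assumes the raw data stores the response rate as a percentage string (for example "85%") and the dummies as "t"/"f" flags; both formats, and the toy values, are assumptions rather than something this README specifies.

```r
# Hypothetical raw values; the real column formats may differ.
raw <- data.frame(
  host_response_rate = c("100%", "85%", "N/A"),
  host_is_superhost  = c("t", "f", "t"),
  stringsAsFactors   = FALSE
)

# "85%" -> 0.85; entries that are not percentages become NA.
rate_text <- sub("%$", "", raw$host_response_rate)
raw$host_response_rate_recoded <- suppressWarnings(as.numeric(rate_text)) / 100

# "t"/"f" -> 1/0 dummy.
raw$host_is_superhost <- as.integer(raw$host_is_superhost == "t")

raw
```

The actual recoding used in this project lives in the data preparation scripts; this sketch only shows the general idea of turning text fields into a 0 to 1 rate and a 0/1 dummy.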

Research method

This project uses Ordinary Least Squares (OLS) regression to examine the effect of different host characteristics on Airbnb prices in low and high inflation cities. The OLS estimates show whether the relationship between each host characteristic and price is positive or negative, and whether it is statistically significant. The dependent variable is the Airbnb price in dollars. The independent variables are given in the table above, denoted by X. The regression is as follows:

Y = b0 + b1X1 + b2X2 + b3X3 + b4X4 + b5X5 + b6X6

Here, host_is_superhost, host_has_profile_pic and host_identity_verified are dummy variables.
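The regression above can be sketched in base R with lm(). The data below is simulated purely for illustration (the coefficients used to generate it are arbitrary), and the inflation_group column is an assumed way of separating the two subsets; the project's own analysis is in analyze.R.

```r
set.seed(1)
n <- 200

# Simulated listings; every value here is made up for illustration.
listings <- data.frame(
  host_years                 = runif(n, 0, 10),
  host_response_time_recoded = sample(1:4, n, replace = TRUE),
  host_response_rate_recoded = runif(n),
  host_is_superhost          = rbinom(n, 1, 0.3),
  host_has_profile_pic       = rbinom(n, 1, 0.9),
  host_identity_verified     = rbinom(n, 1, 0.7),
  inflation_group            = sample(c("low", "high"), n, replace = TRUE)
)
listings$price_in_dollars <- 50 + 3 * listings$host_years +
  20 * listings$host_is_superhost + rnorm(n, sd = 15)

# Y = b0 + b1X1 + ... + b6X6, written as an R formula.
price_formula <- price_in_dollars ~ host_years + host_response_time_recoded +
  host_response_rate_recoded + host_is_superhost + host_has_profile_pic +
  host_identity_verified

# One regression per subset, plus one on all cities together.
model_low  <- lm(price_formula, data = subset(listings, inflation_group == "low"))
model_high <- lm(price_formula, data = subset(listings, inflation_group == "high"))
model_all  <- lm(price_formula, data = listings)

# Estimates, standard errors, t values and p values for the pooled model.
summary(model_all)$coefficients
```

Using one shared formula object for all three models keeps the specifications identical, so any difference between the outputs comes from the data subset and not from the model definition.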

3. Results and interpretation

To investigate the effect of host characteristics (independent variables) on Airbnb prices (dependent variable) in cities with low and high inflation, we ran a linear regression on the low inflation, high inflation and full datasets. The output of the regressions can be found below:

[Figure: regression output for the low inflation (1), high inflation (2) and all cities (3) datasets]

The regression output shows that several variables have a significant effect on price. More variables are significant in high inflation cities (regression 2) than in low inflation cities (regression 1): two out of six host characteristics have a significant effect on Airbnb prices in low inflation cities, while five out of six do in high inflation cities. In the regression of all cities together (regression 3), again five out of six variables have a significant effect on price. From these results we conclude that host characteristics do have an effect on Airbnb prices.

A more detailed analysis of these results can be found in the PDF in the gen folder.

4. Repository overview

Structure

├── README.md
├── data
├── gen
│   ├── analysis
│   ├── data-preparation
│   └── paper
├── src
│   ├── analysis
│   ├── data-preparation
│   └── paper
└── makefile

5. Running instructions

Software

For this research, downloading the data, cleaning the data and running the OLS regression were all done using R and RStudio. A makefile is provided to run all files in one go.

The following R packages were used. If you have not installed them yet, run install.packages() first. After that, load each package using the library() function:

library(tidyverse)
library(dplyr)
library(ggplot2)
library(readr)
library(stargazer)
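If you are unsure which of these packages are already installed, a small check along the following lines may help. This is only a sketch: it reports the missing packages and leaves the actual installation commented out, so nothing is installed unless you choose to.

```r
# Packages listed in this README.
required_packages <- c("tidyverse", "dplyr", "ggplot2", "readr", "stargazer")

# requireNamespace() returns FALSE for packages that are not installed.
is_installed <- vapply(required_packages, requireNamespace,
                       logical(1), quietly = TRUE)
missing_packages <- required_packages[!is_installed]

if (length(missing_packages) > 0) {
  message("Missing packages: ", paste(missing_packages, collapse = ", "))
  # Uncomment the next line to install them from CRAN:
  # install.packages(missing_packages)
} else {
  message("All required packages are installed.")
}
```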

Running the code

Using the makefile

The easiest option is to run the makefile. It runs each source file in the right order and produces the results of the analysis. You can run the makefile by following these steps:

  1. Fork this repository to your own GitHub account.
  2. Clone the forked repository to your local computer using Git in a terminal or command prompt. Navigate to the directory you want to clone the repository into and type:
git clone https://github.com/{your username}/host-characteristics-on-airbnb-prices.git
  3. Set your working directory to the cloned folder using cd host-characteristics-on-airbnb-prices
  4. Type make. This runs all the source code (it may take a while).
  5. The generated stargazer output with the regression results can then be found in your local folder at:
/host-characteristics-on-airbnb-prices/gen/analysis/output/model_report_airbnb.html

Running each source file separately

If you want to run each script separately, do so in the following order:

  1. download_data.R
  2. merge_data.R
  3. data_transformation.R
  4. clean_data.R
  5. analyze.R
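The order above can also be followed from a single R session. The sketch below is an assumption about how one might do that: script_dir is a hypothetical placeholder, since the scripts live in subfolders of src and the exact location of each file is not spelled out here. Nothing is run unless all five files are found.

```r
# Scripts in the order given above.
scripts <- c("download_data.R", "merge_data.R", "data_transformation.R",
             "clean_data.R", "analyze.R")

# Hypothetical location; adjust to where the scripts are on your machine.
script_dir <- "."
paths <- file.path(script_dir, scripts)

if (all(file.exists(paths))) {
  for (path in paths) {
    message("Running ", path)
    source(path)  # each script relies on the output of the previous one
  }
} else {
  message("Not all scripts were found in '", script_dir, "'; nothing was run.")
}
```

The makefile remains the recommended route, because it already encodes this order.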

6. More resources

The following website was used to decide which cities to include in the high and low inflation datasets: https://www.theglobaleconomy.com/rankings/inflation/

7. About

This project is conducted for the Data Preparation and Workflow Management course at Tilburg University. The members of our team are: