airbnb

Up in the Air{bnb}

Website: https://hayleyjellison.github.io/up-in-the-airbnb/

Group Members

In this project, we take data on Airbnb properties around Austin, Texas, and analyze basic property information (such as room types and fees by region) using Tableau, along with the text of the review comments users submitted. We clean the text by removing filler words such as "the", "of", and "and" from the reviews using PySpark and the John Snow Labs Spark NLP library. The text is then analyzed with pretrained pipelines for sentiment and for the most common n-grams. Visualizations show a map of sentiment values for Airbnb listings and a dashboard of the n-gram analysis by zip code. Finally, we trained an LSTM-based recurrent neural network (RNN) on the review text to generate AI-written reviews, and we display all of this on our website.
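The cleaning and n-gram steps described above can be sketched in a few lines. This is only an illustration in plain Python, not the project's PySpark and Spark NLP code: the stop-word list, the function names (`clean_tokens`, `ngrams`, `top_ngrams`), and the sample reviews are all made up for the example.

```python
from __future__ import annotations

import re
from collections import Counter

# Illustrative stop-word list (an assumption); the project's actual list is not given here.
STOP_WORDS = {"the", "of", "and", "a", "to", "in", "is", "was"}


def clean_tokens(review: str) -> list[str]:
    """Lowercase a review, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z']+", review.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def ngrams(tokens: list[str], n: int) -> list[tuple[str, ...]]:
    """Return every run of n consecutive tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams(reviews: list[str], n: int = 2, k: int = 3):
    """Count n-grams across all cleaned reviews and return the k most common."""
    counts: Counter = Counter()
    for review in reviews:
        counts.update(ngrams(clean_tokens(review), n))
    return counts.most_common(k)


# Made-up reviews, used only to exercise the functions.
sample_reviews = [
    "The location was great and the host was friendly.",
    "Great location, close to downtown Austin.",
    "The host was friendly and the place was clean.",
]

if __name__ == "__main__":
    print(top_ngrams(sample_reviews, n=2, k=3))
```

Grouping the reviews by zip code before calling `top_ngrams` would give per-zip-code counts of the kind the dashboard displays; in the project itself this work is done at scale with PySpark rather than in-memory Python.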

Tools we used:

  • John Snow Labs Spark NLP library
  • AWS
  • Tableau
  • JavaScript libraries (Leaflet, AnyChart)
  • HTML/CSS
  • Databricks
  • PySpark
  • Recurrent Neural Network (RNN)

Sample Screenshots of Website

What the website looks like:

[Screenshots: website-1, website-2, website-3, website-4, website-5, website-6]

Data Sources

References