Visualization of common machine learning regression algorithms on San Francisco AirBnB listing data.
See the application hosted here. For comparison, here is the original version of the web application.
This web application was built almost entirely in Python, using Dash, a Python framework for building analytical web applications atop React and Flask. Dash works particularly well with plotly, which was used to create the live visualizations.
The data was imported and analyzed using pandas, while the regression models were built with scikit-learn, a common Python machine learning library. Specifically, the application showcases the polynomial regression, k-nearest neighbors and support vector machine regression (SVR) models available in scikit-learn.
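As a rough illustration of the three model families named above, the sketch below fits each one with scikit-learn on synthetic latitude/longitude/price data. It is not the application's actual code: the data, the coordinate ranges, the hyperparameters (`degree=2`, `n_neighbors=5`, `C=100.0`) and the variable names are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for the listing data (NOT the real AirBnB dataset).
rng = np.random.default_rng(0)
n = 200
lat = rng.uniform(37.70, 37.81, n)      # assumed latitude range
lon = rng.uniform(-122.52, -122.36, n)  # assumed longitude range
X = np.column_stack([lat, lon])
# Made-up price trend plus noise, just so the models have something to fit.
price = 150 + 400 * (lat - 37.70) + 300 * (lon + 122.52) + rng.normal(0, 10, n)

# One pipeline per model family; features are standardized first because
# k-nearest neighbors and SVR are sensitive to feature scale.
models = {
    "polynomial": make_pipeline(
        StandardScaler(), PolynomialFeatures(degree=2), LinearRegression()
    ),
    "knn": make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5)),
    "svr": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100.0)),
}

# Predict the price at a single hypothetical location with each model.
new_property = np.array([[37.76, -122.44]])
predictions = {}
for name, model in models.items():
    model.fit(X, price)
    predictions[name] = float(model.predict(new_property)[0])

for name, value in predictions.items():
    print(f"{name}: {value:.1f}")
```

In scikit-learn, polynomial regression is usually expressed this way, as `PolynomialFeatures` followed by `LinearRegression`, rather than as a single estimator.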
To install and run this web application locally, follow the steps below
after downloading the repository. First, ensure that the correct
dependencies, which are listed in production/requirements.txt, are installed.
This can be done from the terminal with the following command.
pip install -r production/requirements.txt
After installing the dependencies, navigate to the app folder and start the
server by running app.py, as in the commands below. Once the server starts,
the application should be running locally: visit http://localhost:8050/ to
see it.
cd app
python app.py
The original version of the web application was built for the CapitalOne Software Engineering summit. As part of the challenge, we were required to build a web application that explored the following questions.
- Graph some (any 3) interesting metrics, maps, or trends from the dataset.
- Given the geo-location (latitude and longitude) of a new property, estimate the weekly average income the homeowner can make with Airbnb.
- Given the geo-location (latitude and longitude) of a property, what is the ideal price per night that will yield maximum bookings or revenue?
After taking a machine learning class, I decided that the data would be perfect for visualizing various machine learning regression algorithms since a 3D plot had a clear interpretation: it represents a price map over San Francisco.
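The "price map" idea can be sketched as follows: fit a regressor on (latitude, longitude) pairs, then evaluate it on a regular grid so that the predictions form a surface. This is only a minimal sketch under stated assumptions; the random placeholder prices, the bounding box, the grid resolution and the choice of k-nearest neighbors are all hypothetical and not taken from the application.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Placeholder training data (random, NOT real listings).
rng = np.random.default_rng(1)
X = np.column_stack([
    rng.uniform(37.70, 37.81, 100),      # assumed latitude range
    rng.uniform(-122.52, -122.36, 100),  # assumed longitude range
])
y = rng.uniform(50, 400, 100)            # placeholder nightly prices

model = KNeighborsRegressor(n_neighbors=5).fit(X, y)

# Regular grid over the assumed bounding box.
lat_axis = np.linspace(37.70, 37.81, 30)
lon_axis = np.linspace(-122.52, -122.36, 40)
lon_grid, lat_grid = np.meshgrid(lon_axis, lat_axis)

# Flatten the grid into (latitude, longitude) rows, predict, and reshape
# back so that price_surface[i, j] is the predicted price at
# (lat_axis[i], lon_axis[j]).
grid_points = np.column_stack([lat_grid.ravel(), lon_grid.ravel()])
price_surface = model.predict(grid_points).reshape(lat_grid.shape)

print(price_surface.shape)
```

An array shaped like `price_surface` is the kind of input a 3D surface trace expects, for example `plotly.graph_objects.Surface(x=lon_axis, y=lat_axis, z=price_surface)`.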