Udacity-Data-Scientist-Nanodegree-Assignment-1

Goal & Motivation
The goal of this repository is to analyse a dataset using the CRISP-DM methodology. The dataset used here is the Seattle AirBnB data from Kaggle.

In the notebook, we investigate three questions using the data: Are listings in some neighborhoods more expensive than in others? Are certain times of the year more expensive for visiting Seattle? Which features matter most for the price of a property at any given point in time?
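The first two questions come down to grouped averages of price. The following is a minimal sketch, not the notebook's actual code. It assumes prices are stored as strings such as "$1,250.00", that the listings table has a neighbourhood column (called `neighbourhood` here), and that the calendar table has `date` and `price` columns. The helper names and the tiny sample rows are hypothetical.

```python
import pandas as pd


def parse_price(series: pd.Series) -> pd.Series:
    """Turn strings like "$1,250.00" into floats; unparseable values become NaN."""
    cleaned = series.str.replace(r"[$,]", "", regex=True)
    return pd.to_numeric(cleaned, errors="coerce")


def mean_price_by_neighbourhood(listings: pd.DataFrame, col: str = "neighbourhood") -> pd.Series:
    """Average listing price per neighbourhood, most expensive first."""
    prices = parse_price(listings["price"])
    return prices.groupby(listings[col]).mean().sort_values(ascending=False)


def mean_price_by_month(calendar: pd.DataFrame) -> pd.Series:
    """Average nightly price per calendar month (1-12); missing prices are skipped."""
    months = pd.to_datetime(calendar["date"]).dt.month
    return parse_price(calendar["price"]).groupby(months).mean()


# Made-up rows for illustration only (assumed column names, not real data).
sample_listings = pd.DataFrame(
    {
        "neighbourhood": ["A", "A", "B"],
        "price": ["$100.00", "$200.00", "$1,000.00"],
    }
)
sample_calendar = pd.DataFrame(
    {
        "date": ["2016-01-04", "2016-01-05", "2016-07-04"],
        "price": ["$80.00", None, "$120.00"],
    }
)

by_neighbourhood = mean_price_by_neighbourhood(sample_listings)
by_month = mean_price_by_month(sample_calendar)
print(by_neighbourhood)
print(by_month)
```

With the real files, the same functions could be applied to the frames returned by `pd.read_csv` for listings.csv and calender.csv, after adjusting the column names to whatever the files actually contain.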

We clean the data to extract the relevant features from calender.csv and listings.csv, then build linear models (LinearRegression, Lasso) and a Random Forest model.

The Random Forest model did a good job of predicting the prices of the listings in the dataset.
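One way to compare the three model types is to fit each on the same training split and score it on held-out data. The sketch below is an illustration under assumptions rather than the notebook's procedure: the feature names (`accommodates`, `bedrooms`, `month`), the hyperparameters, the 70/30 split, and the synthetic data are all hypothetical, and the scores it prints say nothing about the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.model_selection import train_test_split


def compare_models(X: pd.DataFrame, y: pd.Series, seed: int = 0) -> dict:
    """Fit each model on a training split and return its R^2 on the held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )
    models = {
        "LinearRegression": LinearRegression(),
        "Lasso": Lasso(alpha=0.1),  # alpha chosen arbitrarily for the sketch
        "RandomForestRegressor": RandomForestRegressor(
            n_estimators=100, random_state=seed
        ),
    }
    return {
        name: model.fit(X_train, y_train).score(X_test, y_test)
        for name, model in models.items()
    }


# Synthetic stand-in for the cleaned feature table (not real AirBnB data).
rng = np.random.default_rng(0)
X = pd.DataFrame(
    {
        "accommodates": rng.integers(1, 9, size=300),
        "bedrooms": rng.integers(0, 5, size=300),
        "month": rng.integers(1, 13, size=300),
    }
)
# Price depends linearly on two features plus noise, by construction.
y = 40 + 25 * X["accommodates"] + 15 * X["bedrooms"] + rng.normal(0, 10, size=300)

scores = compare_models(X, y)
for name, r2 in scores.items():
    print(f"{name}: R^2 = {r2:.3f}")
```

Scoring on a held-out split rather than on the training rows matters most for the Random Forest, which can fit training data very closely and would otherwise look better than it is.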

Blog Post
A blog post describing the high-level insights is published here

Directory Layout

├── Seattle AirBnB Data Analysis.ipynb  # Main Analysis File
├── calender.csv                    # Data
├── listings.csv                    # Data
├── reviews.csv                     # Data
└── README.md

Libraries Used

numpy
pandas
matplotlib
seaborn
sklearn (RandomForestRegressor, LinearRegression, Lasso)

Acknowledgements
1. Udacity Data Scientist Nanodegree Program
2. Kaggle

Nirzaree/Udacity-Data-Scientist-Nanodegree-Assignment-1