All commands are assumed to be run from the repository root unless specified otherwise. Some plots do not render correctly in the GitHub web viewer, so it is recommended to view them in Visual Studio Code, where they render as intended.
(DataVic)
- Victoria Suburb Data
- Victoria Local Government Area (Local Cities/Councils) Data
- Victoria Train Stations Data: Metro, V/Line
Note: for items 2 and 3 above (Local Government Area and Train Stations data):
- Projection: Geographicals on GDA2020
- Buffer: no buffer
- Format: ESRI Shapefile
- Open Route Service: a local backend was set up to avoid the API's call limits. Follow here for a detailed description of how to set up and use the backend service. OSM data and the GitHub repo are required for set-up.
- Overpass API, Nominatim API (public, no API key required)
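
As an illustration of calling the public Nominatim service without an API key, here is a minimal sketch that uses only the Python standard library. It is not taken from the project's own notebooks: the function names, the `rental-project-example` User-Agent string, and the restriction to Australian results are assumptions made for the example. The public instance expects clients to identify themselves and to stay at or below one request per second, which is why the sketch sets a User-Agent header and pauses after each call.

```python
import json
import time
import urllib.parse
import urllib.request

# Public Nominatim search endpoint (no API key needed).
NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def build_search_url(address, country_code="au", limit=1):
    """Return a Nominatim search URL for a free-text address."""
    params = {
        "q": address,
        "format": "json",
        "limit": limit,
        "countrycodes": country_code,
    }
    return NOMINATIM_SEARCH + "?" + urllib.parse.urlencode(params)


def geocode(address, user_agent="rental-project-example", pause=1.0):
    """Return (lat, lon) as floats for the best match, or None if nothing matches.

    The user_agent default is a placeholder; replace it with something that
    identifies your own application.
    """
    request = urllib.request.Request(
        build_search_url(address), headers={"User-Agent": user_agent}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        results = json.load(response)
    time.sleep(pause)  # keep to at most one request per second
    if not results:
        return None
    return float(results[0]["lat"]), float(results[0]["lon"])
```

Calling `geocode("Flinders Street Station, Melbourne")` would send one request to the public server. For bulk lookups, a local or self-hosted service avoids the public rate limit.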
- Scraping, scraping_old_listing, getparks - Scrapes listing and external datasets.
- Preprocessing, addressConv, routeCalc, count, crime - Cleaning and feature engineering of listing and external datasets.
- Analysis, Distance Analysis, Ext_Analysis - Analysis of internal and external feature datasets.
- Prediction - Forecasting model for median rental price.
- Summary - Overview of the entire project.
- Please create an environment with Python 3.8 or Python 3.10 installed and install the needed libraries from requirements.txt.
- Please download `useragents.txt` (provided under the MIT License) as follows:
  ```
  cd data
  wget https://raw.githubusercontent.com/DavidWittman/requests-random-user-agent/master/requests_random_user_agent/useragents.txt
  cd ..
  ```
- To get the scraped properties, run the notebook `Scrape.ipynb` or run the script directly from the command line:
  ```
  cd scripts/ && python scrape.py
  ```
- Visualisation requires shapefiles from the Australian government:
  ```
  mkdir -p './data/raw/Suburb Shapes' && cd './data/raw/Suburb Shapes'
  wget https://data.gov.au/data/dataset/af33dd8c-0534-4e18-9245-fc64440f742e/resource/4494abe0-64ea-4fa6-931a-d1a389a14e57/download/vic_localities.zip -O temp.zip
  unzip -o temp.zip
  rm temp.zip
  pwd
  cd ../../..
  ```
- Some data was obtained and joined by hand.
- Data produced in the process can be found in Google Drive:
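
For the two download steps in the setup list (`useragents.txt` and the suburb shapefiles), `wget` and `unzip` may not be available on every system, Windows in particular. The following is a rough Python equivalent using only the standard library. The helper names `fetch` and `fetch_and_unzip` are hypothetical and not part of the repository. The sketch mirrors the shell commands: download, extract in place, then delete the temporary archive.

```python
import urllib.request
import zipfile
from pathlib import Path


def fetch(url, destination):
    """Download url to destination, creating parent folders as needed."""
    destination = Path(destination)
    destination.parent.mkdir(parents=True, exist_ok=True)
    urllib.request.urlretrieve(url, str(destination))
    return destination


def fetch_and_unzip(url, folder):
    """Download a zip archive into folder, extract it there, delete the archive.

    Returns the sorted names of the entries left in folder.
    """
    folder = Path(folder)
    archive = fetch(url, folder / "temp.zip")
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(folder)
    archive.unlink()  # same effect as `rm temp.zip`
    return sorted(p.name for p in folder.iterdir())
```

Run from the repository root, `fetch(<useragents.txt URL>, "data/useragents.txt")` and `fetch_and_unzip(<vic_localities.zip URL>, "data/raw/Suburb Shapes")` would correspond to the two shell snippets, with the URLs taken from the commands in the setup list.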