TITLE: GA DSI07 Project 4: WEB SCRAPING JOB POSTINGS by Chua Chin Hon
[GitHub repo title: ga_project4_cch]
SUMMARY: This is my fourth assigned project at General Assembly (Singapore), as part of a 12-week Data Science Immersive course. The project involves scraping jobs data from a government jobs bank, then using the data to analyse salary trends and to see which factors are predictive of high pay.
I've also published a Medium post on my findings: https://medium.com/@chinhonchua/10-charts-to-guide-your-search-for-a-data-science-job-in-singapore-e4e3be9f1135
My answers for the project are in the following files in the notebooks folder:
1.0-cch-project4-Webscrape.ipynb,
1.1-cch-project4-Data_Text_Cleaning.ipynb,
1.2-cch-project4-Visualisation.ipynb,
2.0-cch-project4-Question1_Modelling.ipynb,
3.0-cch-project4-Question2_Modelling.ipynb,
4.0-cch-project4-Summary_Report.ipynb
FOLDERS
2 x Folders, one each for notebooks and data
FILES
Data folder [2 files, 1 sub-folder]
chromedriver: For the web-scraping exercise
jobs.csv: Original jobs dataset obtained via web scraping
housing.csv: Cleaned up CSV file
Notebooks folder [7 files]
1.0-cch-project4-Webscrape.ipynb: Web-scraping for jobs data
1.1-cch-project4-Data_Text_Cleaning.ipynb: Data cleaning and minor feature engineering
1.2-cch-project4-Visualisation.ipynb: Visualising key job and salary trends
2.0-cch-project4-Question1_Modelling.ipynb: Answering the business questions in Qn1
3.0-cch-project4-Question2_Modelling.ipynb: Answering the business questions in Qn2
4.0-cch-project4-Summary_Report.ipynb: A summary report of the key findings
README.md: This is the original list of questions for the project.