Team Leads (Contacts) : [Samuel Lawrence]:
Webscraper adapted from
This project was inspired by Ken Jee's YouTube series 'Data Science Project from Scratch'. Major changes include:
- Unique model-building approach based on the sklearn ensemble module
- The model was deployed to production via Streamlit on Heroku url:
- The web crawler needed an overhaul because of Glassdoor's updated website
- Unique field objective
- Adapt the web scraper to collect data for the model
- Clean data for analysis
- Analyze data
- Submit findings
- Scale and Build Machine Learning Model
- Host product on Heroku
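
As a rough illustration of the model-building step, here is a minimal sketch using the sklearn ensemble module. The README does not say which estimator, features, or target were used, so everything here is an assumption: `RandomForestRegressor` is just one member of `sklearn.ensemble`, the salary target is a guess, and the `seniority` and `company_size` features are synthetic stand-ins rather than the scraped Glassdoor data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; NOT the scraped Glassdoor data.
rng = np.random.default_rng(0)
n_rows = 200
seniority = rng.integers(0, 3, size=n_rows)      # hypothetical: 0=junior, 1=mid, 2=senior
company_size = rng.integers(1, 6, size=n_rows)   # hypothetical size bucket, 1 to 5
X = np.column_stack([seniority, company_size])

# Hypothetical salary (in thousands), driven mostly by seniority plus noise.
y = 25 + 10 * seniority + 1.5 * company_size + rng.normal(0, 2, size=n_rows)

# Hold out a quarter of the rows to check how well the model generalizes.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit one ensemble estimator; other sklearn.ensemble models follow the same pattern.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
mae = mean_absolute_error(y_test, predictions)
print(f"Mean absolute error: {mae:.2f}")
```

Other estimators from the same module, such as `GradientBoostingRegressor`, can be swapped in without changing the surrounding code.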
The objective of this project is to better understand what it takes to be a financial analyst in London. It is intended as a gateway for those seeking to become analysts themselves, and as an entry point for adapting a machine learning model to predict what role may be expected given the different variables.
- Inferential Statistics
- Machine Learning
- Data Visualization
- Predictive Modeling
- Python
- Pandas
- Numpy
- Matplotlib
- Nltk
- Wordcloud
- Seaborn
- Selenium
- Sklearn
As we move closer to the full cycle of graduates entering the workforce, the question has been posed: what does it take, and what is it like, to be a financial analyst? Some questions we plan on answering include:
What kind of salary should be expected?
What positions are the most popular?
What types of companies are hiring?
What industries are the most popular?
What similarities exist between different roles?
What other questions might we want answered as we explore the data further?
- The data was gathered from Glassdoor job postings on 6/7/2020 via a web scraper built with the Selenium Python library. The postings were collected during the COVID-19 pandemic, which should be taken into consideration when interpreting the results.
- A value of -1 represents data that wasn't specified in the job posting
- The sample size for this data set was 1,000 entries.
- We ran the web scraper multiple times to get a wider pool of data because of the amount of missing data
- Some of the most common terms found in the analysis include: 'Problem Solving', 'Bachelor Degree', 'team' and 'attention to detail'
- The average salary came out to around 30K, varying with seniority level
- Most of the hiring at the moment is being done by big corporations
- With more data and better feature selection, users could estimate their expected salary more accurately
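
Because -1 marks values that were not specified in a posting, it should be treated as missing before computing averages; otherwise it drags figures such as the mean salary down. The sketch below shows one way this could be handled with pandas. The column names and rows are hypothetical and do not reflect the scraper's real output.

```python
import pandas as pd

# Hypothetical rows; column names are illustrative, not the scraper's real schema.
postings = pd.DataFrame({
    "job_title": ["Financial Analyst", "Senior Financial Analyst", "Financial Analyst"],
    "salary_estimate": [30, -1, 28],          # -1 means "not specified"
    "industry": ["Banking", "-1", "Insurance"],
})

cleaned = postings.copy()

# Series.mask replaces entries where the condition is True with a missing value.
cleaned["salary_estimate"] = postings["salary_estimate"].mask(
    postings["salary_estimate"] == -1
)
cleaned["industry"] = postings["industry"].mask(postings["industry"] == "-1")

# Count missing values per column and average only the salaries that were given.
missing_per_column = cleaned.isna().sum()
mean_salary = cleaned["salary_estimate"].mean()  # mean of 30 and 28; the -1 row is skipped

print(missing_per_column)
print(f"Mean salary (specified postings only): {mean_salary}")
```

Masking rather than dropping rows keeps postings that are missing a salary but still carry useful information in other columns.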