Web scraping LinkedIn profiles - Uber employees

Web-Scraping-Project

The objective of this project was to explore the data-related job market and its requirements within the industry, and more particularly within the company Uber. I also wanted to gain more knowledge and experience in web scraping.

LinkedIn is an American company established in 2002. It runs a social media platform for professional networking, where employers post jobs and job seekers post their resumes or CVs. Over time, most of LinkedIn's revenue came from selling access to information about its members to recruiters and sales professionals. As of 2019, LinkedIn had 610 million registered members in 200 countries, more than 250 million of whom were active.

I wrote two Python scripts, "LinkedInWebcrawler2019.py" and "ProfileCrawler2019.py", to responsibly scrape the LinkedIn website. Using the selenium package, I sent queries for Data Scientist, Data Engineer, and Data Analyst roles within the company Uber and collected information about the matching Uber employees. The first script produces the first CSV file of profiles. The second script, "ProfileCrawler2019.py", gathers information from each profile link.
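
The two-stage flow (first collect profile links into a CSV file, then visit each link) can be sketched roughly as follows. This is a minimal illustration and not the code from the scripts themselves: the function names, the `profile_url` column, the search URL shape, and the pause between requests are all assumptions made for the example. Selenium is imported lazily so the CSV helpers can be tried without a browser.

```python
import csv
import time
from urllib.parse import urlencode

# Roles queried in the project; the list name is illustrative.
ROLES = ["Data Scientist", "Data Engineer", "Data Analyst"]


def build_search_url(role, company="Uber"):
    """Build a people-search URL (assumed URL shape, for illustration only)."""
    query = urlencode({"keywords": f"{role} {company}"})
    return "https://www.linkedin.com/search/results/people/?" + query


def save_profile_links(links, path):
    """Stage 1 output: write unique profile links to a one-column CSV."""
    unique_links = list(dict.fromkeys(links))  # drop duplicates, keep order
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["profile_url"])  # hypothetical column name
        writer.writerows([link] for link in unique_links)
    return unique_links


def load_profile_links(path):
    """Stage 2 input: read the profile links back from the CSV."""
    with open(path, newline="", encoding="utf-8") as f:
        return [row["profile_url"] for row in csv.DictReader(f)]


def crawl_profiles(links, fetch, delay_seconds=5.0):
    """Call fetch(url) for every link, pausing between requests."""
    pages = {}
    for link in links:
        pages[link] = fetch(link)
        time.sleep(delay_seconds)  # assumed politeness delay
    return pages


def make_selenium_fetch():
    """Return (driver, fetch), where fetch loads a URL in a Chrome session."""
    from selenium import webdriver  # lazy import: only needed for real runs

    driver = webdriver.Chrome()

    def fetch(url):
        driver.get(url)
        return driver.page_source

    return driver, fetch
```

In a real run, `driver, fetch = make_selenium_fetch()` would supply the `fetch` argument to `crawl_profiles`, and `driver.quit()` would close the browser afterwards. Passing `fetch` as a parameter keeps the CSV handoff between the two stages testable without opening a browser.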

I ran my analysis on about 1,000 employee profiles. The analysis code is in the Jupyter notebook "profileScraper.ipynb".

Here is a blog post I wrote about the project and results: http://nycdatascience.com/blog/student-works/web-scraping/web-scraping-linkedin:-exploring-the-background-of-a-data-scientist/