Scraping LinkedIn profiles is not fun (at all). Without signing in with a LinkedIn account, you can hardly access more than 15 profiles.
- You need to sign in with your account.
- LinkedIn profile pages are dynamic and don't load completely unless you scroll the entire page.
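
Since profile pages only finish loading once they have been scrolled, a scraper has to scroll before reading the HTML. Below is a minimal sketch of one way to do that, not necessarily what this program does. The helper name `scroll_to_bottom` and its parameters are hypothetical; the only thing it assumes about `driver` is Selenium's `execute_script` method.

```python
import time


def scroll_to_bottom(driver, pause_seconds=1.0, max_rounds=30, sleep=time.sleep):
    """Scroll until the page height stops growing or max_rounds is reached.

    `driver` is expected to be a Selenium WebDriver; only its
    `execute_script` method is used. Returns the number of scrolls performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    rounds = 0
    while rounds < max_rounds:
        # Jump to the current bottom of the page to trigger lazy loading.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        rounds += 1
        sleep(pause_seconds)  # give newly requested sections time to render
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new appeared, so assume the page is fully loaded
        last_height = new_height
    return rounds
```

After calling `scroll_to_bottom(driver)`, `driver.page_source` can be handed to BeautifulSoup. The `max_rounds` cap is there so a page that keeps growing cannot loop forever.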
This program lets you scrape profiles based on a keyword in the user's title. (You can edit the link to look for profiles of users working at a specific company or studying at some school.) It generates a CSV file of these profiles with columns:
- Months of Experience
- Skills (separated by ':')
- Recommendations received
- No. of Projects
- No. of Publications
- No. of Followers
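
To illustrate the output layout, here is a rough sketch of writing such rows with Python's `csv` module. The exact header strings, the helper names `profile_to_row` and `write_profiles`, and the sample values are assumptions made for illustration; the real script may name or order things differently.

```python
import csv
import io

# Header names taken from the column list above (exact spelling is an assumption).
COLUMNS = [
    "Months of Experience",
    "Skills",
    "Recommendations received",
    "No. of Projects",
    "No. of Publications",
    "No. of Followers",
]


def profile_to_row(profile):
    """Copy a profile dict, joining its list of skills into one ':'-separated cell."""
    row = dict(profile)
    row["Skills"] = ":".join(profile["Skills"])
    return row


def write_profiles(profiles, out_file):
    """Write a header line followed by one CSV row per profile."""
    writer = csv.DictWriter(out_file, fieldnames=COLUMNS)
    writer.writeheader()
    for profile in profiles:
        writer.writerow(profile_to_row(profile))


# Placeholder data, not a real profile.
sample = {
    "Months of Experience": 42,
    "Skills": ["Python", "SQL"],
    "Recommendations received": 3,
    "No. of Projects": 2,
    "No. of Publications": 0,
    "No. of Followers": 500,
}

buffer = io.StringIO()
write_profiles([sample], buffer)
print(buffer.getvalue())
```

Joining the skills with ':' keeps the whole list in a single cell, so each profile stays on one row.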
It needs the following to run:
- Python 3
- Chrome web driver
- Selenium
- BeautifulSoup
https://chromedriver.storage.googleapis.com/2.33/chromedriver_linux64.zip
- Download and unzip the ChromeDriver archive linked above
- chmod +x chromedriver
- sudo mv -f chromedriver /usr/local/share/chromedriver
- sudo ln -s /usr/local/share/chromedriver /usr/local/bin/chromedriver
- sudo ln -s /usr/local/share/chromedriver /usr/bin/chromedriver
- pip3 install selenium (Try sudo -H pip3 install selenium if this fails)
- pip3 install beautifulsoup4
- pip3 install tqdm
- Edit urls.py: set query_keyword to the keyword that profile titles should match (e.g. student, professor, or founder), set no_of_pages to the number of search result pages you'd like to scrape (each page has up to 10 profiles), and enter your LinkedIn credentials.
- python3 urls.py
- Edit extract.py by modifying query_keyword again.
- python3 extract.py
P.S. With great power comes great responsibility. Scrape too much too fast, and your account might get blocked. Scrape safe.
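
One simple way to avoid scraping too fast is to wait a random amount of time between page loads. The sketch below is only a suggestion, not part of the scripts: `polite_pause` is a hypothetical helper, and the 2 to 6 second default range is an arbitrary assumption, not a limit published by LinkedIn.

```python
import random
import time


def polite_pause(min_seconds=2.0, max_seconds=6.0, sleep=time.sleep):
    """Sleep for a random duration in [min_seconds, max_seconds] and return it."""
    if min_seconds < 0 or max_seconds < min_seconds:
        raise ValueError("expected 0 <= min_seconds <= max_seconds")
    delay = random.uniform(min_seconds, max_seconds)
    sleep(delay)  # a randomized wait avoids a perfectly regular request pattern
    return delay
```

Calling `polite_pause()` after each profile is loaded slows the run down, which is the point: a slower scrape is less likely to get the account flagged.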