Scrapes current hiring data from levels.fyi and h1bdata.info. To run any scraper in this repo, first complete the preliminary steps below, then follow the specific steps under the scraper of interest.
- clone this repository using
git clone https://github.com/neelpawarcmu/hiring-data-scraper.git
- (optional, but preferred) install conda, create and activate a conda env, and make pip available in it using
conda install pip
- change the working directory to the cloned repo using
cd hiring-data-scraper
- install requirements using
pip install -r requirements.txt
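After installing the requirements, a quick sanity check can confirm the environment is usable. This is only a sketch; the package names checked below are stdlib stand-ins, not taken from `requirements.txt`:

```python
# Sketch: verify that the packages a scraper needs can be imported.
# Swap in the actual names from requirements.txt (assumed, not listed here).
import importlib.util

def missing_packages(names):
    """Return the subset of package names that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# stdlib modules used as stand-ins so the example runs anywhere
print(missing_packages(["json", "csv"]))  # → []
```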
The levels.fyi scraper pulls data from levels.fyi into a dataframe, parses unique company names, and populates a Google Sheet with the data.
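The parse-unique-company-names step can be sketched as follows; the row shape (dicts with a `company` field) is an assumption for illustration, not the repo's actual data model:

```python
# Sketch: de-duplicate company names from scraped rows, preserving order.
# The "company" field name is an assumption for this example.
def unique_companies(rows):
    """Return company names de-duplicated case-insensitively, in first-seen order."""
    seen = set()
    result = []
    for row in rows:
        name = row["company"].strip()
        key = name.lower()
        if key not in seen:
            seen.add(key)
            result.append(name)
    return result

rows = [
    {"company": "Google", "title": "SWE"},
    {"company": "google ", "title": "SWE II"},
    {"company": "Stripe", "title": "SWE"},
]
print(unique_companies(rows))  # → ['Google', 'Stripe']
```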
The h1b scraper pulls data from h1bdata.info into a dataframe, parses unique company names, and populates a Google Sheet with the data.
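Since h1bdata.info serves its results as an HTML table, the extraction step can be sketched with the standard library alone; the sample markup below is illustrative, not real site output:

```python
# Sketch: pull rows out of an HTML table using only the standard library.
from html.parser import HTMLParser

class TableParser(HTMLParser):
    """Collect <tr> rows of <td>/<th> cell text into self.rows."""
    def __init__(self):
        super().__init__()
        self.rows, self._row, self._in_cell = [], [], False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._in_cell = True
            self._cell = []

    def handle_endtag(self, tag):
        if tag == "tr" and self._row:
            self.rows.append(self._row)
        elif tag in ("td", "th"):
            self._in_cell = False
            self._row.append("".join(self._cell).strip())

    def handle_data(self, data):
        if self._in_cell:
            self._cell.append(data)

# Illustrative markup only; not actual h1bdata.info output.
html = ("<table><tr><th>EMPLOYER</th><th>JOB TITLE</th></tr>"
        "<tr><td>ACME CORP</td><td>SOFTWARE ENGINEER</td></tr></table>")
p = TableParser()
p.feed(html)
print(p.rows)  # → [['EMPLOYER', 'JOB TITLE'], ['ACME CORP', 'SOFTWARE ENGINEER']]
```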
- follow this tutorial to get a JSON key file, rename it to `google-sheet-key.json`, and add it to the `hiring-data-scraper/utils` directory
- change the `spreadsheet_key` and `wks_name` values at the bottom of `h1b-scraper.py`
- make desired changes to `config-role-names.txt` and `config.py`
- run
python h1b-scraper/h1b-scraper.py
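The final populate-a-Google-Sheet step might look like the sketch below. The payload-building part is plain Python; the gspread calls are commented out because they need credentials and network access, and whether the repo actually uses gspread is an assumption. `spreadsheet_key` and `wks_name` refer to the values set in `h1b-scraper.py`:

```python
# Sketch: convert scraped rows into the list-of-lists shape that
# Google Sheets client libraries expect, then upload.
def to_sheet_payload(header, rows):
    """Build [header, row1, row2, ...] from a header list and row dicts,
    filling missing fields with empty strings."""
    return [header] + [[row.get(col, "") for col in header] for row in rows]

payload = to_sheet_payload(
    ["company", "title"],
    [{"company": "Google", "title": "SWE"}, {"company": "Stripe"}],
)
print(payload)  # → [['company', 'title'], ['Google', 'SWE'], ['Stripe', '']]

# Upload step (assumed client; requires google-sheet-key.json and network):
# import gspread
# gc = gspread.service_account(filename="utils/google-sheet-key.json")
# ws = gc.open_by_key(spreadsheet_key).worksheet(wks_name)
# ws.update(payload)
```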