Simple Python tools for scraping job listings into a dataframe and removing duplicates. Because scraping strategies have to be tailored to each site, only tools for jobs.ac.uk (primarily academic posts) exist so far. The plan is to add support for further sites.
- clone this repository to a desired location
$ git clone https://github.com/VolodymyrChapman/job_scraper.git
- navigate into the cloned repository
$ cd job_scraper
- create a conda environment with the dependencies from the environment file
$ conda env create -f environment.yml
For usage examples, please refer to the example_usage.ipynb notebook.
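As a rough illustration of the "scrape into a dataframe, remove duplicates" idea, here is a minimal sketch using pandas. It is not this repository's API: the `jobs_to_dataframe` helper, the column names and the example records are all assumptions made up for the illustration, and the notebook remains the reference for actual usage.

```python
import pandas as pd

# Hypothetical records standing in for rows scraped from a jobs site.
# The field names are assumptions, not the columns this repository produces.
scraped_jobs = [
    {"title": "Research Fellow", "employer": "University A", "url": "https://example.org/job/1"},
    {"title": "Lecturer in Statistics", "employer": "University B", "url": "https://example.org/job/2"},
    {"title": "Research Fellow", "employer": "University A", "url": "https://example.org/job/1"},
]


def jobs_to_dataframe(records, key_columns=("url",)):
    """Build a dataframe from job records and drop repeated listings.

    Rows are treated as duplicates when they match on every column in
    key_columns; the first occurrence is kept.
    """
    df = pd.DataFrame(records)
    return df.drop_duplicates(subset=list(key_columns)).reset_index(drop=True)


jobs = jobs_to_dataframe(scraped_jobs)
print(jobs)
```

Deduplicating on the listing URL is one possible choice. If the same post can appear under several URLs, a combination such as `key_columns=("title", "employer")` may work better.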
Collaboration, especially to expand functionality to other sites, would be greatly appreciated.
Please feel free to clone the repository, extend it and submit pull requests.