An Indeed scraper that extracts all jobs related to a field specified by the user.
These instructions will tell you how to download, run and use this project:
https://github.com/david1707/indeed_scrapy
If you don't have Scrapy, create an environment (pipenv is preferable to virtualenv) and install the requirements:
pip install -r requirements.txt
Navigate to the main folder, then run it with:
scrapy crawl indeed
To save the scraped results, pass an output file with -o:
scrapy crawl indeed -o jobs.json
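Once a run has written jobs.json, the file can be inspected with a few lines of Python. The sketch below assumes the file came from a single run, so that it holds one JSON array of items. The helper name summarize_feed is hypothetical, and no particular item fields are assumed: it only reports which keys appear.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_feed(path):
    """Return (item count, Counter of field names) for a JSON feed file."""
    # Assumes the file holds a single JSON array of objects.
    items = json.loads(Path(path).read_text(encoding="utf-8"))
    # Count how many items carry each field, without assuming any field name.
    fields = Counter(key for item in items for key in item)
    return len(items), fields


if __name__ == "__main__":
    count, fields = summarize_feed("jobs.json")
    print(f"{count} jobs scraped")
    for name, seen in fields.most_common():
        print(f"  {name}: present in {seen} items")
```

A field that appears in fewer items than the total is a quick hint that some job postings were missing that piece of data.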
By default, the spider searches all jobs. To restrict the search to a field, pass it as a 'job' argument:
scrapy crawl indeed -a job='Scrapy python' -o file.json
The output format follows the file extension: .json, .xml, .csv, and so on.
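To make the effect of the 'job' argument more concrete, here is a minimal sketch of how a search term could be turned into a search URL. It is not taken from the project: the helper name build_search_url, the base URL, and the 'q' query parameter are all assumptions for illustration, and the spider's actual logic lives in its own source.

```python
from urllib.parse import urlencode

# Assumed base URL, for illustration only; not taken from the project source.
BASE_URL = "https://www.indeed.com/jobs"


def build_search_url(job=""):
    """Return a search URL for `job`; an empty string means no field filter."""
    # urlencode escapes spaces and special characters in the search term.
    # The 'q' parameter name is an assumption.
    return f"{BASE_URL}?{urlencode({'q': job})}"


if __name__ == "__main__":
    print(build_search_url("Scrapy python"))
```

Encoding the term with urlencode rather than string concatenation keeps multi-word values such as 'Scrapy python' valid in the query string.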
If you have a local MongoDB database, every result is stored in the 'indeed' database, 'jobs' collection. For more information, check pipelines.py and settings.py (lines 67-74).
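For readers unfamiliar with Scrapy item pipelines, the following is a rough sketch of what a MongoDB pipeline of this kind can look like. It is not a copy of the project's pipelines.py: the class name MongoJobsPipeline, the default URI, and the client_factory parameter (added here so the class can be tried without a running database) are assumptions. Only the 'indeed' database and 'jobs' collection names come from the description above.

```python
class MongoJobsPipeline:
    """Sketch of a Scrapy item pipeline that stores each item in MongoDB."""

    def __init__(self, mongo_uri="mongodb://localhost:27017",
                 db_name="indeed", collection_name="jobs",
                 client_factory=None):
        self.mongo_uri = mongo_uri  # assumed default for a local server
        self.db_name = db_name
        self.collection_name = collection_name
        # Hypothetical hook: lets a stand-in client replace pymongo in tests.
        self.client_factory = client_factory

    def open_spider(self, spider):
        # Called by Scrapy when the spider starts: open the connection.
        if self.client_factory is None:
            import pymongo  # imported lazily; only needed for a real database
            self.client_factory = pymongo.MongoClient
        self.client = self.client_factory(self.mongo_uri)
        self.collection = self.client[self.db_name][self.collection_name]

    def close_spider(self, spider):
        # Called by Scrapy when the spider finishes: release the connection.
        self.client.close()

    def process_item(self, item, spider):
        # Store a plain-dict copy of the item, then pass the item along.
        self.collection.insert_one(dict(item))
        return item
```

Scrapy only runs pipelines that are listed in the ITEM_PIPELINES setting, which is why settings.py is worth checking alongside pipelines.py.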
- Python - Python is an interpreted high-level programming language for general-purpose programming.
- Scrapy - An open source and collaborative framework for extracting the data you need from websites.
- David Membrives - Initial work - david1707
This project is licensed under the ISC License.