The aim of this project is to create a simple service that can scrape useful data from ITJobswatch.
At present the app is set up to be cloned and used to simply scrape the below services:
- Home page top 30 job/roles / skills which can be found here
The aim will be to expand this to further services such as:
- Regular polling of pages and writing to a database for longer terms stats
- Bespoke calls for specific job role data
And much more.
Pre-Requisites
- Pycharm IDE
- Python 3.x + installed
The necessary packages needed to run this program should automatically be picked up by pycharm. You may find a a few pop ups within the IDE that state there are dependencies missing, if you simply install these through the IDE you should be set up correctly.
To test whether the program will work from your machine:
- Ensure the
config.ini
file has the test environment set tolive
- Click the
Terminal
icon which can be found on the menu in the bottom left of Pycharm. - Ensure you're in the root path of the project and type
python -m pytest tests/
This should execute the tests if any fail you may have issues with this program.
To use the program simply right click on the main.py
file and then click Run 'main'
. This will run the command line user interface.
Follow the instructions to download via the various options given.
- Adding a job details search option (essentially be able to search for a specific role and return the details in a CSV)
- create a connected database for full deployment
- Build a scheduler as part of a full deployment to poll and add to the database