
A web scraper for jobbank.gc.ca. It sends templated emails to employers and can also store the scraped data in a SQLite database or a CSV file.

Primary language: HTML. License: MIT.

Canada Job Scraper

This code uses Cheerio and Puppeteer to scrape job listings from the Government of Canada Job Bank website. The user can specify the job title and province to search for, as well as the number of pages of results to scrape. The scraped data can be saved to a database, saved as a CSV file, and/or used to send emails to the listed employers.

Live Version

View Scraped Data

Front end from the Canada Job Scraper FrontEnd project

Requirements

  • Node.js
  • npm (comes with Node.js)

Setup

  1. Install the required packages by running npm install
  2. Modify the following values in the code to your desired values:
    • name: Your name
    • phone: Your phone number
    • numberOfPages: The number of pages of job listings to scrape
    • jobTitle: The job title to search for (leave empty to search for all jobs)
    • province: The province to search in (leave empty to search in all provinces)
    • email: Your email address
    • password: Your Google App Password (obtain one here)
    • .env: If one does not exist, create a .env file containing the email settings that sendMail.js reads
  3. (Optional) Modify the following values in the code to customize the email template:
    • facebook: Your Facebook profile URL
    • linkedin: Your LinkedIn profile URL
    • twitter: Your Twitter profile URL
    • profilePic: URL of your profile picture to include in the email
    • skills: An array of your skills to include in the email
    • emailTitleLine1: The first line of the email title
    • emailTitleLine2: The second line of the email title
    • message: The body message of the email
  4. (Optional) Modify the template variable to customize the email template HTML. The template file is email.html.
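
To illustrate how the values from steps 2 and 3 might end up in the email HTML, here is a minimal sketch of placeholder substitution. The {{key}} placeholder syntax, the renderTemplate and escapeHtml helpers, and the sample values are all assumptions made for illustration; the project's email.html and its template variable may work differently.

```javascript
// Hypothetical settings object using the value names listed in steps 2 and 3.
const settings = {
  name: "Jane Doe",
  phone: "555-0100",
  emailTitleLine1: "Hello",
  emailTitleLine2: "I am looking for work",
  message: "I would like to apply for this position.",
  skills: ["JavaScript", "Node.js"],
};

// Escape characters that have a special meaning in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Replace each {{key}} with the matching value (assumed placeholder syntax).
// Arrays are joined with ", "; unknown keys are left untouched.
function renderTemplate(template, values) {
  return template.replace(/\{\{(\w+)\}\}/g, (match, key) => {
    if (!Object.prototype.hasOwnProperty.call(values, key)) return match;
    const value = values[key];
    return Array.isArray(value)
      ? value.map(escapeHtml).join(", ")
      : escapeHtml(value);
  });
}

// Hypothetical template fragment standing in for email.html.
const template =
  "<h1>{{emailTitleLine1}}</h1><h2>{{emailTitleLine2}}</h2>" +
  "<p>{{message}}</p><p>{{name}}, {{phone}}</p><p>Skills: {{skills}}</p>";

console.log(renderTemplate(template, settings));
```

Escaping the values before inserting them keeps characters such as & or < in a name or message from breaking the generated HTML.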

Usage

  1. Run the code with npm run start
  2. The scraped data will be saved to a database and/or CSV file if specified, and emails will be sent if specified.
  3. If you saved the database, you can run Converter.js after index.js to convert the database to JSON.
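
As a rough picture of what saving scraped records as CSV involves, the sketch below builds CSV text from an array of objects. It is not the project's code: the toCsv and csvField helpers and the title, employer, and province field names are hypothetical, and the quoting follows the usual RFC 4180 convention.

```javascript
// Quote a field when it contains a comma, a double quote, or a line break,
// doubling any embedded double quotes (RFC 4180 style).
function csvField(value) {
  const text = value === null || value === undefined ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Build CSV text from an array of objects.
// `columns` sets both the header row and the field order.
function toCsv(rows, columns) {
  const header = columns.map(csvField).join(",");
  const lines = rows.map((row) =>
    columns.map((column) => csvField(row[column])).join(",")
  );
  return [header, ...lines].join("\r\n");
}

// Hypothetical job records; the real scraper's field names may differ.
const jobs = [
  { title: "Cook", employer: 'Joe\'s "Best" Diner', province: "ON" },
  { title: "Welder, senior", employer: "Acme", province: "AB" },
];

console.log(toCsv(jobs, ["title", "employer", "province"]));
```

The same array of records can be turned into JSON text with JSON.stringify(jobs, null, 2).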

Notes

  • The code includes a random timeout between requests to avoid being blocked by the website.
  • Email template from https://unlayer.com/
  • The code uses the template variable as an HTML template for the emails. You can use the provided email.html file as a starting point, or create your own template.
  • The code uses the Nodemailer package to send emails through Google's SMTP server. You will need to provide your email address and a Google App Password in order to send emails.
  • The code uses the dotenv package to load the email settings from a .env file.
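
The random timeout between requests mentioned above could look something like the following sketch. The function names, the scrapePage callback, and the 1000 to 3000 ms bounds are assumptions for illustration, not values taken from the project.

```javascript
// Pick a whole number of milliseconds in [minMs, maxMs], inclusive.
// `rng` is injectable so the function can be checked deterministically.
function randomDelayMs(minMs, maxMs, rng = Math.random) {
  return Math.floor(rng() * (maxMs - minMs + 1)) + minMs;
}

// Resolve after `ms` milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical scraping loop: `scrapePage` stands in for whatever function
// fetches and parses one page of results.
async function scrapeAllPages(numberOfPages, scrapePage) {
  const results = [];
  for (let page = 1; page <= numberOfPages; page++) {
    results.push(await scrapePage(page));
    if (page < numberOfPages) {
      // Assumed bounds; tune them to what the site tolerates.
      await sleep(randomDelayMs(1000, 3000));
    }
  }
  return results;
}
```

Waiting a different amount of time after each page makes the traffic look less mechanical than a fixed interval would, and skipping the wait after the last page avoids a pointless delay.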