MookitScraping

Submission of Mookit Scraping Task for PClub Secy recruitment.

Working

To scrape the page saved in page.html, run this in the terminal:

python3 scrape.py --number 4

The number of lectures to scrape defaults to 4. To change it, replace the 4 in the command above with the desired number.
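
As a rough illustration, a `--number` flag with a default of 4 could be handled with `argparse` along the following lines. The contents of scrape.py are not reproduced here, so the function name `parse_args` and the help text are assumptions made for this sketch, not the actual implementation.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; --number falls back to 4 when omitted."""
    parser = argparse.ArgumentParser(description="Scrape lectures from page.html")
    parser.add_argument(
        "--number",
        type=int,
        default=4,  # matches the default stated in this README
        help="how many of the latest lectures to scrape (default: 4)",
    )
    # argv=None makes argparse read sys.argv[1:]; a list can be passed for testing.
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(f"Scraping {args.number} lectures")
```

Passing an explicit list, for example `parse_args(["--number", "6"])`, makes the parsing easy to check without touching the real command line.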

Description of Files

  • scrape.py: Contains the code to scrape the page saved in page.html.
  • data.csv: Contains 6 of the latest lectures in my ESC201A course. A screenshot of the latest lectures in the same course is included for verification.
  • login.py: This file and chromedriver.exe are not required for scraping. I wrote login.py to log in to HelloIITK using Selenium and save the content of the course page to page.html.
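
The flow these files describe (read saved HTML, pick out lecture entries, write them to a CSV) might look roughly like the sketch below, which uses only the standard library. The markup of page.html and the columns of data.csv are not shown in this README, so the class name `lecture-title`, the single `title` column, and the helper names are all hypothetical; the real scrape.py may work quite differently.

```python
import csv
from html.parser import HTMLParser


class LectureTitleParser(HTMLParser):
    """Collect the text of elements whose class is 'lecture-title' (hypothetical name)."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if dict(attrs).get("class") == "lecture-title":
            self._in_title = True
            self.titles.append("")

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current title,
        # so nested markup inside a title is not handled.
        self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.titles[-1] += data


def extract_titles(html, number):
    """Return the first `number` lecture titles found in the HTML string."""
    parser = LectureTitleParser()
    parser.feed(html)
    parser.close()
    return [title.strip() for title in parser.titles[:number]]


def write_csv(titles, path):
    """Write one title per row under a hypothetical 'title' header."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["title"])
        writer.writerows([title] for title in titles)
```

Under these assumptions, the whole run would amount to reading page.html into a string, calling `extract_titles(html, number)`, and passing the result to `write_csv(titles, "data.csv")`.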

References