Primary language: Python. License: Apache License 2.0 (Apache-2.0).

wai-scraper

Python webscraping library for the University of Waikato.

Uses selenium and selenium-requests under the hood.

On campus, Duo two-factor authentication does not prompt the user, so the library can run in non-interactive mode (init_driver(False)). Off campus, it must run in interactive mode (init_driver(True)) so that you can tick the "Remember me for 30 days" box and click the "Send me a push" button, then accept the authentication request on your mobile device.
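One way to switch between the two modes without editing a script is to read the choice from the environment. The sketch below is only an illustration: the variable name WAI_OFF_CAMPUS and the helpers interactive_mode_from_env and open_driver are hypothetical and not part of wai.scraper; only init_driver and its boolean argument come from the library as described above.

```python
import os


def interactive_mode_from_env(env=None):
    """Return True if the hypothetical WAI_OFF_CAMPUS variable is truthy.

    WAI_OFF_CAMPUS is an assumption made for this sketch; wai.scraper
    itself does not read any environment variable.
    """
    env = os.environ if env is None else env
    value = env.get("WAI_OFF_CAMPUS", "")
    return value.strip().lower() in {"1", "true", "yes"}


def open_driver(env=None):
    """Start the driver in interactive mode only when off campus."""
    # Imported lazily so the helper above works without a browser installed.
    import wai.scraper as ws

    # True -> interactive (off campus, 2FA prompt), False -> non-interactive
    return ws.init_driver(interactive_mode_from_env(env))
```

Running a script with WAI_OFF_CAMPUS=1 set would then open the browser interactively, while leaving the variable unset keeps the non-interactive behaviour suited to on-campus use.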

The use of selenium was inspired by: https://stackoverflow.com/a/23929939/4698227

Installation

Create a virtual environment:

virtualenv -p /usr/bin/python3 venv

Install wai.scraper in the virtual environment:

./venv/bin/pip install git+https://github.com/fracpete/wai-scraper.git

Example

The following example logs into the university website via SSO and outputs the HTML content of the staff landing page.

import getpass
import wai.scraper as ws

# initialize logger with debugging output
ws.init_logger(True)

# run Firefox in interactive mode (e.g. when off-campus, to interact with 2FA)
driver = ws.init_driver(True)

# log in via SSO
user = input("Enter user: ")
pw = getpass.getpass("Enter password: ")
ws.sso(driver, user, pw, delay=15)

url = 'https://www.waikato.ac.nz/landing/staff.shtml'

# obtain staff landing page via selenium
ws.driver_get(driver, "staff landing page", url)
print("--> selenium")
print(driver.page_source)

# close the session
ws.close_driver(driver)
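In the example above, the browser stays open if the login or page load raises an exception, because close_driver is never reached. A minimal sketch of one way to guard against that is a small context manager. The name managed_driver is hypothetical and not part of wai.scraper; the init and close callables are passed in so that the helper makes no assumptions about the library beyond the init_driver and close_driver calls shown above.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(init, close, interactive=True):
    """Create a driver with init(interactive) and always close it.

    Hypothetical helper for illustration: `init` and `close` are expected
    to behave like ws.init_driver and ws.close_driver from the example.
    """
    driver = init(interactive)
    try:
        yield driver
    finally:
        # Runs on normal exit and when the body raises an exception.
        close(driver)
```

With the library installed, the example could then be written as `with managed_driver(ws.init_driver, ws.close_driver, True) as driver:` followed by the indented sso and driver_get calls, so the session is closed even when one of them fails.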