carleton-pathways/bot

Scrape Restrictions Correctly (Use BeautifulSoup for First Table)

In dataBot.py we scrape two tables.

Screen Shot 2023-05-30 at 10 59 46 PM

The first table is the top one, with the registration details, the CRN, and so on. The second table is the one with the meeting time, building, and instructor. We use BeautifulSoup to parse the second table but not the first, and because of that the following error arises:

When a course has multiple restrictions, they are stored in different rows, so dataBot.py only finds the first row. Could you fix this so that it considers all of the rows that belong to a given restriction?

Fails in a scenario like this:
Screen Shot 2023-05-23 at 7 24 24 PM

They are in different table rows and we are only considering the first row of the restriction.
Screen Shot 2023-05-23 at 7 25 58 PM

Possible Approach

Refactor the bot to use BeautifulSoup to scrape the first table, so that we can get every row of the restrictions. We already use BeautifulSoup for the second table.
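
A rough sketch of what that could look like, not tested against the real page. It assumes the first table is laid out as label/value rows and that the extra restriction rows have an empty label cell; the sample HTML, the "Restrictions:" label, and the `collect_field_rows` name are all made up for illustration and would need to be adjusted to the actual markup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real table on the course page may be structured differently.
SAMPLE_HTML = """
<table>
  <tr><td>CRN:</td><td>12345</td></tr>
  <tr><td>Restrictions:</td><td>Restricted to programme A</td></tr>
  <tr><td></td><td>Not open to first-year students</td></tr>
  <tr><td>Status:</td><td>Open</td></tr>
</table>
"""


def collect_field_rows(html, label):
    """Return every value that belongs to `label`, including continuation rows.

    Assumption: a continuation row has an empty first cell and its value
    in the second cell. Collection stops at the next non-empty label.
    """
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")  # first table in the document
    if table is None:
        return []

    values = []
    collecting = False
    for row in table.find_all("tr"):
        cells = [c.get_text(" ", strip=True) for c in row.find_all(["td", "th"])]
        if len(cells) < 2:
            continue  # skip rows that do not look like label/value pairs
        first, value = cells[0], cells[1]
        if first.rstrip(":").lower() == label.lower():
            collecting = True   # found the row that starts the field
        elif first:
            collecting = False  # a different label ends the field
        if collecting and value:
            values.append(value)
    return values


restrictions = collect_field_rows(SAMPLE_HTML, "Restrictions")
```

With the sample above, `restrictions` holds both restriction rows instead of only the first one. If the real page marks continuation rows some other way (for example with `rowspan`), the loop condition would need to change accordingly.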