morinted/schedule-generator

Should do some sort of automated testing on scraped schedules

davidschlachter opened this issue · 2 comments

Course updates frequently break scraping in ways that are currently detected only by my very incomplete manual testing or by bug reports from users. If we had some way to automatically validate scraped schedules, the program would be more reliable for users.

However, the main challenge would be validating schedules without obtaining the validation data from the same system that produced the data under test.

One possible solution could be to detect situations that are known to cause silent failures. For example, if two sections have the following activities:

  • Section A: two lectures, three DGDs
  • Section B: one DGD

This pattern would indicate a scraping error in determining sections from the uOttawa data and would cause silent failures for users. This type of situation could be automatically flagged and trigger a notification that scraping has partially failed. These checks should run, and raise errors, in the scraping program itself, since scraping is now the most common failure point for users of this project.
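The check described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual data model: it assumes each course's sections are available as a mapping from section id to a list of activity-type strings (names like "LEC" and "DGD" are placeholders).

```python
from collections import Counter


def section_profile(activities):
    """Multiset of activity types in one section, e.g. two LECs and three DGDs."""
    return frozenset(Counter(activities).items())


def is_suspect(course_sections):
    """True when sections of one course disagree on their activity profile.

    This is the silent-failure signature from the example above: if Section A
    has two lectures and three DGDs while Section B has only one DGD, the
    sections were almost certainly split incorrectly during scraping.
    """
    profiles = {section_profile(acts) for acts in course_sections.values()}
    return len(profiles) > 1


# The example from the issue: Section B is missing its lectures.
is_suspect({
    "A": ["LEC", "LEC", "DGD", "DGD", "DGD"],
    "B": ["DGD"],
})
```

A scraper could run this over every course after an update and emit a notification listing the suspect courses, rather than publishing them silently.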

If this is implemented by comparing scraped schedules to reference schedules, then the courses most commonly searched for would probably be the best candidates. Here are the top ones from April 2018 – April 2020 (with the number of times searched):

 496 CEG2136
 451 CSI2110
 411 SEG2105
 392 ENG1112
 371 MAT1320
 331 MAT1322
 331 CHM1311
 321 MAT1341
 316 ITI1120
 316 ECO1104
 285 MAT2377
 260 PHI1101
 239 ECO1102
 214 ENG1100
 211 ADM1340
 205 MAT1348
 204 ITI1121
 194 ITI1100
 192 ECO1504
 190 PSY1102
 190 ECO1502
 189 CSI2132
 177 PSY1101
 172 MAT1300
 172 CSI2101
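A reference comparison along these lines might look like the sketch below. To sidestep the staleness problem (meeting times change every term), it compares only the structural "shape" of each course, i.e. its section ids and activity types. The data layout is again an assumption, not the project's real format.

```python
def schedule_shape(course):
    """Reduce a course to a structural fingerprint: section id -> sorted
    activity types. Times and rooms are ignored, since those change term to
    term while the shape of a course rarely does."""
    return {section: sorted(acts) for section, acts in course.items()}


def diff_against_reference(reference, scraped):
    """List courses whose scraped shape differs from the stored reference."""
    return sorted(
        course
        for course, ref_course in reference.items()
        if schedule_shape(scraped.get(course, {})) != schedule_shape(ref_course)
    )


# Hypothetical reference for one popular course from the list above.
reference = {"CEG2136": {"A": ["LEC", "LEC", "LAB"], "B": ["LEC", "LEC", "LAB"]}}
diff_against_reference(reference, reference)  # an unchanged scrape passes
```

Keeping a reference file per popular course and running this diff after every scrape would catch the silent section-splitting failures without anyone re-checking schedules by hand.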

Implemented in 58a1c38 after an email conversation with the uschedule.me team. Each time schedules are updated, the test results will be available at https://schlachter.ca/schedgen/latest-unit-test-results.txt.