C++ recursive course scheduler for small school/co-op environments.
Input: When is everyone available? What courses is everyone involved with? Output: All possible schedules, but paired down with some filtering (see below).
See here for an example.
Unless you want to manually make the CSV file used below (not recommended), make a Google sheet with the input data.
- In the first sheet ("availability"), the first column contains each person's name, while remaining columns correspond to the timeslots in consideration. A "Y" indicates the given person's availability during the given timeslot; anything else (include nothing) indicates inavailability. Note that the first name in each column is considered the course's teacher; this has no meaning until the schedule ranking step (see below).
- In the second sheet ("course_involvements"), the columns correspond to courses. Under each course's name is a list of the people involved in (i.e., enrolled or teaching) that course. These names must correspond exactly to those in the first sheet. Note: spaces in the course/person names are fine, but commas are not.
- Clone this repo and compile with
g++ -O3 -o scheduler scheduler.cpp
. - Get the CSV files for each sheet (
File -> Download -> Comma Separated Values (.csv)
), move them to the directory with thescheduler
binary, and convert the Windows-style line endings to Unix-style line endings (seedos2unix
). - Run
./scheduler [availability_file].csv [course_involvements_file].csv
.
- Parse the input data (and output the parsed data).
- Run a recursive DFS, trying to schedule each course at each timeslot (this can take a long time; see below). This will tend to produce either no solutions or a large number of them.
- If there are multiple solutions, the program will rank them based on how well they minimize the amount of time that teachers must spend waiting in between classes (i.e., back-to-back teaching is preferred), assigning each solution a score. Lower scores are better. The best solutions, using this metric, will be displayed.
Here is the output with the simple example Google sheet referenced above:
Course involvements:
Course1: Dylan Bob, Musk Elon, Ford Harrison, Zuckerberg Mark,
Course2: Dylan Bob,
Course3: Musk Elon, Zuckerberg Mark,
Course4: Dylan Bob, Ford Harrison, Zuckerberg Mark,
Person availability:
Dylan Bob: Time1, Time2, Time3, Time4,
Musk Elon: Time3, Time5,
Ford Harrison: Time2, Time3, Time4, Time5,
Zuckerberg Mark: Time1, Time2, Time3, Time4, Time5,
Starting scheduling!
Done scheduling. Found 4 solutions.
Finding top solutions.
Best schedule (score = 0) no. 0:
Time1: Course2
Time2: Course4
Time3: Course1
Time4:
Time5: Course3
------------------
Best schedule (score = 0) no. 1:
Time1:
Time2: Course2
Time3: Course1
Time4: Course4
Time5: Course3
------------------
Best schedule (score = 0) no. 2:
Time1:
Time2: Course4
Time3: Course1
Time4: Course2
Time5: Course3
------------------
The DFS's run time can vary wildly due to the problems exponential nature. Due to increased pruning, when there are fewer solutions (or no solutions), the algorithm runs faster.
Directing the search can have a huge impact on performance. The DFS tries scheduling courses in the order that they appear in the Google sheet. Putting the courses which are harder to schedule/more likely to cause conflicts (teachers are available less often, more students enrolled, more grade levels covered, etc.) earlier in the input will cause the pruning to take effect sooner, significantly reducing the number of function calls which must occur.
Note that the output indicates how many solutions have been found so far. If there are any solutions, this number will probably start increasing early on. If there are no solutions, nothing will be printed while running. This can be a useful way of determining if there will be any solutions with the given input without waiting for the DFS to actually finish.