cyslf automatically assigns players to teams based on registration data, coach evaluations, and parent requests.
For each league, there are two parts: (1) convert the raw form data into standard player and team csvs, then (2) form teams from these standardized csvs.
Returning players are placed on their last season's team by default, then the remaining players are assigned while maintaining balanced teams and player convenience (see Scoring for details).
0. Install Python and open a command prompt to run the following commands. (For Windows, try installing Miniconda and using Anaconda Prompt instead.)
convert-addresses --reg registrations.csv
OR
convert-addresses --folder ./girls5-6_data
- This command adds longitude and latitude columns to the registration csv containing the coordinates of each player's address. This step was originally part of the next step, but is now separated out because looking up addresses takes a while.
- The program may fail to find coordinates for some addresses, which is fine. This tends to happen when an address is misspelled. In this case you can either fix the spelling in the registration file or fill in the latitude and longitude manually using Google Maps.
- --reg sets the current registration csv. This should contain the player's address information separated into the columns Street, City, Region, and Postal Code. This should only contain players in one division, e.g. Boys Grades 3-4.
- --folder or -f can be used instead of --reg to provide the registration file. It sets the folder that contains the registration file, which must be named registration.csv.
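To find the rows that still need manual coordinates after this step, a small script along these lines may help. It is only a sketch: the column names latitude and longitude are assumptions (check the actual header written by convert-addresses), and the sample rows are made up for illustration.

```python
import csv
import io

def rows_missing_coordinates(csv_text, lat_col="latitude", lon_col="longitude"):
    """Return (row_number, street) pairs for rows with a blank coordinate.

    lat_col and lon_col are assumed column names, not confirmed ones.
    """
    missing = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        lat = (row.get(lat_col) or "").strip()
        lon = (row.get(lon_col) or "").strip()
        if not lat or not lon:
            missing.append((row_number, row.get("Street", "")))
    return missing

# Made-up sample data: the second address is misspelled and has no coordinates.
sample = """Street,City,Region,Postal Code,latitude,longitude
1 Main St,Cambridge,MA,02139,42.36,-71.10
2 Mian St,Cambridge,MA,02139,,
"""

print(rows_missing_coordinates(sample))
```

The row numbers match what a spreadsheet shows, so the flagged rows can be fixed directly in the registration file.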
prepare-player-data --div "Boys Grades 3-4" --coach_evals coach-evaluations.csv --reg registrations.csv --par parent-requests.csv -o example-players.csv
OR
prepare-player-data --div "Girls Grades 7-8" --folder ./girls7-8_data -o prepared-player-data.csv
- This command takes raw form data and converts it into the standard player csv format (see example data below). It does a handful of things to clean up the data.
- Players on teams from the prior season will be placed on those teams by default.
- Players without skill are assigned a level of 5 (average)
- Players without a goalie skill are assigned a level of 6 (does not play goalie)
- Players coming from a lower division get their grades worsened by 1 point
- Players are matched by name, so names spelled differently on different forms won't be matched. The command prints names that look similar and may need to be matched manually.
- After completion, open the csv and manually make adjustments.
- Read the comments -- maybe parents are unable to reach certain fields
- If you do edit cells, make sure the formatting is consistent. Danehy is not the same as danehy.
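The "names that look similar" idea can be illustrated with Python's standard difflib module. This is a sketch of the general technique, not necessarily how cyslf does it: the function name, the lowercase comparison, and the 0.8 cutoff are all assumptions made for the example.

```python
import difflib

def suggest_name_matches(name, candidates, max_matches=5, cutoff=0.8):
    """Return candidate names that look similar to `name`.

    Names are compared in lowercase (an assumption for this sketch);
    max_matches limits how many suggestions are returned.
    """
    by_lowercase = {candidate.lower(): candidate for candidate in candidates}
    close = difflib.get_close_matches(
        name.lower(), list(by_lowercase), n=max_matches, cutoff=cutoff
    )
    return [by_lowercase[match] for match in close]

# Hypothetical names: "Jon Smith" on one form, "John Smith" on another.
print(suggest_name_matches("Jon Smith", ["John Smith", "Jane Doe"]))
```

A higher cutoff gives fewer, safer suggestions; a lower one catches more misspellings at the cost of more false alarms.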
- --div sets the division. If your input is found in the "Division" column of the old registration data, that player is considered to be a continuing player in the division. You don't need to put the full division name. "Boys Grades 3-4" is good enough to match "Boys Grades 3-4 - Spring 2022 In-Town Soccer", but you do need to be careful about spelling and upper / lower case.
- --coach_evals sets the coach evaluations csv. This needs to have names, coach evaluations, and past teams for each player.
- --par sets the parent request csv. This should have practice location / day / teammate preferences.
- --reg sets the current registration csv. This should have all other player-relevant data. This should only contain players in one division, e.g. Boys Grades 3-4.
- -m sets the number of potential name matches to print. Default is -m 5.
- -o sets the output file.
- -r lets you replace the output file if it exists.
- --folder or -f can be used to replace --coach_evals, --par, and --reg by providing a directory that contains these three files. The files must be named exactly coach_evals.csv, parent_requests.csv, and registration.csv (keep in mind that parent requests are optional). The command will automatically output a file named prepared_player_data.csv.
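The --div matching described above behaves like a case-sensitive substring test. A minimal sketch of that behavior, assuming this is how the comparison works (the function name is hypothetical):

```python
def is_continuing_player(div, old_division):
    """Assumed rule: the --div text must appear verbatim in the old Division value."""
    return div in old_division

old_value = "Boys Grades 3-4 - Spring 2022 In-Town Soccer"
print(is_continuing_player("Boys Grades 3-4", old_value))  # partial name matches
print(is_continuing_player("boys grades 3-4", old_value))  # different case does not
```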
- The team csv should have a team name, practice day, and practice location for each team (see example data below).
make-teams -i example-players.csv -t example-team.csv -o example-result.csv
OR
make-teams -f ./boys3-4_data -r
- This command reads the player and team csvs, assigns available players, and outputs the results as standard player and team csvs.
- -i sets the input player file.
- -t sets the team information file.
- -o sets the output player file.
- -c can optionally be used to set a config file to control scoring weights (see below).
- -d can optionally be used to set the search depth. This is how hard the algorithm tries to rearrange players. 4 or 5 will probably take too long to run; 2 or 3 are probably good enough.
- -r can be used to replace the output file when it already exists.
- -f can be used similarly to --folder in the prepare-player-data step. Rather than providing files with -i, -t, -o, and -c, only provide the directory containing each of these files. The files must be named prepared_player_data.csv, teams.csv, and final_results.csv (which is automatically created if it doesn't already exist). You may also include a weights.txt file as the -c option. See the example for details on formatting.
- Load the player csv and see if any adjustments need to be made.
- If you want to re-run league formation, unfreeze players and remove their team values, download and run step 4 again.
The make-teams command reads files in the standard format and outputs files in the standard format. This means you should be able to open these in Google Sheets / Excel and move players around easily.
See this google sheets example.
The standard player csv is expected to have the following columns:
- id: a unique player ID number
- last_name: player last name
- first_name: player first name
- grade: player grade number
- team: assigned team (if any)
- coach_skill: coach evaluated skill (1 = good, 10 = bad)
- parent_skill: parent evaluated skill (used when coach evaluation is missing)
- goalie_skill: coach evaluated goalie skill (1 = good, 6 = does not play goalie)
- preferred_locations: preferred practice field names
- backup_locations: practice fields determined automatically based on the player's home address
- disallowed_locations: practice fields that the player can't reach
- preferred_days: days the player prefers to have practice. Must be a string of characters from "MTWRF". For example, "MTR" means the player prefers to practice on Monday, Tuesday, or Thursday.
- unavailable_days: days the player is not able to practice. Similar format to above.
- teammate_requests: teammate names
- lock: TRUE or FALSE. If a player is locked and already assigned to a compatible team before make-teams is called, they'll remain on that team. If they're unlocked (lock == FALSE) or the team they're assigned to isn't compatible (e.g. incompatible practice day/location), then they'll be assigned a new team.
- school: school name
- comment: special requests from the registration form
The standard team csv is expected to have the following columns:
- name: team name
- practice_day: one of M, T, W, R, F (Monday, Tuesday, Wednesday, Thursday, Friday)
- location: practice field name. Valid field names are listed below.
Valid field names
Ahern
Common
Danehy
Donnelly
Magazine
Maher
Pacific
Raymond
Russell
Sennott
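Before running make-teams, it can be worth checking a team csv against the column rules and field names above. The following is a sketch of such a check, not part of cyslf; the function name and the sample teams are made up.

```python
import csv
import io

VALID_FIELDS = {
    "Ahern", "Common", "Danehy", "Donnelly", "Magazine",
    "Maher", "Pacific", "Raymond", "Russell", "Sennott",
}
VALID_DAYS = set("MTWRF")

def find_team_problems(csv_text):
    """Return (team name, column, bad value) for each invalid cell."""
    problems = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["practice_day"] not in VALID_DAYS:
            problems.append((row["name"], "practice_day", row["practice_day"]))
        if row["location"] not in VALID_FIELDS:  # case-sensitive on purpose
            problems.append((row["name"], "location", row["location"]))
    return problems

# Made-up teams: Blue has a spelled-out day and a lowercase field name.
sample_teams = """name,practice_day,location
Red,M,Danehy
Blue,Thursday,danehy
"""

print(find_team_problems(sample_teams))
```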
This implementation uses a greedy algorithm. We order players by skill then go through and assign them to the team that gives the best overall league score.
To assign a specific player, we try placing them on each team and keep track of which arrangement produces the highest score. We also try placing them on each team and having that team "trade" a player to another team (again looking for the highest scoring player arrangement). We can continue trying to trade players and evaluating arrangements by increasing the -d argument to make-teams (make-teams -d 4 ... tells the algorithm to go 3 trades deep when conducting the search). Note that some arrangements are invalid. For example, if a player can't practice on Wednesday, they can't be placed on a team that practices Wednesday.
More optimal algorithms exist, but this algorithm is one of the most straightforward to understand. It also lends itself well to becoming a "recommended assignment" tool if we ever want to have the library give suggestions one player at a time.
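The placement part of this search can be sketched as follows. This is a simplified illustration, not the cyslf implementation: it does placement only (no trades), and league_score is a toy stand-in that only looks at how close the teams' total skill values are. All names and sample data are hypothetical.

```python
def league_score(teams):
    """Toy score: higher (closer to 0) when team skill totals are close together."""
    totals = [sum(p["skill"] for p in team["players"]) for team in teams]
    return -(max(totals) - min(totals))

def assign_greedily(players, teams):
    """Place players one at a time on the team that gives the best league score."""
    for player in sorted(players, key=lambda p: p["skill"]):  # 1 = best, so best first
        best_team, best_score = None, None
        for team in teams:
            if team["practice_day"] in player["unavailable_days"]:
                continue  # invalid arrangement: player can't practice that day
            team["players"].append(player)   # try the placement
            score = league_score(teams)
            team["players"].pop()            # undo it
            if best_score is None or score > best_score:
                best_team, best_score = team, score
        if best_team is not None:
            best_team["players"].append(player)
    return teams

teams = [
    {"name": "A", "practice_day": "M", "players": []},
    {"name": "B", "practice_day": "W", "players": []},
]
players = [
    {"name": "Ana", "skill": 1, "unavailable_days": ""},
    {"name": "Ben", "skill": 2, "unavailable_days": "W"},
    {"name": "Cal", "skill": 9, "unavailable_days": ""},
    {"name": "Dee", "skill": 10, "unavailable_days": ""},
]

for team in assign_greedily(players, teams):
    print(team["name"], [p["name"] for p in team["players"]])
```

A trade step would extend the inner loop: after a trial placement, also try moving one existing player from that team to each other team and keep the best-scoring valid arrangement.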
The score of a particular arrangement of players is composed of several independent scores, which are combined with a weighted sum. These weights can be controlled by passing a config file to make-teams (e.g. make-teams -c weights.txt ...). Example weights file (put in weights.txt):
[weights]
skill = 1 # balance average team skill
grade = 1 # balance average team grade
size = 1 # balance average team size
first_round = 1 # balance # of top rank players (skill=1)
top = 1 # balance the # of top tier players (skill=2, 3)
mid = 1 # balance the # of mid tier players (skill=4, 5, 6)
bottom = 1 # balance the # of bottom tier players (skill=7, 8, 9, 10)
goalie = 1 # balance goalie skill
location = 1 # minimize player distance to practice field
practice_day = 1 # maximize # of players' practicing on their preferred day
teammate = 1 # honor player teammate requests
If you run formation and teams aren't as good as you'd like for some score type, try increasing the weight. (For example, if you really care about giving players a nearby practice field, you could set location = 1000.)
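To make the weighted sum concrete, here is a sketch that reads a weights file in the format above with Python's configparser and combines some per-category scores. It is an illustration only: the function names, the score values, and the default weight of 1 for unlisted categories are assumptions, not confirmed cyslf behavior.

```python
import configparser

WEIGHTS_TEXT = """
[weights]
skill = 1          # balance average team skill
location = 1000    # minimize player distance to practice field
"""

def load_weights(text):
    """Parse a [weights] section, ignoring the trailing # comments."""
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.read_string(text)
    return {key: float(value) for key, value in parser["weights"].items()}

def combined_score(scores, weights):
    """Weighted sum of independent scores (missing weights default to 1, an assumption)."""
    return sum(weights.get(name, 1.0) * value for name, value in scores.items())

weights = load_weights(WEIGHTS_TEXT)
scores = {"skill": 0.5, "location": 0.25}  # hypothetical per-category scores
print(weights)
print(combined_score(scores, weights))
```

With location weighted at 1000, the location score dominates the total, which is why a large weight pushes the search toward arrangements with nearby practice fields.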