Wyscout v3 events sometimes have 10 player formations for (opponent) team
DriesDeprest opened this issue · 5 comments
I noticed that the team / opponent team object in the events of Wyscout v3 event data doesn't always contain an 11 player formation. For example, I've seen events with a "4-4-1" (opponent) team formation. Currently, our serializer crashes when this is the case, as it does not recognize this formation.
When analysing the event data of a match with these 'troublesome' formations, I saw that they appear after a team receives a red card: in the events that follow, that team's formation is described as a 10 player formation.
The team originally had a "4-4-1-1" formation, but after the red card this shifted to "4-4-1".
How do we want to handle this?
Option A:
- Description: We create new generic kloppy formation types for all possible 10 player formations present in Wyscout and use those to describe the formation of the (opponent) team in the data.
- Pro: This keeps the kloppy output close to the raw data input.
- Con: The way we describe 10 player formations would differ between Wyscout and other providers, for which we still use an 11 player formation after a red card. This results in a non-standardized approach across providers.
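A minimal sketch of what Option A could look like. `ExampleFormationType` and `parse_formation` are hypothetical names made up for illustration, not kloppy's actual enum or serializer code, and the listed formations are only a small sample:

```python
from enum import Enum


class ExampleFormationType(Enum):
    """Hypothetical stand-in for a formation enum (not kloppy's real one)."""

    # Existing 11 player formations (sample only)
    FOUR_FOUR_ONE_ONE = "4-4-1-1"
    FOUR_FIVE_ONE = "4-5-1"
    # New 10 player formations that Option A would add
    FOUR_FOUR_ONE = "4-4-1"
    FOUR_THREE_TWO = "4-3-2"


def parse_formation(raw: str) -> ExampleFormationType:
    """Map a raw provider string to an enum member, failing loudly if unknown."""
    try:
        return ExampleFormationType(raw)
    except ValueError:
        raise ValueError(f"Unsupported formation: {raw!r}") from None
```

With the 10 player members present, `parse_formation("4-4-1")` resolves instead of raising, which is the crash described above.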
Option B:
- Description: When we recognize a 10 player formation in the event data, we keep using the last valid 11 player formation observed in the data of that team for all future events of that team.
- Pro: This keeps the behaviour standard over different providers, where after a red card we still use 11 player formations to describe team formations.
- Con: We lose some detail in how a team's formation is described.
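Option B could be sketched roughly as follows. This is an illustration under assumptions, not the actual serializer: `FormationTracker` and `formation_size` are hypothetical helpers, and the formation is assumed to arrive as a dash-separated string of outfield lines such as "4-4-1-1".

```python
from typing import Dict, Optional


def formation_size(formation: str) -> int:
    """Count the outfield players in the formation string, plus the goalkeeper."""
    return sum(int(part) for part in formation.split("-")) + 1


class FormationTracker:
    """Remembers the last 11 player formation seen for each team (hypothetical helper)."""

    def __init__(self) -> None:
        self._last_valid: Dict[str, str] = {}

    def resolve(self, team_id: str, raw_formation: str) -> Optional[str]:
        # An 11 player formation is stored and returned as-is.
        if formation_size(raw_formation) == 11:
            self._last_valid[team_id] = raw_formation
            return raw_formation
        # Otherwise fall back to the last valid formation of that team,
        # or None if no 11 player formation has been seen yet.
        return self._last_valid.get(team_id)
```

For the match described above, a team that was in "4-4-1-1" would keep reporting "4-4-1-1" once the raw data switches to "4-4-1".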
I think my preference would go to option B, to have standard behaviour across different providers.
In my personal opinion, it is always better for the data to reflect reality as accurately as possible.
In our case, it is important to know our own and our opponent's formation, as we analyze "behavior" under different schemes, and that behavior clearly changes when a team has one or more players fewer on the field.
That said, it depends on what most users use Kloppy for, since issues specific to us can be solved, as we have done until now, by doing our own processing on the event data.
@koenvo @JanVanHaaren @probberechts thoughts? I'd like to start implementing this
@dvilches thanks for sharing your take on option A vs B. I understand your need for an accurate description of a team's behaviour to perform qualitative performance analysis.
Since I'm using kloppy to read in data from different providers, having a standardized output across input vendors is more important for my use case than the extra level of detail. Hence my preference for option B.
In the future, however, I think we should extend the possible Enum values of `FormationType` to also include formations for when there are 10/9/8 players on the pitch, and use these for all providers whenever a given team has < 11 players on the pitch.
For Wyscout, we can get the X player formation directly from the `team` or `opponentTeam` properties.
For other providers, where the formation data is not included in each event, we would need to do it in an alternative way. We would need to recognize when a team starts playing with < 11 players (due to a red card or sub off without a sub on) and based on the position (defender / midfielder / attacker) of the player that gets sent off, adapt the formation accordingly.
For example, if team A was playing in a 4-5-1 and their CM gets sent off, we would assume they now play in a 4-4-1 until they change formation again.
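The adjustment for other providers could be sketched like this. The function name `formation_after_dismissal` is hypothetical, and the mapping from position to formation line is an assumption: defenders are taken to be the first number, attackers the last, and a midfielder is removed from the largest middle line.

```python
def formation_after_dismissal(formation: str, line: str) -> str:
    """Return the formation after one player from the given line leaves the pitch.

    `line` is one of "defender", "midfielder" or "attacker".
    """
    counts = [int(part) for part in formation.split("-")]

    if line == "defender":
        idx = 0
    elif line == "attacker":
        idx = len(counts) - 1
    elif line == "midfielder":
        middle = range(1, len(counts) - 1)
        if not middle:
            raise ValueError(f"No midfield line in formation {formation!r}")
        # Assumption: the player is removed from the most populated middle line.
        idx = max(middle, key=lambda i: counts[i])
    else:
        raise ValueError(f"Unknown line: {line!r}")

    counts[idx] -= 1
    # Drop a line entirely if it no longer has any players.
    return "-".join(str(count) for count in counts if count > 0)
```

With these assumptions, a 4-5-1 losing a midfielder becomes 4-4-1, and a 4-4-1-1 losing its attacker also becomes 4-4-1. Real matches will not always follow this rule, since teams often reshuffle after a red card.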
Hi @DriesDeprest, I agree with your perspective. That's why we're clarifying that we can resolve this issue "outside" of Kloppy, and that a quick solution for most users is more important than the "best solution" for us.
Thank you for your continued contributions to the project.
I don't have a strong opinion but I'm leaning towards option B.
In an ideal world, kloppy would be able to represent the actual formations for both teams at each point in a match, but the information that the data providers are offering might be too limited in some cases.