Standardize position types
DriesDeprest opened this issue ยท 7 comments
In a step to further standardize the output of kloppy's Dataset
, I propose to create a PositionType
Enum class, containing the different possible position types on a football pitch, commonly used in the industry. In each of the serializers, we would then map the possible positions ids / labels used by the provider to the equivalent kloppy PositionType
.
A proposition of the values in the class is the following:
class PositionType(Enum):
GK = "GK"
LB = "LB"
LCB = "LCB"
CB = "CB"
RCB = "RCB"
RB = "RB"
LW = "LW"
LM = "LM"
LDM = "LDM"
LCM = "LCM"
LAM = "LAM"
CDM = "CDM"
CM = "CM"
CAM = "CAM"
RW = "RW"
RM = "RM"
RDM = "RDM"
RCM = "RCM"
RAM = "RAM"
LF = "LF"
ST = "ST"
RF = "RF"
The Position class would then be updated as followed:
@dataclass(frozen=True)
class Position:
position_type: PositionType
coordinates: Optional[Point] = None
def __str__(self):
return self.name
@classmethod
def unknown(cls) -> "Position":
return cls(position_id="", name="Unknown")
Not sure what to do with the attributes position_id
or position_name
. We could keep them as provider_position_id
and/or provider_position_name
, but not sure whether that is required.
I would use a hierarchical position mapping (e.g., "Midfielder" > "AM" > "CAM") for two reasons.
- Not all data providers provide positions that are as fine-grained as the ones defined above. If a provider uses "AM" instead of "CAM" it would not be possible to map it to one of the standardized types.
- It makes it more straightforward to select position groups (e.g., all midfielders) without having to specify a list of positions.
Thanks for the feedback. Do you think the hierarchical position groups should be MECE? And how would the pseudocode look like for the PositionType
with the hierarchical mapping?
@probberechts very interesting comment indeed!
When you say "Midfielder" > "AM" > "CAM" would it make sense (although it might be overkill) to break up a positional into different "axes". Instead of going Midfielder, AM, CAM, why not describe the position as Midfielder, Central and Attacking.
Just some idea, which already breaks for LCM and LB, but something along these lines:
- line (Goalkeeper, Defender, Midfielder, Forward)
- area (Central, Left, Right)
- style (Attacking, Defending, Neutral)
- wide (Yes, No)
Not quite sure if it's desirable though, but worth considering maybe
@DriesDeprest I think they should be MECE yes. I can't really think about anything that would overlap actually. I haven't thought about the implementation yet.
@UnravelSports Ultimately it is not a big difference but for me, there is intuitively a natural ordering of these axes as "line" > "style" > "area" which leads to e.g., "Midfielder" > "DM" > "LDM". And "wide" does not seem to add anything. A wide "LDM" would just be a "LM", right? I am not sure whether other people also have the same mental map of positions or whether it is just me ๐
@probberechts, do you think something like this could work?
from enum import Enum
from dataclasses import dataclass
from typing import Optional, Union
class PositionType:
class Goalkeeper(Enum):
GK = "Goalkeeper"
class Defender:
class General(Enum):
DEF = "Defender"
class FullBack(Enum):
LB = "Left Back"
RB = "Right Back"
class CenterBack(Enum):
LCB = "Left Center Back"
CB = "Center Back"
RCB = "Right Center Back"
class Midfielder:
class General(Enum):
MID = "Midfielder"
class Defensive(Enum):
DM = "Defensive Midfield"
LDM = "Left Defensive Midfield"
CDM = "Center Defensive Midfield"
RDM = "Right Defensive Midfield"
class Central(Enum):
CM = "Central Midfield"
LCM = "Left Center Midfield"
CM = "Center Midfield"
RCM = "Right Center Midfield"
class Attacking(Enum):
AM = "Attacking Midfield"
LAM = "Left Attacking Midfield"
CAM = "Center Attacking Midfield"
RAM = "Right Attacking Midfield"
class Wide(Enum):
WM = "Wide Midfield"
LW = "Left Wing"
RW = "Right Wing"
LM = "Left Midfield"
RM = "Right Midfield"
class Attacker:
ATT = "Attacker"
LF = "Left Forward"
ST = "Striker"
RF = "Right Forward"
@dataclass(frozen=True)
class Position:
position_type: Union[Enum, PositionType]
coordinates: Optional['Point'] = None
def __str__(self):
return self.position_type.value
@classmethod
def unknown(cls) -> "Position":
return cls(position_type=PositionType.Goalkeeper.GK)
# Example usage of the updated Position class
specific_position = Position(position_type=PositionType.Midfielder.Defensive.LDM)
print(specific_position) # Output: Left Defensive Midfield
general_position = Position(position_type=PositionType.Midfielder.Defensive.DM)
print(general_position) # Output: Defensive Midfield
This is how I would implement it:
from enum import Enum
class PositionType(Enum):
Goalkeeper = ("Goalkeper", "GK", None)
Defender = ("Defender", "DEF", None)
FullBack = ("Full Back", "FB", "Defender")
LeftBack = ("Left Back", "LB", "FullBack")
RightBack = ("Right Back", "RB", "FullBack")
CenterBack = ("Center Back", "CB", "Defender")
LeftCenterBack = ("Left Center Back", "LCB", "CenterBack")
RightCenterBack = ("Right Center Back", "RCB", "CenterBack")
Midfielder = ("Midfielder", "MID", None)
# Add subtypes of Midfielder here...
Attacker = ("Attacker", "ATT", None)
# Add subtypes of Attacker here...
def __init__(self, long_name, code, parent):
self.long_name = long_name
self.code = code
self._parent = parent
@property
def parent(self):
if self._parent:
return PositionType[self._parent]
return None
def is_subtype_of(self, other):
current = self
while current is not None:
if current == other:
return True
current = current.parent
return False
def __str__(self):
return self.long_name
# Example usage
print(PositionType.LeftBack.parent) # Output: PositionType.FullBack
print(PositionType.Defender.is_subtype_of(PositionType.Defender)) # Output: True
print(PositionType.LeftCenterBack.is_subtype_of(PositionType.Defender)) # Output: True
print(PositionType.LeftBack.is_subtype_of(PositionType.Midfielder)) # Output: False
print(PositionType.LeftCenterBack) # Output: Left Center Back
print(PositionType.LeftCenterBack.code) # Output: LCB
@UnravelSports Ultimately it is not a big difference but for me, there is intuitively a natural ordering of these axes as "line" > "style" > "area" which leads to e.g., "Midfielder" > "DM" > "LDM". And "wide" does not seem to add anything. A wide "LDM" would just be a "LM", right? I am not sure whether other people also have the same mental map of positions or whether it is just me ๐
Yes I was trying to get at the same thing, added the "wide" last minute because I was trying to fit in all the acronyms, but that didn't work ha