Convert StatsBomb's JSON data into easy-to-use CSV format.

Primary language: Python. License: MIT.

StatsBomb JSON parser

Convert competitions/matches/lineups/events JSON data released by StatsBomb into easy-to-use CSV format.

A simple web interface for this package can be found here.

Installation

$ pip install statsbomb

Example usage

  • Parsing the competitions.json file:
import statsbomb as sb

comps = sb.Competitions()
print(len(comps))  # 3
json_data = comps.data  # underlying json data

df = comps.get_dataframe()
print(df)
competition_id  competition_name         country_name              match_available             match_updated               season_id  season_name
37              FA Women's Super League  England                   2018-09-08T07:33:39.356340  2018-09-08T07:33:39.356340  1          2017/2018
43              FIFA World Cup           International             2018-09-08T07:33:39.356340  2018-09-08T14:30:04.356514  3          2018
49              NWSL                     United States of America  2018-09-08T07:33:39.356340  2018-09-08T07:33:39.356340  3          2018
  • Parsing a matches json file:
import statsbomb as sb

matches = sb.Matches(event_id='11', season_id='37')
df = matches.get_dataframe()
print(len(df))  # 7
  • Parsing an events json file to extract shots:
import statsbomb as sb

events = sb.Events(event_id='8658')
df = events.get_dataframe(event_type='shot')
print(len(df))  # 23

print(df)
event_type id index period timestamp minute second possession possession_team play_pattern off_camera team player position duration under_pressure statsbomb_xg key_pass_id body_part type outcome technique first_time follows_dribble redirect one_on_one open_goal deflected start_location_x start_location_y end_location_x end_location_y end_location_z
shot c3ffbb5f-d836-4d33-a02a-3a994990d253 577 1 00:20:51.227 20 51 39 Croatia From Free Kick False Croatia Domagoj Vida Left Center Back 1.013 0.05478843 baafd0a9-1031-46df-82a2-16538d6e94cf Head Open Play Off T Normal 112.0 49.0 119.0 36.7 4.7
shot d7a727de-1b60-47c7-b9fa-10948bb730ed 634 1 00:23:34.907 23 34 45 Croatia From Free Kick False Croatia Ivan Rakitić Left Center Midfield 2.053 0.04375982 9cc48e31-5a52-4074-97b1-5c3eafdd753d Left Foot Open Play Off T Volley 108.0 29.0 120.0 46.9 6.1
shot 20bcdb94-9507-4bed-8315-edddcbb84081 736 1 00:27:53.880 27 53 53 Croatia From Free Kick False Croatia Ivan Perišić Left Wing 0.587 0.12172278 90fdf286-3e32-4646-bcb1-a83a7d51593f Left Foot Open Play Goal Half Volley True True 105.0 32.0 120.0 43.3 0.7
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
  • Save data to CSV:
import statsbomb as sb

events = sb.Events(event_id='8658')
events.save_data(event_type='shot')  # outputs a file named events_8658_shot.csv
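
Once a shots CSV has been written, it can be read back with any CSV tool. The sketch below uses only the Python standard library to count shots and goals and to sum expected goals per team. It assumes the saved file uses the same column names as the DataFrame printed above (team, outcome, statsbomb_xg); that has not been verified against the package. To stay runnable without the StatsBomb data, it first writes a small stand-in file holding the three shots listed above, reduced to four columns. The helpers write_sample_csv and summarise_shots are hypothetical names for illustration and are not part of statsbomb.

```python
import csv
import tempfile
from collections import defaultdict
from pathlib import Path

# Stand-in for events_8658_shot.csv: the three shots shown above, reduced to
# four columns. The real file is assumed to carry the same column names.
SAMPLE_ROWS = [
    {"team": "Croatia", "player": "Domagoj Vida",
     "statsbomb_xg": "0.05478843", "outcome": "Off T"},
    {"team": "Croatia", "player": "Ivan Rakitić",
     "statsbomb_xg": "0.04375982", "outcome": "Off T"},
    {"team": "Croatia", "player": "Ivan Perišić",
     "statsbomb_xg": "0.12172278", "outcome": "Goal"},
]


def write_sample_csv(path):
    """Write the stand-in rows to path (hypothetical helper)."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(SAMPLE_ROWS[0]))
        writer.writeheader()
        writer.writerows(SAMPLE_ROWS)


def summarise_shots(path):
    """Return {team: {"shots": int, "goals": int, "xg": float}} for a shots CSV."""
    summary = defaultdict(lambda: {"shots": 0, "goals": 0, "xg": 0.0})
    with open(path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            team = summary[row["team"]]
            team["shots"] += 1
            team["goals"] += int(row["outcome"] == "Goal")
            team["xg"] += float(row["statsbomb_xg"])
    return dict(summary)


with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "events_8658_shot.csv"
    write_sample_csv(csv_path)
    summary = summarise_shots(csv_path)

print(summary)
```

To run this against real data, skip write_sample_csv and pass the path of the file produced by save_data to summarise_shots.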

Contributing

Clone the repo:

git clone https://github.com/imrankhan17/statsbomb-parser.git
cd statsbomb-parser

Create a virtual environment:

python -m venv env
source env/bin/activate
pip install -r requirements.txt

Or use Docker:

docker build -t statsbomb-parser .

To run the CI pipeline locally, execute the commands listed in the script section of the .travis.yml file. Alternatively, run them with Docker:

docker run -it --rm -v $(pwd):/home -w /home statsbomb-parser python -m pycodestyle --max-line-length=119 statsbomb tests *.py
docker run -it --rm -v $(pwd):/home -w /home statsbomb-parser python -m pylint statsbomb tests *.py
docker run -it --rm -v $(pwd):/home -w /home statsbomb-parser python -m pytest --disable-pytest-warnings --cov=statsbomb --cov-report=html --durations=5 tests/