
A camera calibration challenge on Basketball images. An opportunity to publish at MMSports @ ACMMM and to win 2x $500.



Camera Calibration Challenge

Maintainer: Davide Zambrano from Sportradar (d.zambrano@sportradar.com)

**The DeepSportRadar Challenges will come back next year (2023) with some improvements; stay tuned on our Discord channel!**

We present the "Camera Calibration Challenge" for ACM MMSports 2022, the 5th International ACM Workshop on Multimedia Content Analysis in Sports. This year, MMSports proposes a competition in which participants compete on state-of-the-art problems applied to real-world, sport-specific data. The competition consists of 4 individual challenges, each of which is sponsored by Sportradar with a $1,000.00 prize.

The "Camera Calibration Challenge" aims at predicting the camera calibration parameters from images taken from basketball games. Please refer to Challenge webpage for the general challenge rules.

If you use any DeepSportradar dataset in your research or wish to refer to the baseline results and discussion published in our paper, please use the following BibTeX entry:

@inproceedings{
Van_Zandycke_2022,
author = {Gabriel Van Zandycke and Vladimir Somers and Maxime Istasse and Carlo Del Don and Davide Zambrano},
title = {{DeepSportradar}-v1: Computer Vision Dataset for Sports Understanding with High Quality Annotations},
booktitle = {Proceedings of the 5th International {ACM} Workshop on Multimedia Content Analysis in Sports},
publisher = {{ACM}},
year = 2022,
month = {oct},
doi = {10.1145/3552437.3555699},
url = {https://doi.org/10.1145%2F3552437.3555699}
}

This repo is based on the PyTorch Project Template. We want to thank the authors for providing this tool; please refer to the original repo for the full documentation. This version applies some changes to the original code to specifically adapt it to the "Camera Calibration Challenge" for ACM MMSports 2022.


In a Nutshell

The purpose of this challenge is to predict the camera calibration parameters from a single frame of a basketball game. Participants have access to a dataset of 728 pairs of images and camera calibration parameters. By default, these pairs are divided into train (480), val (164) and test (84) splits. Note that this test split is different from the one on which challenge participants will be evaluated. Therefore, all 728 examples can be used for training.

Participants are encouraged to explore different methods to predict the camera calibration parameters. However, a baseline is provided, as described in the In Details section.

Predictions will be evaluated with the mean squared error, in 3D court coordinates, of the projection of 6 image points (the left, center and right extremities at the middle and at the bottom of the frame).
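The authoritative evaluation code lives in engine/example_evaluation.py; the snippet below is only a rough, non-authoritative sketch of this kind of metric, assuming the ground-truth and predicted calibrations are available as 3x4 projection matrices, that the six reference points are taken at the frame locations described above, and that they are back-projected onto the court plane (z = 0).

```python
import numpy as np

def project_image_points_to_court(P, points_2d):
    """Back-project 2D image points onto the court plane z = 0 using the
    homography induced by the 3x4 projection matrix P (columns for x, y, 1)."""
    H = P[:, [0, 1, 3]]                 # image <- court-plane homography
    H_inv = np.linalg.inv(H)            # court-plane <- image
    pts_h = np.hstack([points_2d, np.ones((points_2d.shape[0], 1))])
    court = (H_inv @ pts_h.T).T
    return court[:, :2] / court[:, 2:3]  # de-homogenize to (x, y)

def projection_mse(P_true, P_pred, image_size=(960, 540)):
    """Mean squared error (in squared court units, e.g. cm^2) between the
    back-projections of the six reference image points."""
    w, h = image_size
    pts = np.array([[0, h / 2], [w / 2, h / 2], [w, h / 2],
                    [0, h],     [w / 2, h],     [w, h]], dtype=float)
    gt = project_image_points_to_court(P_true, pts)
    pred = project_image_points_to_court(P_pred, pts)
    return float(np.mean(np.sum((gt - pred) ** 2, axis=1)))
```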

Installation

A convenience bash script is provided that sets up the python environment needed to run the camera-calibration-challenge project.

The script installs the library into a conda environment along with all dependencies. The conda environment name defaults to camera-calibration, but can be overridden by the user:

./install.sh [my-conda-env]

Otherwise, please make sure to install the proper requirements.

Requirements

As in the original repo, this project relies on:

  • yacs (Yet Another Configuration System)
  • PyTorch (An open source deep learning platform)
  • ignite (High-level library to help with training neural networks in PyTorch)

Moreover, data are handled by the calib3d library, which provides the Calib camera model used throughout this project.

In Details

Download and prepare the dataset

The dataset is hosted on Kaggle as deepsportradar/basketball-instants-dataset. It can be downloaded and unzipped manually into the basketball-instants-dataset/ folder of the project.

Here, we download it programmatically. First, install the kaggle CLI:

pip install kaggle

Go to your Kaggle account page and click Create New API Token; save the downloaded file as ~/.kaggle/kaggle.json so the CLI can authenticate.

kaggle datasets download deepsportradar/basketball-instants-dataset
mkdir basketball-instants-dataset
unzip -qo ./basketball-instants-dataset.zip -d basketball-instants-dataset

The dataset has to be pre-processed before use; please run:

python tools/download_dataset.py --dataset-folder ./basketball-instants-dataset --output-folder dataset

The processed dataset is then contained in a pickle file in the dataset folder. Please refer to the methods in data/datasets/viewds.py as examples of usage. Specifically, the class GenerateSViewDS applies the required transformations and splits the keys into train, val and test. Please consider that the test keys of this dataset are not the ones used for the challenge evaluation (those keys, without annotations, will be provided in a second phase of the challenge). The class SVIEWDS is an example of torch.utils.data.Dataset for PyTorch users. Finally, note that transformations are applied at each query of a key, thus returning a potentially unlimited number of image (view) and calibration-matrix pairs. A pseudo-random transformation is applied to the val and test keys, so the views are fixed for these splits.
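As a rough usage sketch for PyTorch users (the constructor arguments and attribute names below are illustrative; check data/datasets/viewds.py for the real ones), the dataset can be wrapped in a standard DataLoader:

```python
from torch.utils.data import DataLoader
from data.datasets.viewds import GenerateSViewDS, SVIEWDS  # repo-local modules

# Build the view generator; it splits the dataset keys into train/val/test.
# Constructor arguments are omitted here and may be required in practice.
sview_ds = GenerateSViewDS()

# Wrap the training split in a torch Dataset and a standard DataLoader.
# The `.train` attribute is an assumption for this sketch.
train_ds = SVIEWDS(sview_ds.train)
train_loader = DataLoader(train_ds, batch_size=4, shuffle=True, num_workers=2)

for images, targets in train_loader:
    # images: batch of court views; targets: segmentation/calibration targets
    print(images.shape)
    break
```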

The challenge uses the split defined by DeepSportDatasetSplitter (sketched after this list), which:

  1. Uses images from KS-FR-CAEN, KS-FR-LIMOGES and KS-FR-ROANNE arenas for the testing-set.
  2. Randomly samples 15% of the remaining images for the validation-set.
  3. Uses the remaining images for the training-set.
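A rough sketch of that splitting logic, assuming each dataset key exposes the arena it was recorded in (this is not the actual DeepSportDatasetSplitter implementation):

```python
import random

TEST_ARENAS = {"KS-FR-CAEN", "KS-FR-LIMOGES", "KS-FR-ROANNE"}

def split_keys(keys, arena_of, val_ratio=0.15, seed=0):
    """Fixed arenas for testing, a random 15% of the rest for validation,
    and the remainder for training."""
    test = [k for k in keys if arena_of(k) in TEST_ARENAS]
    rest = [k for k in keys if arena_of(k) not in TEST_ARENAS]
    rng = random.Random(seed)
    rng.shuffle(rest)
    n_val = round(val_ratio * len(rest))
    return rest[n_val:], rest[:n_val], test  # train, val, test
```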

The testing-set should be used to evaluate your model, both on the public EvalAI leaderboard that provides the temporary ranking, and when communicating about your method.

The challenge-set will be shared later, without the labels, and will be used for the official ranking. You are free to use the three sets defined above to build the final model on which your method will be evaluated in the EvalAI submission.

Each key in the dataset is associated with an item which contains the images to be used as input and the Calib object from the calib3d library, which is what participants should predict.

Images are created as views of basketball games from the original cameras of the Keemotion system. These images can be considered as single frames of a broadcast basketball game. Indeed, the view creation takes into account the location of the ball, and, in basketball, most of the action happens around the key area under the rim (see the Basketball court page and the utils/intersections.py file for some definitions). All the games in this dataset are from FIBA courts. In this challenge we consider undistorted images only. Camera conventions are described here.

The Calib object is built around the K (calibration), T (translation) and R (rotation) matrices (see the Camera matrix reference).
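To fix conventions, here is a short sketch of how these matrices combine into a projection matrix and how a 3D court point maps to pixels under the standard pinhole model (calib3d wraps this for you; the numeric values below are purely illustrative):

```python
import numpy as np

def project_point(K, R, T, point_3d):
    """Project a 3D court point into pixel coordinates with the standard
    pinhole model: P = K [R | T] (no distortion, as in this challenge)."""
    P = K @ np.hstack([R, T.reshape(3, 1)])   # 3x4 projection matrix
    p = P @ np.append(point_3d, 1.0)          # homogeneous image point
    return p[:2] / p[2]                       # (u, v) in pixels

# Toy calibration, for illustration only.
K = np.array([[1000.0, 0.0, 480.0],
              [0.0, 1000.0, 270.0],
              [0.0,    0.0,   1.0]])
R = np.eye(3)
T = np.array([0.0, 0.0, 1000.0])
print(project_point(K, R, T, np.array([100.0, 50.0, 0.0])))  # -> (580.0, 320.0)
```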

Challenge rules

The challenge goal is to obtain the lowest MSE (in cm) on images that were not seen during training. In particular, the leaderboards that provide rewards will be built on an unannotated challenge set provided late in June.

Competitors are asked to create models that rely only on the provided data for training, except for initial weights that may come from well-established public methods pre-trained on public data (this must be clearly stated in the publication/report).

Please see the challenge page for more details: https://deepsportradar.github.io/challenge.html.

The Baseline

We encourage participants to find innovative solutions to the camera calibration challenge. However, an initial baseline is provided as an example. The baseline is composed of two models: the first is a segmentation model that predicts the 20 lines of the basketball court (DeepLabv3 in modeling/example_model.py); the second finds the 2D intersections of those lines in image space and matches them with the visible 3D locations on the court (see utils/intersections.py). If enough intersection points are found (more than 5), the method cv2.calibrateCamera predicts the camera parameters (see compute_camera_model in modeling/example_camera_model.py). In all other cases, the model returns the average of the training-set camera parameters as a default.
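As a rough illustration of that last step (this is not the actual compute_camera_model code, and the correspondence-gathering is assumed to have already happened), cv2.calibrateCamera can estimate the camera from a single view given matched 3D court points and 2D image intersections:

```python
import numpy as np
import cv2

def calibrate_from_intersections(points_3d, points_2d, image_size=(960, 540)):
    """Estimate K, R, T from matched 3D court points / 2D image intersections.
    Returns None when too few correspondences are found, so the caller can
    fall back to average training-set parameters, as the baseline does."""
    if len(points_3d) <= 5:
        return None
    object_points = [np.asarray(points_3d, dtype=np.float32)]
    image_points = [np.asarray(points_2d, dtype=np.float32)]
    # Rough intrinsic guess so a single, mostly planar view is enough;
    # distortion is disabled since the challenge images are undistorted.
    f0 = 0.5 * (image_size[0] + image_size[1])
    K0 = np.array([[f0, 0, image_size[0] / 2],
                   [0, f0, image_size[1] / 2],
                   [0,  0, 1]], dtype=np.float64)
    flags = (cv2.CALIB_USE_INTRINSIC_GUESS | cv2.CALIB_ZERO_TANGENT_DIST
             | cv2.CALIB_FIX_K1 | cv2.CALIB_FIX_K2 | cv2.CALIB_FIX_K3)
    _, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        object_points, image_points, image_size, K0, None, flags=flags)
    R, _ = cv2.Rodrigues(rvecs[0])   # rotation vector -> 3x3 rotation matrix
    return K, R, tvecs[0]
```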

You can download the baseline weights as:

wget https://arena-data.keemotion.com/tmp/gva/model_best.pkl

Then move it into logs/sviewds_public_baseline.

Training the segmentation model

Once the dataset is downloaded (see Download and prepare the dataset), you can train the baseline by running:

python tools/train_net.py --config_file configs/train_sviewds_public_baseline.yml

Logs and weights will be saved in the OUTPUT_DIR folder specified in the config file; moreover, TensorBoard events will be saved in the runs folder. After training, you can test the segmentation model using the same config file by specifying the TEST.WEIGHT path in the parameters:

python tools/test_net.py --config_file configs/train_sviewds_public_baseline.yml

The config parameter DATASET.EVAL_ON specifies on which split the model will be tested: val or test.

Evaluate the camera calibration model

The script evaluate_net.py runs the inference on the trained model and generates the predictions.json file as required for the submission (see Submission format).

python tools/evaluate_net.py --config_file configs/eval_sviewds_public_baseline.yml

Note that the config file used to evaluate the camera calibration model is different from the one used for training and testing the segmentation model only. Here, the config parameter DATASET.EVAL_ON specifies on which split the model will be tested: val or test. During evaluation, the dataloader now returns, as target, both the segmentation mask and the ground-truth camera calibration parameters. As explained before, the predicted camera parameters are computed from the intersections found by the find_intersections method in utils/intersections.py and the compute_camera_model method in modeling/example_camera_model.py. Once the predictions have been saved, if the flag DATASETS.RUN_METRICS is set to True, the method run_metrics in engine/example_evaluation.py will compare the predicted camera calibration parameters with the corresponding ground-truth parameters (val or test). Please consider that the test keys are not the ones used for the challenge evaluation (those keys, without annotations, will be provided in a second phase of the challenge).

You can now submit the predictions.json on EvalAI for the Test phase and verify that the results are the same.

When the challenge set is released, you will need to set DATASETS.RUN_METRICS to False and only generate the predictions.

CHALLENGE set

This section explains how to generate the predictions.json file for the CHALLENGE set.

Download the challenge set zip file and unzip it:

wget https://arena-data.keemotion.com/tmp/gva/challenge_set.zip
unzip challenge_set.zip -d .

You now have the images in the CHALLENGE folder. For convenience, the images have been generated at a size of 960x540 pixels (INPUT.MULTIPLICATIVE_FACTOR: 2); the corresponding evaluation script will assume this resolution.

To run the inference on these images you will need to modify your config file as:

  • DATASETS.TEST: "challenge"
  • DATASETS.RUN_METRICS: False
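If you prefer to apply these overrides from code rather than editing the YAML, yacs supports merging values from a list; a minimal sketch, assuming the project exposes its default configuration node as cfg (as in the PyTorch Project Template this repo is based on):

```python
from config import cfg  # default yacs CfgNode, assumed to follow the template layout

cfg.merge_from_file("configs/eval_sviewds_public_baseline.yml")
# Point the evaluation at the challenge images and skip metric computation,
# since the challenge set ships without annotations.
cfg.merge_from_list(["DATASETS.TEST", "challenge",
                     "DATASETS.RUN_METRICS", False])
cfg.freeze()
```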

The config file configs/eval_challenge.yml is provided as an example. Then, run:

python tools/evaluate_net.py --config_file configs/eval_challenge.yml

This will create the predictions.json file that needs to be uploaded to EvalAI.

NOTE: the CHALLENGE ground truths and predictions are ordered by numeric key, which corresponds to the image filenames: '0.png', '1.png', '2.png', '3.png', ...
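If you assemble the prediction list yourself, make sure to sort the image filenames numerically rather than lexicographically ('10.png' would otherwise come before '2.png'); for example:

```python
from pathlib import Path

# Numeric sort of the challenge images by filename stem ('0.png', '1.png', ...).
challenge_images = sorted(Path("CHALLENGE").glob("*.png"),
                          key=lambda p: int(p.stem))
```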

Submission format

The submission format is a single JSON file containing a list of dicts. Each dict should contain all the camera parameters: T, K, kc, R, C, P, Pinv, Kinv. Note that the evaluation script retrieves the camera parameters from the projection matrix P; see the class calib3d.Calib. Please consider that the evaluation script follows the provided list of images: an empty dict will be replaced by a diagonal homography (see run_metrics in engine/example_evaluation.py).
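A rough sketch of how such a file can be assembled, assuming each prediction is available as a calib3d.Calib object whose attributes match the parameter names listed above (attribute access may differ in practice; images the model could not solve get an empty dict):

```python
import json
import numpy as np

def calib_to_dict(calib):
    """Serialize one calibration object into the dict format described above.
    Attribute names are assumed to mirror the required parameter names."""
    keys = ["T", "K", "kc", "R", "C", "P", "Pinv", "Kinv"]
    return {k: np.asarray(getattr(calib, k)).tolist() for k in keys}

def write_predictions(calibs, path="predictions.json"):
    """calibs: one calibration per image (or None when unsolved), in the same
    order as the provided image list."""
    records = [calib_to_dict(c) if c is not None else {} for c in calibs]
    with open(path, "w") as f:
        json.dump(records, f)
```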

Once the camera model is provided, the evaluation script projects 6 points from image space to 3D coordinates and computes the mean squared error on these projections.

The prediction file has to be submitted at the EvalAI page of the challenge.

Acknowledgments

Sportradar and collaborators.