Tennis

A Tennis dataset and models for event detection & commentary generation. Discussed in:

"TenniSet: A Dataset for Dense Fine-Grained Event Recognition, Localisation and Description"

NOTE: The results in the paper were obtained with the old Keras models; the new results are with the MXNet and Gluon models.

The Dataset

The tennis dataset consists of 5 matches with manually annotated temporal events and commentary captions.

Individual shots (serve and hit) are used to generate 11 temporal event categories.

More about the sample numbers for these individual classes can be seen below in the split information.

The Annotator

The annotator tool was used to label the videos with dense temporal events.


Data Downloading and Pre-processing

See the data directory for download and organisation information.

Once you have the JSON annotation files from the annotator, you can run utils/annotations/preprocess.py to pre-process the annotations, or you can download the pre-processed annotations from my Google Drive.
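For orientation, below is a minimal sketch of reading the annotation files and counting events per class. The JSON schema here is an assumption (the field names "classes" and "name" are hypothetical); consult the annotator's actual output for the real structure.

```python
import json
from collections import Counter
from pathlib import Path

def count_events(annotation_dir="data/annotations"):
    """Count annotated events per class across all JSON files.

    NOTE: the 'classes'/'name' fields are a hypothetical schema,
    not necessarily what the annotator writes out.
    """
    counts = Counter()
    for path in Path(annotation_dir).glob("*.json"):
        with open(path) as f:
            annotation = json.load(f)
        for event in annotation.get("classes", []):
            counts[event.get("name", "unknown")] += 1
    return counts

if __name__ == "__main__":
    for name, n in count_events().most_common():
        print(f"{name}: {n}")
```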


The Splits

Due to the limited size of the dataset, there are two varieties of train, validation, and test splits. The first (01) uses the entire V010 for validation and testing, while the second (02) splits evenly across all videos.

The resulting statistics per event class are as follows:


The Captions

There is one commentary-style caption for each of the 746 points, as well as another 10,817 captions not aligned to any imagery.

Both groups of captions are utilised to generate a word embedding for the 250 unique words in the vocabulary. The embedding is generated with train_embeddings.py using a SkipGram model. Below, the 100-dimensional word embedding is visualised after t-SNE projection. The full embeddings can be found in data/embeddings-ex.txt.
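As a rough sketch of how the saved embeddings could be inspected, the snippet below assumes data/embeddings-ex.txt is in a word2vec-style text format (one word per line followed by its 100 floats); that format is an assumption, and scikit-learn/matplotlib are external dependencies, not part of this repo.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Assumed format per line: "<word> <f1> <f2> ... <f100>".
words, vecs = [], []
with open("data/embeddings-ex.txt") as f:
    for line in f:
        parts = line.rstrip().split()
        if len(parts) < 2:
            continue  # skip any header or blank lines
        words.append(parts[0])
        vecs.append(np.asarray(parts[1:], dtype=np.float32))

# Project the 100-d embeddings to 2-d for plotting.
points = TSNE(n_components=2, random_state=0).fit_transform(np.stack(vecs))

plt.figure(figsize=(10, 10))
plt.scatter(points[:, 0], points[:, 1], s=4)
for (x, y), w in zip(points, words):
    plt.annotate(w, (x, y), fontsize=6)
plt.title("t-SNE of the 100-d word embeddings")
plt.show()
```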

The Models

There are a number of different model architectures to choose from; more information and download links for pretrained models can be found in the README in the models directory.


Event Detection

These models are trained with train.py and evaluated with evaluate.py.

Features can be extracted with the --save_feats argument; they are saved as .npy files in data/features/$model_id$/, mirroring the structure of data/frames/.
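Since the saved features mirror data/frames/, they can be reloaded for downstream experiments with plain NumPy. A minimal sketch, assuming the layout above (the model ID string used here is a placeholder):

```python
import numpy as np
from pathlib import Path

def load_features(model_id, root="data/features"):
    """Load every extracted .npy feature array for one model,
    keyed by its path relative to the model's feature directory
    (which mirrors data/frames/)."""
    base = Path(root, model_id)
    return {str(p.relative_to(base)): np.load(p) for p in base.rglob("*.npy")}

features = load_features("0001")  # "0001" is a hypothetical model_id
print(len(features), "feature arrays loaded")
```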

The table below shows the F1 scores per class on the test set for some of the different models.

Below is a video of the CNN-RNN model on the 02 test set. This can be generated by passing --vis when running evaluate.py.

YouTube video of results
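As a point of reference for the per-class F1 numbers, framewise per-class F1 can be computed with scikit-learn; this is a sketch with placeholder labels, not necessarily what evaluate.py does internally:

```python
from sklearn.metrics import f1_score

# Toy example: per-class F1 over framewise event labels.
# The class names are placeholders, not the dataset's real label set.
y_true = ["serve", "hit", "other", "hit", "serve", "other"]
y_pred = ["serve", "other", "other", "hit", "serve", "hit"]

classes = sorted(set(y_true))
scores = f1_score(y_true, y_pred, labels=classes, average=None)
for cls, f1 in zip(classes, scores):
    print(f"{cls}: {f1:.2f}")
```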


Captioning

NOTE: The captioning scripts require the nlg-eval package. Please install it beforehand as recommended by their README.

These models are trained with train_gnmt.py and evaluated with evaluate_gnmt.py.
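For reference, nlg-eval also exposes a Python API for scoring generated captions against references. A minimal sketch of how it can be called (the captions below are dummies, and skipping the embedding-based metrics is just a speed choice):

```python
from nlgeval import NLGEval

# Load only the word-overlap metrics (BLEU, METEOR, ROUGE-L, CIDEr);
# skipping the heavyweight embedding-based metrics speeds up loading.
nlgeval = NLGEval(no_skipthoughts=True, no_glove=True)

references = [["the player serves wide to the deuce court"]]  # list of reference lists
hypotheses = ["the player serves to the deuce court"]

metrics = nlgeval.compute_metrics(ref_list=references, hyp_list=hypotheses)
print(metrics)
```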

The table below shows some example generated captions on the test split; underlining marks errors. 03 represents the point in the GIF at the top of this page.

Sharing is Caring

If you find any of the data or models useful, please reference and cite:

@inproceedings{faulkner2017tenniset,
  title={TenniSet: A Dataset for Dense Fine-Grained Event Recognition, Localisation and Description},
  author={Faulkner, Hayden and Dick, Anthony},
  booktitle={2017 International Conference on Digital Image Computing: Techniques and Applications (DICTA)},
  pages={1--8},
  year={2017},
  organization={IEEE}
}