C9-Audio-Based-Interaction-Recognition

Challenge

To participate and submit to this challenge, register at the EPIC-SOUNDS Audio-Based Interaction Recognition Codalab Challenge. The labelled train/val annotations, along with the recognition test set timestamps, are available in the EPIC-Sounds annotations repo. The baseline models can also be found here, where the inference script src/tools/test_net.py can be used as a template to correctly format model scores for the create_submission.py and evaluate.py scripts.

This repo is a modified version of the existing Action Recognition Challenge.

NOTE: For this version of the challenge (version "0.1"), the class "background" (class_id=13) has been redacted from the test set. The argument --redact_background is supported in evaluate.py to remove background labels from your validation set evaluation.
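
In practice, redacting the background class just means dropping segments labelled with class_id 13 before computing metrics, which is what evaluate.py does for you. If you want to replicate this on your own copy of the validation labels, a minimal sketch follows; the CSV path and the class_id column name are assumptions, so check the annotations repo for the exact schema.

import pandas as pd

BACKGROUND_CLASS_ID = 13  # "background", redacted from the test set

# Hypothetical file and column names; see the annotations repo for the
# actual validation annotation schema.
val_labels = pd.read_csv('epic-sounds-annotations/EPIC_Sounds_validation.csv')
val_labels_no_bg = val_labels[val_labels['class_id'] != BACKGROUND_CLASS_ID]
print(f'Kept {len(val_labels_no_bg)} of {len(val_labels)} validation segments')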

Result data formats

We support two formats for model results:

  • List format:
[
    {
        'interaction_output': Iterable of float, shape [44],
        'annotation_id': str, e.g. 'P01_101_1'
    }, ... # repeated for all segments in the val/test set.
]
  • Dict format:
{
    'interaction_output': np.ndarray of float32, shape [N, 44],
    'annotation_id': np.ndarray of str, shape [N,]
}

Either of these formats can be saved via torch.save with a .pt or .pyth suffix, or with pickle.dump with a .pkl suffix.

Note that either of these layouts can be stored in a .pkl/.pt file; the dict format does not necessarily have to be saved as a .pkl.
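
As a concrete illustration, here is a minimal sketch of assembling and saving scores in the dict format. The annotation ids and scores are dummy values; in practice they come from your inference loop (e.g. src/tools/test_net.py).

import pickle

import numpy as np

NUM_CLASSES = 44  # number of interaction classes

# Dummy values for illustration only.
annotation_ids = np.array(['P01_101_1', 'P01_101_2'])
scores = np.random.rand(len(annotation_ids), NUM_CLASSES).astype(np.float32)

results = {
    'interaction_output': scores,       # np.ndarray of float32, shape [N, 44]
    'annotation_id': annotation_ids,    # np.ndarray of str, shape [N,]
}

# Saving with pickle; torch.save(results, 'val_scores.pt') is equally valid.
with open('val_scores.pkl', 'wb') as f:
    pickle.dump(results, f)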

Evaluating model results

We provide an evaluation script to compute, on the validation set, the metrics we report in the paper. To use it, you will also need to clone the annotations repo.
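
Before running the official script, you can sanity-check your saved scores with a quick top-1 accuracy computation. This is a minimal sketch, not the official evaluate.py; the validation CSV path and the annotation_id/class_id column names are assumptions, so check the annotations repo for the exact schema.

import pickle

import pandas as pd

# Dict-format scores saved as in the sketch above.
with open('val_scores.pkl', 'rb') as f:
    results = pickle.load(f)
scores = results['interaction_output']       # [N, 44]
annotation_ids = results['annotation_id']    # [N,]

# Hypothetical file and column names; check the annotations repo.
labels = pd.read_csv('epic-sounds-annotations/EPIC_Sounds_validation.csv')
label_map = dict(zip(labels['annotation_id'], labels['class_id']))

predictions = scores.argmax(axis=1)
correct = sum(int(p) == label_map[a] for p, a in zip(predictions, annotation_ids))
print(f'Top-1 accuracy: {correct / len(annotation_ids):.3f}')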