
Diver segmentation for the Diving48 dataset (out of necessity, since an off-the-shelf DeepLabV3 trained on MS-COCO does not recognize divers as the Person class).


This repository contains code to 1) fine-tune a Mask R-CNN to segment diver instances from the Diving48 dataset, and 2) perform inference with such a model and save the segmented clips as videos. The code is used in the paper Recur, Attend or Convolve? Frame Dependency Modeling Matters for Cross-Domain Robustness in Action Recognition by Broomé et al., arXiv:2112.12175.

The manually labelled frames can be downloaded from Harvard Dataverse. A trained model checkpoint is also available on the same page (search for "checkpoint" among the files).

Please cite our paper if you find this code or dataset useful for your work.

      title={{Recur, Attend or Convolve? On Whether Frame Dependency Modeling Matters for Cross-Domain Robustness in Action Recognition}}, 
      author={Sofia Broomé and Ernest Pokropek and Boyu Li and Hedvig Kjellström},
      booktitle = {IEEE Winter Conference on Applications in Computer Vision (WACV)},
      month = {January}, 