
Uncertainty-Aware Domain Adaptation for Action Recognition


This is the official PyTorch implementation of our paper:

Uncertainty-Aware Domain Adaptation for Action Recognition

ABSTRACT

Domain Adaptation (DA) has been a crucial topic for action recognition, as the test set and training set are not always drawn from the identical distribution, which leads to significant performance degradation. Existing research focuses on DA methods based on entire videos, ignoring the different contributions of different samples and regions. In this paper, we propose an uncertainty-aware domain adaptation method for action recognition from a new perspective. The aleatoric uncertainty is first used in the classifier to improve performance by alleviating the impact of noisy labels. Then the aleatoric uncertainty, calculated with a Bayesian Neural Network, is embedded in the discriminator to help the network focus on the spatial areas and temporal clips with lower uncertainty during training. A spatial-temporal attention map is generated to enhance the features with the guidance of the backward pass. Extensive experiments are conducted on both small-scale and large-scale datasets, and the results indicate that the proposed method achieves competitive performance with a lower computational workload.

Contents


Requirements

  • Supports Python 3.6, PyTorch 0.4, CUDA 9.0, cuDNN 7.1.4
  • Install all the libraries with: pip install -r requirements.txt

Dataset Preparation

Data structure

You need to extract frame-level features for each video to run the code. To extract features, please check dataset_preparation/.

Folder Structure:

DATA_PATH/
  DATASET/
    list_DATASET_SUFFIX.txt
    RGB/
      CLASS_01/
        VIDEO_0001.mp4
        VIDEO_0002.mp4
        ...
      CLASS_02/
      ...

    RGB-Feature/
      VIDEO_0001/
        img_00001.t7
        img_00002.t7
        ...
      VIDEO_0002/
      ...

RGB-Feature/ contains all the feature vectors for training/testing. RGB/ contains all the raw videos.

There should be at least two DATASET folders: the source training set and the validation set. If you want to do domain adaptation, you need another DATASET: the target training set.

File lists for training/validation

The file list list_DATASET_SUFFIX.txt is required for data feeding. Each line in the list contains the full path of the video folder, the video frame count, and the video class index. It looks like:

DATA_PATH/DATASET/RGB-Feature/VIDEO_0001/ 100 0
DATA_PATH/DATASET/RGB-Feature/VIDEO_0002/ 150 1
......

To generate the file list, please check dataset_preparation/.
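As a rough sketch of what such a generator does, the function below scans each video folder under RGB-Feature/, counts its frame features, and writes one line per video. The class-mapping arguments are assumptions for illustration; the maintained scripts are in dataset_preparation/.

```python
import os

def make_file_list(feature_root, class_to_idx, video_to_class, out_path):
    """Write one line per video: <video folder> <frame count> <class index>.

    feature_root:   e.g. DATA_PATH/DATASET/RGB-Feature/
    class_to_idx:   {"CLASS_01": 0, ...}        (assumed mapping)
    video_to_class: {"VIDEO_0001": "CLASS_01"}  (assumed mapping)
    """
    with open(out_path, "w") as f:
        for video in sorted(os.listdir(feature_root)):
            video_dir = os.path.join(feature_root, video)
            if not os.path.isdir(video_dir):
                continue
            n_frames = len([x for x in os.listdir(video_dir) if x.endswith(".t7")])
            label = class_to_idx[video_to_class[video]]
            f.write("%s/ %d %d\n" % (video_dir, n_frames, label))

# Tiny usage example with a dummy feature folder of 3 frames.
os.makedirs("demo/RGB-Feature/VIDEO_0001", exist_ok=True)
for i in range(3):
    open("demo/RGB-Feature/VIDEO_0001/img_%05d.t7" % (i + 1), "w").close()
make_file_list("demo/RGB-Feature", {"CLASS_01": 0},
               {"VIDEO_0001": "CLASS_01"}, "list_demo.txt")
```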

Input data

Here we provide pre-extracted features and data list files, so you can skip the above two steps and directly try our training/testing code. You may need to manually edit the paths in the data list files.


Usage

  • training/validation: Run ./script_train_val.sh

All the commonly used variables/parameters have comments at the end of the line. Please check Options.

Training

All the outputs will be under the directory exp_path.

  • Outputs:
    • model weights: checkpoint.pth.tar, model_best.pth.tar
    • log files: train.log, train_short.log, val.log, val_short.log
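The .pth.tar checkpoints can be inspected with torch.load. A minimal sketch in the usual PyTorch style (the dictionary keys 'epoch' and 'state_dict' are assumptions for illustration; check the saving code for the actual layout):

```python
import torch
import torch.nn as nn

# Sketch: save and reload a checkpoint in the common .pth.tar style.
# The key names ('epoch', 'state_dict') are assumptions for illustration.
model = nn.Linear(4, 2)
torch.save({"epoch": 1, "state_dict": model.state_dict()}, "checkpoint.pth.tar")

ckpt = torch.load("checkpoint.pth.tar")
model.load_state_dict(ckpt["state_dict"])
print(sorted(ckpt.keys()))
```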

Testing

You can choose one of the model weights for testing. All the outputs will be under the directory exp_path.

  • Outputs:
    • score_data: used to check the model output (scores_XXX.npz)
    • confusion matrix: confusion_matrix_XXX.png and confusion_matrix_XXX-topK.txt
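The scores_XXX.npz files can be inspected with NumPy. A small sketch below writes and reads a score file; the array name 'scores' is an assumption, so use np.load(...).files to list the actual keys in the real output.

```python
import numpy as np

# Sketch: write and inspect a score file in .npz format.
# The array name 'scores' is an assumption for illustration.
np.savez("scores_demo.npz", scores=np.array([[0.1, 0.7, 0.2],
                                             [0.8, 0.1, 0.1]]))

data = np.load("scores_demo.npz")
scores = data["scores"]        # shape: (num_videos, num_classes)
preds = scores.argmax(axis=1)  # predicted class index per video
```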

Options

Domain Adaptation

In ./script_train_val.sh, there are several options related to our DA approaches.

  • use_target: switches the DA mode on/off
    • none: do not use target data (no DA)
    • uSv/Sv: use target data in an unsupervised/supervised way
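For example, the DA mode might be set near the top of the script like this (the exact variable syntax is a sketch; check ./script_train_val.sh for the real line):

```shell
# Hypothetical excerpt from script_train_val.sh:
use_target=uSv   # none | Sv | uSv  (uSv = unsupervised DA)
```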

More options

For more details of all the arguments, please check opts.py.

Notes

The options in the scripts have comments with the following types:

  • no comment: users can still change it, but it is NOT recommended (may require changing the code or give different experimental results)
  • comments with choices (e.g. true | false): can only choose from the listed choices
  • comments marked depend on users: totally up to users (mostly related to data paths)