This is the official PyTorch implementation of our papers:
Uncertainty-Aware Domain Adaptation for Action Recognition
Domain Adaptation (DA) has been a crucial topic for action recognition, as the test set and training set are not always drawn from the same distribution, which leads to significant performance degradation. Existing research focuses on DA methods that operate on entire videos, ignoring the different contributions of individual samples and regions. In this paper, we propose an uncertainty-aware domain adaptation method for action recognition from a new perspective. Aleatoric uncertainty is first used in the classifier to improve performance by alleviating the impact of noisy labels. The aleatoric uncertainty computed with a Bayesian Neural Network is then embedded in the discriminator to help the network focus on the spatial areas and temporal clips with lower uncertainty during training. A spatial-temporal attention map is generated to enhance the features under the guidance of the backward pass. Extensive experiments are conducted on both small-scale and large-scale datasets, and the results indicate that the proposed method achieves competitive performance with a lower computational workload.
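For intuition on the classifier side, here is a minimal sketch of a classification loss attenuated by a learned per-sample log-variance (aleatoric uncertainty). The module and function names are assumptions for illustration and are not taken from this repository's code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UncertaintyAwareHead(nn.Module):
    """Toy classifier head that predicts class logits plus a per-sample
    log-variance (aleatoric uncertainty). Names are illustrative only."""
    def __init__(self, feat_dim, num_classes):
        super(UncertaintyAwareHead, self).__init__()
        self.cls_head = nn.Linear(feat_dim, num_classes)
        self.log_var_head = nn.Linear(feat_dim, 1)

    def forward(self, feat):
        return self.cls_head(feat), self.log_var_head(feat).squeeze(-1)

def uncertainty_weighted_ce(logits, log_var, labels):
    """Cross-entropy attenuated by the predicted uncertainty: samples the
    network deems noisy (large variance) are down-weighted, while the
    log-variance term keeps the uncertainty from growing without bound."""
    ce = F.cross_entropy(logits, labels, reduction='none')
    return (torch.exp(-log_var) * ce + 0.5 * log_var).mean()
```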
- Support Python 3.6, PyTorch 0.4, CUDA 9.0, cuDNN 7.1.4
- Install all the required libraries with:

```
pip install -r requirements.txt
```
You need to extract frame-level features for each video to run the code. To extract features, please check `dataset_preparation/`.
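For reference, frame-level feature extraction usually means running each frame through a pretrained 2D CNN and saving the pooled vector. The sketch below is only an assumption of how such `.t7` files could be produced with a torchvision ResNet and `torch.save`; it is not the script in `dataset_preparation/`.

```python
import os
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Hypothetical sketch: one pooled feature vector per frame from a pretrained ResNet.
backbone = models.resnet101(pretrained=True)
feature_extractor = torch.nn.Sequential(*list(backbone.children())[:-1])  # drop the FC layer
feature_extractor.eval()

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def extract_video_features(frame_dir, out_dir):
    os.makedirs(out_dir, exist_ok=True)
    for name in sorted(os.listdir(frame_dir)):                    # e.g. img_00001.jpg, ...
        img = preprocess(Image.open(os.path.join(frame_dir, name)).convert('RGB'))
        with torch.no_grad():
            feat = feature_extractor(img.unsqueeze(0)).view(-1)   # (2048,)
        # save one .t7 file per frame, matching the folder structure below
        torch.save(feat, os.path.join(out_dir, os.path.splitext(name)[0] + '.t7'))
```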
Folder Structure:

```
DATA_PATH/
  DATASET/
    list_DATASET_SUFFIX.txt
    RGB/
      CLASS_01/
        VIDEO_0001.mp4
        VIDEO_0002.mp4
        ...
      CLASS_02/
        ...
    RGB-Feature/
      VIDEO_0001/
        img_00001.t7
        img_00002.t7
        ...
      VIDEO_0002/
        ...
```
`RGB-Feature/` contains all the feature vectors for training/testing. `RGB/` contains all the raw videos.

There should be at least two `DATASET` folders: the source training set and the validation set. If you want to do domain adaptation, you need another `DATASET` folder: the target training set.
The file list `list_DATASET_SUFFIX.txt` is required for data feeding. Each line in the list contains the full path of the video folder, the number of video frames, and the video class index. It looks like:

```
DATA_PATH/DATASET/RGB-Feature/VIDEO_0001/ 100 0
DATA_PATH/DATASET/RGB-Feature/VIDEO_0002/ 150 1
...
```

To generate the file list, please check `dataset_preparation/`.
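As a rough illustration of the format above, such a list could be generated by walking the `RGB-Feature/` folders and counting the `.t7` files per video. The helper below is a hypothetical sketch (the `class_to_video_labels` mapping is assumed to be given) and is not the script in `dataset_preparation/`.

```python
import os

def write_data_list(dataset_root, class_to_video_labels, out_file):
    """Hypothetical helper: dataset_root points at DATA_PATH/DATASET, and
    class_to_video_labels maps a video folder name to its class index."""
    feature_root = os.path.join(dataset_root, 'RGB-Feature')
    with open(out_file, 'w') as f:
        for video in sorted(os.listdir(feature_root)):
            video_dir = os.path.join(feature_root, video)
            num_frames = len([n for n in os.listdir(video_dir) if n.endswith('.t7')])
            label = class_to_video_labels[video]
            # one line per video: path, frame count, class index
            f.write('%s/ %d %d\n' % (video_dir, num_frames, label))
```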
Here we provide pre-extracted features and data list files, so you can skip the above two steps and directly try our training/testing code. You may need to manually edit the paths in the data list files.
- Features
  - UCF: download
  - HMDB: download
  - Olympic: training | validation
- Data lists
  - UCF-Olympic
    - UCF: training list | validation list
    - Olympic: training list | validation list
  - UCF-HMDBsmall
    - UCF: training list | validation list
    - HMDB: training list | validation list
  - UCF-HMDBfull
    - UCF: training list | validation list
    - HMDB: training list | validation list
- Kinetics-Gameplay: please fill in this form to request the features and data lists.
The Kinetics-Gameplay dataset is licensed under CC BY-NC-SA 4.0 for non-commercial purposes only.
- Training/validation: run `./script_train_val.sh`. All the commonly used variables/parameters have comments at the end of the line; please check Options. All the outputs will be under the directory `exp_path`.
- Outputs:
  - model weights: `checkpoint.pth.tar`, `model_best.pth.tar`
  - log files: `train.log`, `train_short.log`, `val.log`, `val_short.log`
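As a quick sanity check, the saved weights can be inspected as in the sketch below. The checkpoint keys used here (`state_dict`, etc.) are an assumption based on common PyTorch conventions; adapt them to whatever the training script actually stores.

```python
import torch

# Hypothetical sketch: inspect a saved checkpoint. The key names may differ
# from what this repository actually writes.
checkpoint = torch.load('exp_path/model_best.pth.tar', map_location='cpu')
print(checkpoint.keys())                  # e.g. dict_keys(['epoch', 'state_dict', ...])
state_dict = checkpoint.get('state_dict', checkpoint)
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))      # peek at the first few parameter tensors
```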
- Testing: you can choose one of the saved model weights for testing. All the outputs will be under the directory `exp_path`.
- Outputs:
  - score_data: used to check the model output (`scores_XXX.npz`)
  - confusion matrix: `confusion_matrix_XXX.png` and `confusion_matrix_XXX-topK.txt`
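To inspect the predictions, the `.npz` score file can be loaded with NumPy as sketched below; the array names inside the archive are an assumption, so check `data.files` first.

```python
import numpy as np

# Hypothetical sketch: inspect a saved score file. The actual array names
# inside the archive may differ, so list data.files first.
data = np.load('exp_path/scores_XXX.npz')
print(data.files)                          # names of the stored arrays
scores = data[data.files[0]]               # e.g. (num_videos, num_classes) class scores
predictions = scores.argmax(axis=1)        # predicted class index per video
print(predictions[:10])
```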
In `./script_train_val.sh`, there are several options related to our DA approaches:

- `use_target`: switch the DA mode on/off
  - `none`: do not use target data (no DA)
  - `uSv` / `Sv`: use target data in an unsupervised/supervised way
For more details of all the arguments, please check `opts.py`.
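For orientation, a switch like `use_target` could be declared with argparse roughly as follows; this is a hypothetical sketch, not code copied from `opts.py`.

```python
import argparse

# Hypothetical sketch of how a DA switch like use_target could be declared;
# the real definitions live in opts.py and may differ.
parser = argparse.ArgumentParser(description='Uncertainty-aware DA for action recognition')
parser.add_argument('--use_target', type=str, default='none',
                    choices=['none', 'uSv', 'Sv'],
                    help='none: no DA | uSv: unsupervised DA | Sv: supervised DA')
args = parser.parse_args(['--use_target', 'uSv'])   # example invocation
print(args.use_target)
```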
The options in the scripts have comments of the following types:

- no comment: users can still change it, but it is NOT recommended (you may need to change the code or may get different experimental results)
- comments with choices (e.g. `true | false`): can only be chosen from the listed choices
- comments marked `depend on users`: depend entirely on the user (mostly related to data paths)