Unseen Object Segmentation in Videos via Transferable Representations, ACCV 2018 (oral)

TransferSeg

Caffe implementation of our method for transferring knowledge from seen objects in images to unseen objects in videos.
Contact: Yi-Wen Chen (chenyiwena at gmail dot com)

Paper

Please cite our paper if you find it useful for your research.

Unseen Object Segmentation in Videos via Transferable Representations
Yi-Wen Chen, Yi-Hsuan Tsai, Chu-Ya Yang, Yen-Yu Lin and Ming-Hsuan Yang
Asian Conference on Computer Vision (ACCV), 2018 (oral)
Best Student Paper Award Honorable Mention

@inproceedings{Chen_TransferSeg_2018,
  author = {Yi-Wen Chen and Yi-Hsuan Tsai and Chu-Ya Yang and Yen-Yu Lin and Ming-Hsuan Yang},
  booktitle = {Asian Conference on Computer Vision (ACCV)},
  title = {Unseen Object Segmentation in Videos via Transferable Representations},
  year = {2018}
}

VOSTR: Video Object Segmentation via Transferable Representations
Yi-Wen Chen, Yi-Hsuan Tsai, Yen-Yu Lin and Ming-Hsuan Yang
International Journal of Computer Vision (IJCV), 2020

@article{Chen_VOSTR_2020,
  author = {Yi-Wen Chen and Yi-Hsuan Tsai and Yen-Yu Lin and Ming-Hsuan Yang},
  journal = {International Journal of Computer Vision (IJCV)},
  title = {VOSTR: Video Object Segmentation via Transferable Representations},
  volume = {128},
  number = {4},
  pages = {931--949},
  year = {2020}
}

Installation

git clone https://github.com/wenz116/TransferSeg.git
cd TransferSeg
  • Prepare for MBS
  1. Go to the folder utils/MBS/mex.

  2. Set the OpenCV include and library paths in compile.m (Linux) or compile_win.m (Windows).

  3. Run compile (Linux) or compile_win (Windows) in MATLAB.

Dataset

  • Download the PASCAL VOC Dataset as the source image dataset, and put it in the data/PASCAL/VOC2011 folder.

  • Download the DAVIS Dataset as the target video dataset, and put it in the data/DAVIS folder.
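
Before training, it can help to confirm that both datasets sit where the scripts expect them. The following is a minimal sketch, not part of the repository: the helper name missing_dataset_dirs is hypothetical, and it assumes it is run from the repository root. Only the two folder paths come from the instructions above.

```python
from pathlib import Path

# Dataset folders named in this README, relative to the repository root.
EXPECTED_DIRS = ("data/PASCAL/VOC2011", "data/DAVIS")


def missing_dataset_dirs(repo_root="."):
    """Return the expected dataset folders that do not exist under repo_root.

    Hypothetical helper for illustration; it only checks that the folders
    exist, not that their contents are complete.
    """
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing dataset folders: " + ", ".join(missing))
    else:
        print("Both dataset folders are in place.")
```

An empty list means both folders were found.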

Training

  • Download the FCN model pre-trained on PASCAL VOC, and put it in the nets folder.

  • Go to the folder scripts.

  1. Compute optical flow of the input video. Run compute_optical_flow('<VIDEO_NAME>') in MATLAB. The optical flow images will be saved at data/DAVIS/Motion/480p/<VIDEO_NAME>/.

  2. Compute motion prior of the input video via minimum barrier distance. Run get_prior('<VIDEO_NAME>') in MATLAB. The motion prior images will be saved at data/DAVIS/Prior/480p/<VIDEO_NAME>/.

  3. Extract features of each category in PASCAL VOC. The extracted features will be saved at cache/features/, named as features_PASCAL_<CLASS_NAME>_fc7.p.

python get_feature_PASCAL.py <GPU_ID>

  4. Extract features of the input video. The extracted features will be saved at cache/features/, named as features_DAVIS_<VIDEO_NAME>_fc7.p.

python get_feature_DAVIS.py <GPU_ID> <VIDEO_NAME>

  5. Segment mining. The selected segments will be saved at data/DAVIS/Train/480p/<VIDEO_NAME>/.

python get_score.py <GPU_ID> <VIDEO_NAME>

  6. Self learning. The trained models will be saved at output/snapshot/.

./train.sh <GPU_ID> <VIDEO_NAME>
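
The command-line steps above can be chained for a single video once the two MATLAB steps have finished. The following is a minimal sketch and not part of the repository: the helpers training_commands and run_training are hypothetical, the video name blackswan is only an example, and the sketch assumes Python 3 for the driver, a python executable on PATH that suits the repository's scripts, and that it is launched from the repository root. The script names and argument order are taken from the steps above.

```python
import subprocess


def training_commands(gpu_id, video_name):
    """Build the command lines for the Python and shell steps, in order."""
    gpu = str(gpu_id)
    return [
        ["python", "get_feature_PASCAL.py", gpu],
        ["python", "get_feature_DAVIS.py", gpu, video_name],
        ["python", "get_score.py", gpu, video_name],
        ["./train.sh", gpu, video_name],
    ]


def run_training(gpu_id, video_name, scripts_dir="scripts", dry_run=True):
    """Print each command and, unless dry_run is set, run it in scripts_dir.

    check=True stops the chain at the first step that exits with an error,
    so later steps never run on missing features or segments.
    """
    for cmd in training_commands(gpu_id, video_name):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, cwd=scripts_dir, check=True)


if __name__ == "__main__":
    # Dry run: only prints the four commands for GPU 0 and an example video.
    run_training(0, "blackswan")
```

Passing dry_run=False executes the steps. The dry run is the default so the commands can be inspected before any GPU time is spent.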

Note

The model and code are available for non-commercial research purposes only.

  • 12/2018: code released