
VST in MSRA


For more details, please see the VST GitHub code and the VST paper.


Requirements

$ pip install -r /utils/requirements.txt
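If the install succeeds, a quick sanity check like the sketch below can confirm that PyTorch sees a GPU before training (this assumes PyTorch is among the listed requirements; the snippet is not part of the original repo):

# Optional sanity check: confirm the PyTorch install and GPU visibility.
import torch

print("torch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())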

VST for SOD

Data Preparation for SOD

Training Set

We use the training set of DUTS to train our VST for RGB SOD. In addition, we follow EGNet to generate contour maps of the DUTS training set for training. You can directly download the generated contour maps (DUTS-TR-Contour) from [baidu pan fetch code: ow76 | Google drive] and put them into the Data folder.

Testing Set

We use the testing sets of DUTS, ECSSD, HKU-IS, PASCAL-S, DUT-O, and SOD to test our VST. After downloading, put them into the Data folder.

Your Data folder should look like this:

-- Data
   |-- DUTS
   |   |-- DUTS-TR
   |   |-- | DUTS-TR-Image
   |   |-- | DUTS-TR-Mask
   |   |-- | DUTS-TR-Contour
   |   |-- DUTS-TE
   |   |-- | DUTS-TE-Image
   |   |-- | DUTS-TE-Mask
   |-- ECSSD
   |   |--images
   |   |--GT
   ...
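As a convenience, a small layout check such as the sketch below (not part of the original repo) can catch misplaced folders before training; the paths simply mirror the tree above:

import os

# Expected SOD data layout, mirroring the directory tree in this README.
# Adjust the root or subfolder names if your local paths differ.
EXPECTED = [
    "Data/DUTS/DUTS-TR/DUTS-TR-Image",
    "Data/DUTS/DUTS-TR/DUTS-TR-Mask",
    "Data/DUTS/DUTS-TR/DUTS-TR-Contour",
    "Data/DUTS/DUTS-TE/DUTS-TE-Image",
    "Data/DUTS/DUTS-TE/DUTS-TE-Mask",
    "Data/ECSSD/images",
    "Data/ECSSD/GT",
]

for path in EXPECTED:
    status = "ok" if os.path.isdir(path) else "MISSING"
    print(f"{status:8s} {path}")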

Training, Testing, and Evaluation

  1. Download the pretrained T2T-ViT_t-14 model [baidu pan fetch code: 2u34 | Google drive] and put it into the pretrained_model/ folder.
  2. Run python train_test_eval.py --Training True --Testing True --Evaluation True for training, testing, and evaluation. The predictions will be saved in the preds/ folder and the evaluation results will be written to the result.txt file. (A sketch of how these boolean-style flags are typically parsed follows below.)
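The exact argument handling lives in train_test_eval.py; the sketch below only illustrates how string flags of the form --Training True are commonly converted to booleans with argparse, and may not match the script verbatim:

import argparse

# Illustrative only: convert "True"/"False" strings to Python booleans, matching
# the command-line style shown in step 2 above.
def str2bool(value: str) -> bool:
    return value.lower() in ("true", "1", "yes")

parser = argparse.ArgumentParser()
parser.add_argument("--Training", type=str2bool, default=False)
parser.add_argument("--Testing", type=str2bool, default=False)
parser.add_argument("--Evaluation", type=str2bool, default=False)
args = parser.parse_args()

print("Training:", args.Training, "Testing:", args.Testing, "Evaluation:", args.Evaluation)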

Testing on Our Pretrained RGB VST Model

  1. Download our pretrained RGB_VST.pth [baidu pan fetch code: pe54 | Google drive] and put it into the checkpoint/ folder.
  2. Run python train_test_eval.py --Testing True --Evaluation True for testing and evaluation. The predictions will be saved in the preds/ folder and the evaluation results will be written to the result.txt file. (A quick checkpoint sanity check is sketched after this list.)
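Before running the test, it can help to verify that the checkpoint downloaded correctly. The sketch below (not part of the repo) just loads the file on CPU and prints a few entries; depending on how the weights were saved, the loaded object may be a raw state_dict or a dict wrapping one:

import torch

# Load the checkpoint on CPU and list a handful of entries as a sanity check.
ckpt = torch.load("checkpoint/RGB_VST.pth", map_location="cpu")
entries = ckpt if isinstance(ckpt, dict) else {}
print(type(ckpt).__name__, "with", len(entries), "top-level entries")
for name, value in list(entries.items())[:5]:
    shape = tuple(value.shape) if hasattr(value, "shape") else type(value).__name__
    print(" ", name, shape)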

Our saliency maps can be downloaded from [baidu pan fetch code: 92t0 | Google drive].

VST for SOD in MSRA

Data Preparation for SOD in MSRA

Download our dataset from [baidu pan fetch code: 6666] and put it into the data/ folder.

Your data folder should look like this:

-- data
   |-- input
   |   |-- 1.gif
   |   |-- 2.gif
   |...
   |-- comment.png
   |-- name_card.png
   |-- video_title.png

Testing

  1. Download our pretrained RGB_VST.pth [baidu pan fetch code: pe54 | Google drive] and put it into the checkpoint/ folder.
  2. Run python Main.py for testing. (A per-frame sketch of this GIF workflow follows below.)
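Since the inputs here are GIFs while VST predicts saliency on single images, Main.py presumably processes each clip frame by frame. The sketch below is an assumed outline of that workflow using Pillow, not the repo's actual Main.py; predict_saliency is a hypothetical placeholder for the real VST inference call:

from PIL import Image, ImageSequence

def predict_saliency(frame):
    # Hypothetical placeholder: a real implementation would run the RGB_VST model
    # on this RGB frame and return a grayscale saliency map of the same size.
    return frame.convert("L")

# Split a GIF into frames, predict per frame, and reassemble the result.
gif = Image.open("data/input/1.gif")
frames = [predict_saliency(f.convert("RGB")) for f in ImageSequence.Iterator(gif)]
frames[0].save("res.gif", save_all=True, append_images=frames[1:], loop=0)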

For more details, please see [baidu pan fetch code: 6666].

Results

Input: data/input/eazy_2.gif

SOD map: data/res/eazy_2/eazy_2.gif

Output: res.gif

Summary and Discussion

Acknowledgement

We thank the authors of VST for providing the code of VST.

Citation

If you find our work helpful, please cite:

@InProceedings{Liu_2021_ICCV,
    author    = {Liu, Nian and Zhang, Ni and Wan, Kaiyuan and Shao, Ling and Han, Junwei},
    title     = {Visual Saliency Transformer},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {4722-4732}
}