This is a re-implementation of the following paper:
Kao Zhang, Zhenzhong Chen. Video Saliency Prediction Based on Spatial-Temporal Two-Stream Network.
IEEE Trans. Circuits Syst. Video Techn. 2018. [Online] Available: https://ieeexplore.ieee.org/document/8543830
The code was developed using Python 3.6, Keras 2.2.4, and CUDA 9.0. Other software versions may cause problems. If they do, look at the implementation in the "zk_models.py" file and adapt the syntax to match your Keras environment.
- Windows10/Ubuntu16.04
- Anaconda 5.2.0
- Python 3.6
- CUDA 9.0 and cudnn7.1.2
Download the pre-trained models and put them into the "Models" folder.
- [TwoS-model] [Baidu Drive] [Google Drive]
- [SF-Net-model] [Baidu Drive] [Google Drive]
Currently, the code supports Python 3.6.
- conda
- Keras ( >= 2.2.4)
- tensorflow ( >= 1.12.0)
- python-opencv
- hdf5storage
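Because the code is sensitive to software versions, a quick check of the installed packages can save time before running anything. The following is a minimal sketch, not part of the repository: the helper names `version_tuple`, `meets_minimum`, and `report` are hypothetical, and the minimum versions are copied from the list above.

```python
import re
from importlib import import_module

# Minimum versions taken from the requirements list above.
MINIMUMS = {"keras": "2.2.4", "tensorflow": "1.12.0"}


def version_tuple(text):
    """Turn a version string such as '2.2.4' into a comparable tuple of ints."""
    return tuple(int(part) for part in re.findall(r"\d+", text)[:3])


def meets_minimum(installed, minimum):
    """Compare versions numerically, so that '1.9.0' is older than '1.12.0'."""
    return version_tuple(installed) >= version_tuple(minimum)


def report(minimums=MINIMUMS):
    """Return a {package: status} dict; nothing is installed or modified."""
    status = {}
    for name, minimum in minimums.items():
        try:
            module = import_module(name)
        except ImportError:
            status[name] = "not installed"
            continue
        installed = getattr(module, "__version__", "")
        if not version_tuple(installed):
            status[name] = "version unknown"
        elif meets_minimum(installed, minimum):
            status[name] = "ok (" + installed + ")"
        else:
            status[name] = "too old (" + installed + " < " + minimum + ")"
    return status
```

Calling `print(report())` inside the conda environment lists each package with its status. Versions much newer than those above may pass this check and still need the syntax changes mentioned earlier.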
- Please change the working directory "wkdir" to your path in the "zk_config.py" file, for example:
  dataDir = 'E:/Code/IIP_TwoS_Saliency/DataSet'
- More parameters are in the "zk_config.py" file.
- Run the demo "Test_TwoS_Net.py" or "Train_TwoS_Net.py" to test or train the model.
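A wrong data path is a common cause of failures at start-up, so it may help to validate the configured directory before launching a demo. This is a small sketch under the assumption that the path is a plain string as in the example above; `resolve_data_dir` is a hypothetical helper and not a function from "zk_config.py".

```python
from pathlib import Path


def resolve_data_dir(data_dir):
    """Return the data directory as a Path, or raise if it does not exist."""
    path = Path(data_dir).expanduser()
    if not path.is_dir():
        raise FileNotFoundError(
            "Data directory not found: " + str(path)
            + " (update the path in zk_config.py)"
        )
    return path
```

For example, `resolve_data_dir('E:/Code/IIP_TwoS_Saliency/DataSet')` returns the path if the folder exists and raises a clear error otherwise.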
The full training process:
Our model is trained on SALICON and part of the DIEM dataset. First, we train the SF-Net in the spatial stream, starting from the pre-trained VGG-16 model and using the training set of the SALICON dataset. Then, we train the whole network on the training set of the DIEM dataset while keeping the parameters of the trained SF-Net fixed.
- Please download the SALICON and DIEM datasets.
- Run the demo "Train_Test_ST_Net.py" to get the pre-trained SF-Net model.
- Run the demo "Train_TwoS_Net.py" to train the whole model.
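The second stage keeps the trained SF-Net parameters fixed. One common way to do this in Keras is to set `trainable = False` on the relevant layers before compiling the model. The sketch below illustrates that idea only and is not taken from "zk_models.py": the helper `freeze_by_prefix` and the layer-name prefix `sfnet_` are assumptions, and the stand-in objects exist only so the example runs without Keras.

```python
from types import SimpleNamespace


def freeze_by_prefix(model, prefix):
    """Mark every layer whose name starts with `prefix` as non-trainable.

    Works on any object with a Keras-style `layers` list whose items have
    `name` and `trainable` attributes. Returns the names of frozen layers.
    """
    frozen = []
    for layer in model.layers:
        if layer.name.startswith(prefix):
            layer.trainable = False
            frozen.append(layer.name)
    return frozen


# Stand-in for a real model; the layer names here are hypothetical.
demo_model = SimpleNamespace(layers=[
    SimpleNamespace(name="sfnet_conv1", trainable=True),
    SimpleNamespace(name="sfnet_conv2", trainable=True),
    SimpleNamespace(name="temporal_conv1", trainable=True),
])

frozen_names = freeze_by_prefix(demo_model, "sfnet_")
```

With a real Keras model, call `model.compile(...)` after changing `trainable`, since the change only takes effect at compile time. Check the actual layer names with `model.summary()` before choosing a prefix.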
It is easy to change the output format in our code.
- The results of the video task are saved in ".mat" (uint8) format.
- You can get the color visualization results based on the "Visualization Tools".
- You can evaluate the performance based on the "EvalScores Tools".
If you use the TwoS video saliency model, please cite the following paper:
@article{Zhang2018Video,
author = {Kao Zhang and Zhenzhong Chen},
title = {Video Saliency Prediction Based on Spatial-Temporal Two-Stream Network},
journal = {IEEE Transactions on Circuits and Systems for Video Technology},
year = {2018}
}
Kao ZHANG
Laboratory of Intelligent Information Processing (LabIIP)
Wuhan University, Wuhan, China.
Email: zhangkao@whu.edu.cn
Zhenzhong CHEN (Professor and Director)
Laboratory of Intelligent Information Processing (LabIIP)
Wuhan University, Wuhan, China.
Email: zzchen@whu.edu.cn
Web: http://iip.whu.edu.cn/~zzchen/