Rethinking RGB-D Salient Object Detection: Models, Data Sets, and Large-Scale Benchmarks, IEEE TNNLS 2021. Please refer to our project page (http://dpfan.net/d3netbenchmark/) for more details.
- [2020/08/02] 💥 Released the training code.
Figure 1: Illustration of the proposed D3Net. In the training stage (left), the input RGB and depth images are processed by three parallel sub-networks: RgbNet, RgbdNet, and DepthNet. The three sub-networks share the same modified Feature Pyramid Network (FPN) structure (see § IV-A for details). We introduce these sub-networks to obtain three saliency maps (i.e., Srgb, Srgbd, and Sdepth) that capture both coarse and fine details of the input. In the test phase (right), a novel depth depurator unit (DDU) (§ IV-B) is utilized for the first time in this work to explicitly discard (i.e., select Srgb) or keep (i.e., select Srgbd) the saliency information introduced by the depth map. In both phases, these components form a nested structure and are elaborately designed (e.g., the gate connection in the DDU) to learn salient objects from the RGB and depth images jointly.
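The test-phase gating can be pictured with a short PyTorch sketch. Note that the reliability score and the threshold `tau` below are hypothetical placeholders for illustration only, not the exact gate criterion from the paper (see § IV-B for that):

```python
import torch

def depth_depurator(s_rgb, s_rgbd, s_depth, tau=0.5):
    """Gate between the fused (RGB-D) and RGB-only saliency maps at test time.
    All maps are assumed to be tensors with values in [0, 1]."""
    # Binarize the depth-only and fused predictions.
    b_depth = (s_depth > 0.5).float()
    b_rgbd = (s_rgbd > 0.5).float()
    # Hypothetical reliability score: IoU between the two binarized maps.
    inter = (b_depth * b_rgbd).sum()
    union = ((b_depth + b_rgbd) > 0).float().sum().clamp(min=1.0)
    reliability = inter / union
    # Keep the depth-informed map S_rgbd when the depth stream looks
    # trustworthy; otherwise fall back to the RGB-only map S_rgb.
    return s_rgbd if reliability >= tau else s_rgb
```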
- Training dataset: https://drive.google.com/open?id=1osdm_PRnupIkM82hFbz9u0EKJC_arlQI
- Testing dataset: https://drive.google.com/open?id=1ABYxq0mL4lPq2F0paNJ7-5T9ST6XVHl1
- PyTorch>=0.4.1
- OpenCV
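A quick sanity check that the two requirements above are importable (assuming the standard `torch` and `cv2` packages):

```python
import torch
import cv2

print('PyTorch:', torch.__version__)  # should be >= 0.4.1
print('OpenCV :', cv2.__version__)
```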
Put the three datasets 'NJU2K_TRAIN', 'NLPR_TRAIN', and 'NJU2K_TEST' into the created folder "dataset".
Put the VGG-16 pretrained model 'vgg16_feat.pth' (GoogleDrive | BaiduYun, code: zsxh) into the created folder "model".
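A small helper to verify the layout before training (assuming the repo root as the working directory; the paths come from the two setup steps above):

```python
import os

# Expected files/folders after the setup steps above.
for path in ['dataset/NJU2K_TRAIN', 'dataset/NLPR_TRAIN',
             'dataset/NJU2K_TEST', 'model/vgg16_feat.pth']:
    print(path, '->', 'ok' if os.path.exists(path) else 'MISSING')
```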
python train.py --net RgbNet
python train.py --net RgbdNet
python train.py --net DepthNet
Put the three pretrained models into the created folder "eval/pretrained_model".
python eval.py
- RgbdNet, RgbNet, and DepthNet pretrained models can be downloaded from (GoogleDrive | BaiduYun, code: xf1h)
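Evaluation on these benchmarks typically reports metrics such as MAE (mean absolute error) between each predicted saliency map and its ground truth. Below is a generic sketch of MAE, not the logic of the repo's `eval.py`; the image paths are hypothetical:

```python
import cv2
import numpy as np

def mae(pred_path, gt_path):
    """Mean absolute error between a predicted saliency map and its
    ground-truth mask, both loaded as grayscale and scaled to [0, 1]."""
    pred = cv2.imread(pred_path, cv2.IMREAD_GRAYSCALE).astype(np.float32) / 255.0
    gt = cv2.imread(gt_path, cv2.IMREAD_GRAYSCALE).astype(np.float32) / 255.0
    if pred.shape != gt.shape:
        # Resize the prediction to the ground-truth resolution if needed.
        pred = cv2.resize(pred, (gt.shape[1], gt.shape[0]))
    return float(np.abs(pred - gt).mean())

# Hypothetical paths for illustration:
# print(mae('eval/output/NJU2K_TEST/0001.png', 'dataset/NJU2K_TEST/GT/0001.png'))
```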
Results of our model on seven benchmark datasets can be found at:
- Baidu Pan (https://pan.baidu.com/s/13z0ZEptUfEU6hZ6yEEISuw), extraction code: r295
- Google Drive (https://drive.google.com/drive/folders/1T46FyPzi3XjsB18i3HnLEqkYQWXVbCnK?usp=sharing)
- RGB-D SOD survey repository: https://github.com/taozh2017/RGBD-SODsurvey
- Papers with Code task page: https://paperswithcode.com/task/rgb-d-salient-object-detection
No. | Dataset | Year | Pub. | Size | #Obj. | Types | Resolution | Download |
---|---|---|---|---|---|---|---|---|
1 | STERE | 2012 | CVPR | 1000 | ~One | Internet | [251-1200] * [222-900] | link |
2 | GIT | 2013 | BMVC | 80 | Multiple | Home environment | 640 * 480 | link |
3 | DES | 2014 | ICIMCS | 135 | One | Indoor | 640 * 480 | link |
4 | NLPR | 2014 | ECCV | 1000 | Multiple | Indoor/outdoor | 640 * 480, 480 * 640 | link |
5 | LFSD | 2014 | CVPR | 100 | One | Indoor/outdoor | 360 * 360 | link |
6 | NJUD | 2014 | ICIP | 1985 | ~One | Movie/internet/photo | [231-1213] * [274-828] | link |
7 | SSD | 2017 | ICCVW | 80 | Multiple | Movies | 960 * 1080 | link |
8 | DUT-RGBD | 2019 | ICCV | 1200 | Multiple | Indoor/outdoor | 400 * 600 | link |
9 | SIP | 2020 | TNNLS | 929 | Multiple | Person in wild | 992 * 774 | link |
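Each sample in these datasets pairs an RGB image with a single-channel depth map and a binary ground-truth mask. A minimal loading sketch with OpenCV follows; the subfolder and file names are hypothetical and vary across datasets:

```python
import cv2

# Hypothetical paths; the actual folder/file naming differs per dataset above.
rgb   = cv2.imread('dataset/NJU2K_TEST/RGB/0001.jpg')                          # H x W x 3, BGR
depth = cv2.imread('dataset/NJU2K_TEST/depth/0001.png', cv2.IMREAD_GRAYSCALE)  # H x W depth map
gt    = cv2.imread('dataset/NJU2K_TEST/GT/0001.png', cv2.IMREAD_GRAYSCALE)     # H x W binary mask
```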
If you find this work or code helpful in your research, please cite:
@article{fan2019rethinking,
  title={Rethinking RGB-D Salient Object Detection: Models, Data Sets, and Large-Scale Benchmarks},
  author={Fan, Deng-Ping and Lin, Zheng and Zhang, Zhao and Zhu, Menglong and Cheng, Ming-Ming},
  journal={IEEE Transactions on Neural Networks and Learning Systems},
  year={2021}
}

@article{zhou2021rgbd,
  title={RGB-D Salient Object Detection: A Survey},
  author={Zhou, Tao and Fan, Deng-Ping and Cheng, Ming-Ming and Shen, Jianbing and Shao, Ling},
  journal={Computational Visual Media},
  year={2021}
}