The goal of RGB-D salient object detection (SOD) is to aggregate information from the RGB and depth modalities to accurately detect and segment salient objects. Existing RGB-D SOD models can extract multilevel features from a single modality well and can also integrate cross-modal features, but they can rarely handle both at the same time. To exploit the correlations of intra- and inter-modality information, in this paper we proposed an attention-guided cross-modal multiple feature aggregation network for RGB-D SOD. Our motivation was that both cross-modal feature fusion and multilevel feature fusion are crucial for the RGB-D SOD task. The main innovation of this work lies in two points: one is the cross-modal pyramid feature interaction (CPFI) module, which integrates multilevel features from both RGB and depth modalities in a bottom-up manner, and the other is the cross-modal feature decoder (CMFD), which aggregates the fused features to generate the final saliency map. Extensive experiments on six benchmark datasets showed that the proposed attention-guided cross-modal multiple feature aggregation network (ACFPA-Net) achieved competitive performance against 15 state-of-the-art (SOTA) RGB-D SOD methods, both qualitatively and quantitatively.
Citation: Bojian Chen, Wenbin Wu, Zhezhou Li, Tengfei Han, Zhuolei Chen, Weihao Zhang. Attention-guided cross-modal multiple feature aggregation network for RGB-D salient object detection[J]. Electronic Research Archive, 2024, 32(1): 643-669. doi: 10.3934/era.2024031