Attention-guided cross-modal multiple feature aggregation network for RGB-D salient object detection

Bojian Chen; Wenbin Wu; Zhezhou Li; Tengfei Han; Zhuolei Chen; Weihao Zhang

doi:10.3934/era.2024031

Electronic Research Archive

2024, Volume 32, Issue 1: 643-669. doi: 10.3934/era.2024031

Research article

State Grid Fujian Electric Power Research Institute, No.64 Shoushan Road, Cangshan District, Fuzhou, China

Academic Editor: William Guo

Received: 23 September 2023 Revised: 27 November 2023 Accepted: 10 December 2023 Published: 09 January 2024

The goal of RGB-D salient object detection (SOD) is to aggregate information from the RGB and depth modalities to accurately detect and segment salient objects. Existing RGB-D SOD models can extract multilevel features from a single modality well and can also integrate cross-modal features, but they can rarely handle both at the same time. To exploit the correlations of intra- and inter-modality information, in this paper we proposed an attention-guided cross-modal multi-feature aggregation network for RGB-D SOD. Our motivation was that both cross-modal feature fusion and multilevel feature fusion are crucial for the RGB-D SOD task. The main innovation of this work lies in two points: one is the cross-modal pyramid feature interaction (CPFI) module, which integrates multilevel features from both RGB and depth modalities in a bottom-up manner, and the other is the cross-modal feature decoder (CMFD), which aggregates the fused features to generate the final saliency map. Extensive experiments on six benchmark datasets showed that the proposed attention-guided cross-modal multiple feature aggregation network (ACFPA-Net) achieved competitive performance against 15 state-of-the-art (SOTA) RGB-D SOD methods, both qualitatively and quantitatively.

Keywords: salient object detection (SOD), RGB-D, feature aggregation, attention, cross-modal

Citation: Bojian Chen, Wenbin Wu, Zhezhou Li, Tengfei Han, Zhuolei Chen, Weihao Zhang. Attention-guided cross-modal multiple feature aggregation network for RGB-D salient object detection[J]. Electronic Research Archive, 2024, 32(1): 643-669. doi: 10.3934/era.2024031

© 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)