Research article · Special Issues

Attention-guided cross-modal multiple feature aggregation network for RGB-D salient object detection


  • Received: 23 September 2023 Revised: 27 November 2023 Accepted: 10 December 2023 Published: 09 January 2024
  • The goal of RGB-D salient object detection (SOD) is to aggregate information from the RGB and depth modalities to accurately detect and segment salient objects. Existing RGB-D SOD models can extract the multilevel features of a single modality well and can also integrate cross-modal features, but they can rarely handle both at the same time. To tap into and make the most of the correlations of intra- and inter-modality information, in this paper, we proposed an attention-guided cross-modal multi-feature aggregation network for RGB-D SOD. Our motivation was that both cross-modal feature fusion and multilevel feature fusion are crucial for the RGB-D SOD task. The main innovation of this work lies in two points: one is the cross-modal pyramid feature interaction (CPFI) module, which integrates multilevel features from both the RGB and depth modalities in a bottom-up manner; the other is the cross-modal feature decoder (CMFD), which aggregates the fused features to generate the final saliency map. Extensive experiments on six benchmark datasets showed that the proposed attention-guided cross-modal multiple feature aggregation network (ACFPA-Net) achieved competitive performance against 15 state-of-the-art (SOTA) RGB-D SOD methods, both qualitatively and quantitatively.

    Citation: Bojian Chen, Wenbin Wu, Zhezhou Li, Tengfei Han, Zhuolei Chen, Weihao Zhang. Attention-guided cross-modal multiple feature aggregation network for RGB-D salient object detection[J]. Electronic Research Archive, 2024, 32(1): 643-669. doi: 10.3934/era.2024031
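The abstract only names the CPFI module's behavior (attention-gated, bottom-up fusion of multilevel RGB and depth features). As a rough illustration of that kind of scheme, here is a minimal NumPy sketch; the sigmoid channel gating, 2x average pooling, and finest-to-coarsest level ordering are assumptions for illustration, not the paper's actual implementation:

```python
import numpy as np

def channel_attention(feat):
    # Squeeze: global average pool per channel, then a sigmoid gate
    # reweights each channel (a common channel-attention pattern).
    w = feat.mean(axis=(1, 2))            # (C,)
    w = 1.0 / (1.0 + np.exp(-w))          # sigmoid
    return feat * w[:, None, None]

def downsample2x(feat):
    # 2x2 average pooling to match the next (coarser) pyramid level.
    C, H, W = feat.shape
    return feat.reshape(C, H // 2, 2, W // 2, 2).mean(axis=(2, 4))

def fuse_pyramid(rgb_feats, depth_feats):
    """Bottom-up cross-modal fusion: each level combines attention-gated
    RGB and depth features with the fused result passed up from the
    finer level below it. Inputs are lists of (C, H, W) arrays ordered
    finest to coarsest, each level halving the spatial resolution."""
    fused = None
    out = []
    for r, d in zip(rgb_feats, depth_feats):
        level = channel_attention(r) + channel_attention(d)
        if fused is not None:
            level = level + downsample2x(fused)  # inject finer-level context
        fused = level
        out.append(fused)
    return out
```

A decoder in the spirit of the CMFD would then consume `out` in the reverse (coarsest-to-finest) direction to predict the final saliency map; that stage is omitted here.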




  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
