The fusion of infrared (IR) and visible (VIS) images aims to synthesize fused images with salient targets and enriched details. However, existing fusion methods struggle to integrate modality-specific features. Accordingly, we propose an image fusion method based on a dual-channel fusion strategy, termed DCGAN-Fuse. First, we design a dual-channel fusion module (DCFM) that integrates the shared and complementary information of the two modalities. Second, in the feature enhancement phase, we design an attention-enhanced gradient retention module (AEGRM) to strengthen edge feature extraction and enforce spatial consistency. Moreover, we employ a multi-scale module (MSM) to capture fused features at multiple scales and reduce information loss from the two source images. We also refine the loss function by introducing a maximum intensity loss term. Experiments on public datasets demonstrate that our method generates fused images with highlighted infrared targets and enriched textures, and both subjective and objective assessments indicate that DCGAN-Fuse outperforms thirteen state-of-the-art algorithms.
Citation: Qianying Wang, Haiyan Xie, Huimin Qu. Dual-channel fusion and dual-discriminator GAN for infrared and visible image fusion[J]. Electronic Research Archive, 2025, 33(9): 5471-5495. doi: 10.3934/era.2025245
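The abstract does not spell out the exact form of the maximum intensity loss. As a rough illustration only, the sketch below assumes the formulation commonly used in infrared-visible fusion work, where the fused image is pulled toward the per-pixel maximum of the two source images so that bright infrared targets are retained; the function name `max_intensity_loss` and the tensor shapes are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F


def max_intensity_loss(fused: torch.Tensor, ir: torch.Tensor, vis: torch.Tensor) -> torch.Tensor:
    """Assumed max-intensity loss: L1 distance between the fused image and the
    element-wise maximum of the IR and VIS inputs. Tensors are (B, 1, H, W)
    grayscale images scaled to [0, 1]."""
    target = torch.max(ir, vis)       # per-pixel maximum of the two sources
    return F.l1_loss(fused, target)   # mean absolute error against that target


# Usage example with random stand-ins for real image batches.
if __name__ == "__main__":
    ir = torch.rand(4, 1, 256, 256)
    vis = torch.rand(4, 1, 256, 256)
    fused = torch.rand(4, 1, 256, 256, requires_grad=True)
    loss = max_intensity_loss(fused, ir, vis)
    loss.backward()
    print(float(loss))
```

In this assumed form, the term complements gradient- or texture-oriented losses: the max operator favors whichever modality is brighter at each pixel, which is typically the infrared channel over salient targets.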