An infrared image super-resolution network fusing convolution and attention mechanisms

Sihang Luo; Yong Gan; Xuan Wang

doi:10.3934/era.2026194

Electronic Research Archive

2026, Volume 34, Issue 7: 4387-4409.

Research article

School of Computer Science and Artificial Intelligence, Zhengzhou University of Light Industry, Zhengzhou 450001, China

Received: 07 April 2026 Revised: 24 April 2026 Accepted: 27 April 2026 Published: 20 May 2026

Infrared imaging plays an indispensable role in critical applications such as military surveillance, autonomous driving, and medical diagnostics. However, its inherently low resolution and low contrast often limit operational performance. Deep learning-based super-resolution (SR) techniques offer a software-driven solution, but existing models suffer from severe feature redundancy caused by simply stacking deep layers, and they lack the discriminative power to distinguish critical textures from thermal noise. To address these issues, we propose a Convolutional and Attention-based Super-Resolution Network (CASRNet). The novelty of the model lies in fusing a channel splitting (CS) strategy with a dual attention mechanism. First, the CS strategy decomposes feature maps into parallel streams, extracting diverse and less redundant representations. Second, a channel and spatial attention residual block (CSA_ResBlock) adaptively focuses on informative feature channels and critical spatial boundaries. Quantitatively, CASRNet achieves superior performance on public benchmarks. On the FLIR dataset (×2), it reaches a peak signal-to-noise ratio (PSNR) of 39.73 dB and a structural similarity (SSIM) of 0.9639, outperforming the state-of-the-art infrared-specific model TherISuRNet by 0.48 dB and standard models such as VDSR by 0.15 dB. Similar improvements on other data (e.g., a PSNR of 40.45 dB on the ThermalTau2 dataset) indicate the general applicability and high fidelity of CASRNet for real-world infrared enhancement tasks.
Keywords: infrared image, super-resolution, convolutional neural networks, attention mechanism
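
The two ideas named in the abstract, channel splitting and channel plus spatial attention inside a residual block, can be illustrated with a minimal NumPy sketch. This is not the authors' CSA_ResBlock: the abstract gives no layer details, so the squeeze-and-excitation-style channel gate, the simplified spatial gate, the omission of all convolutions, and every name used here (`channel_attention`, `spatial_attention`, `csa_block_sketch`, the weight shapes, the reduction ratio of 2) are assumptions chosen for brevity.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(x, w_down, w_up):
    """Rescale each channel of x (C, H, W) by a learned gate in (0, 1)."""
    squeezed = x.mean(axis=(1, 2))                # global average pool -> (C,)
    hidden = np.maximum(w_down @ squeezed, 0.0)   # ReLU bottleneck -> (C // r,)
    gate = sigmoid(w_up @ hidden)                 # per-channel gate -> (C,)
    return x * gate[:, None, None]


def spatial_attention(x, w_avg, w_max):
    """Rescale each pixel of x (C, H, W) by a gate in (0, 1).

    Assumption: two scalar weights stand in for the convolution that a
    full spatial-attention module would apply to the pooled maps.
    """
    avg_map = x.mean(axis=0)                      # channel-wise mean -> (H, W)
    max_map = x.max(axis=0)                       # channel-wise max  -> (H, W)
    gate = sigmoid(w_avg * avg_map + w_max * max_map)
    return x * gate[None, :, :]


def csa_block_sketch(x, stream_weights, w_avg=1.0, w_max=1.0):
    """Split channels into parallel streams, gate each stream, add a residual."""
    streams = np.split(x, len(stream_weights), axis=0)   # channel splitting
    attended = []
    for stream, (w_down, w_up) in zip(streams, stream_weights):
        y = channel_attention(stream, w_down, w_up)
        y = spatial_attention(y, w_avg, w_max)
        attended.append(y)
    return x + np.concatenate(attended, axis=0)          # residual connection


rng = np.random.default_rng(0)
features = rng.random((8, 16, 16))                # 8 channels, 16x16 feature map
# Two streams of 4 channels each; random weights stand in for trained ones.
weights = [(rng.standard_normal((2, 4)), rng.standard_normal((4, 2)))
           for _ in range(2)]
out = csa_block_sketch(features, weights)
print(out.shape)  # (8, 16, 16)
```

Each stream gets its own channel-attention weights, which is one plausible way for parallel streams to learn different representations; whether CASRNet shares or separates these weights is not stated in the abstract.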
Citation: Sihang Luo, Yong Gan, Xuan Wang. An infrared image super-resolution network fusing convolution and attention mechanisms[J]. Electronic Research Archive, 2026, 34(7): 4387-4409. doi: 10.3934/era.2026194
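
For readers less familiar with the PSNR figures quoted in the abstract, the sketch below computes PSNR from the mean squared error (MSE) and converts a gain in dB into the implied MSE ratio. It uses the standard definition PSNR = 10 log10(peak^2 / MSE); the 8-bit peak value of 255 and the helper names `psnr` and `mse_ratio_for_gain` are assumptions for illustration, since the abstract does not state the evaluation protocol.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """PSNR in dB between two equally sized images given as nested lists."""
    if len(reference) != len(estimate) or any(
        len(r) != len(e) for r, e in zip(reference, estimate)
    ):
        raise ValueError("images must have the same shape")
    n_pixels = sum(len(row) for row in reference)
    mse = sum(
        (a - b) ** 2
        for r, e in zip(reference, estimate)
        for a, b in zip(r, e)
    ) / n_pixels
    if mse == 0:
        return math.inf                       # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """MSE of the better model divided by MSE of the baseline for a PSNR gain."""
    return 10.0 ** (-gain_db / 10.0)


ground_truth = [[10, 20], [30, 40]]
off_by_one = [[11, 21], [29, 39]]             # every pixel differs by 1, so MSE = 1
print(round(psnr(ground_truth, off_by_one), 2))   # 48.13
print(round(mse_ratio_for_gain(0.48), 3))         # 0.895
```

Under this definition, the reported 0.48 dB margin over TherISuRNet corresponds to roughly 10% lower MSE at the same peak value.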

© 2026 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)