Leveraging deep learning and image conversion of executable files for effective malware detection: A static malware analysis approach

Mesut GUVEN

doi:10.3934/math.2024739

AIMS Mathematics

2024, Volume 9, Issue 6: 15223-15245. doi: 10.3934/math.2024739

Research article

Mesut GUVEN

TOBB University of Economics and Technology, TR-06560 Ankara, Turkey

Received: 15 March 2024; Revised: 07 April 2024; Accepted: 17 April 2024; Published: 26 April 2024
MSC: 68T45, 68U20

Increasingly sophisticated malware poses a formidable security challenge because it evades traditional protective measures. Static analysis, an initial step in malware investigation, examines code without executing it. One static analysis approach converts executable files into image representations so that deep learning models can be applied to them. Convolutional neural networks (CNNs), which are particularly adept at image classification, have potential for malware detection. However, because they expect grid-structured input, a preprocessing phase is required to convert software into image-like formats. This paper outlines a methodology for malware detection that applies deep learning models to image-converted executable files. Experimental evaluations were performed using CNN models, autoencoder-based models, and pre-trained counterparts, all of which performed well. Consequently, deep learning on image-converted executables emerges as a fitting strategy for the static analysis of software. This research is significant because it used the largest dataset to date and covered a wide range of deep learning models, many of which had not previously been tested together.
Keywords: artificial intelligence, deep learning, convolutional neural networks, autoencoders, transfer learning, malware detection, executable files
Citation: Mesut GUVEN. Leveraging deep learning and image conversion of executable files for effective malware detection: A static malware analysis approach[J]. AIMS Mathematics, 2024, 9(6): 15223-15245. doi: 10.3934/math.2024739
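
The preprocessing step mentioned in the abstract, converting an executable into an image-like array, can be illustrated with a minimal sketch. It uses a common byte-to-pixel mapping in which each byte becomes one grayscale intensity and the byte stream is wrapped into rows of fixed width. The function names, the width of 256, and the zero padding are assumptions made for illustration; the abstract does not state the paper's own width, resizing, or channel choices.

```python
import math


def bytes_to_grayscale(data: bytes, width: int = 256) -> list[list[int]]:
    """Wrap raw bytes into rows of 0-255 pixel values.

    The last row is zero-padded so every row has the same length.
    The default width is an illustrative assumption, not the paper's setting.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    height = max(1, math.ceil(len(data) / width))
    padded = data + bytes(height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


def to_pgm(image: list[list[int]]) -> bytes:
    """Serialize the pixel grid as a binary PGM (P5) grayscale image."""
    height, width = len(image), len(image[0])
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + bytes(value for row in image for value in row)


# Stand-in for the contents of an executable file (hypothetical input).
sample = bytes(range(10))
image = bytes_to_grayscale(sample, width=4)
pgm = to_pgm(image)
```

In practice the bytes would come from reading the file in binary mode, and the resulting grid would typically be resized to the fixed input shape that the chosen network expects before training or inference.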

© 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)