Research article

Optimal feature selection using novel flamingo search algorithm for classification of COVID-19 patients from clinical text


  • Received: 29 October 2022 Revised: 12 December 2022 Accepted: 22 December 2022 Published: 11 January 2023
  • Although several AI-based models have been developed for COVID-19 diagnosis, a gap in machine-based diagnosis remains, making further efforts to combat this epidemic imperative. Given the persistent need for a reliable way to select features, we developed a new feature selection (FS) method and a model that predicts COVID-19 from clinical text. This study employs a newly developed methodology inspired by the behavior of flamingos to find a near-optimal feature subset for accurate diagnosis of COVID-19 patients. The best features are selected in two stages. In the first stage, we apply a term weighting technique, RTF-C-IEF, to quantify the significance of the extracted features. The second stage uses a newly developed feature selection approach, the improved binary flamingo search algorithm (IBFSA), which chooses the most important and relevant features for COVID-19 patients. A multi-strategy improvement process for the search algorithm is at the heart of this study. Its primary objective is to broaden the algorithm's capabilities by increasing diversity and supporting exploration of the search space. Additionally, a binary mechanism was used to adapt the traditional flamingo search algorithm (FSA) to binary FS problems. Two datasets, containing 3053 and 1446 cases, respectively, were used to evaluate the proposed model with the Support Vector Machine (SVM) and other classifiers. The results showed that IBFSA performed best when compared with numerous previous swarm algorithms. The number of selected features was also reduced by 88%, and the best global optimal features were obtained.
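
The abstract mentions a binary mechanism that makes a continuous swarm algorithm usable for feature selection, but it does not spell out the details. The sketch below illustrates one common way such a mechanism is built in wrapper-based FS, and it is not taken from the paper: the S-shaped transfer function, the weighted fitness with `alpha = 0.99`, and all function names are assumptions made for illustration. The classifier error rate (for example, from an SVM) is assumed to be computed elsewhere and passed in as a number.

```python
import math
import random
from typing import List, Sequence


def sigmoid_transfer(x: float) -> float:
    """S-shaped transfer function mapping a real position to [0, 1].

    Written in a numerically stable form so large |x| does not overflow.
    (Assumed choice; the paper's actual transfer function may differ.)
    """
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def binarize(position: Sequence[float], rng: random.Random) -> List[int]:
    """Convert a continuous position vector into a 0/1 feature mask.

    Feature j is kept when a uniform random draw falls below the
    transfer-function value of position[j].
    """
    return [1 if rng.random() < sigmoid_transfer(x) else 0 for x in position]


def fitness(mask: Sequence[int], error_rate: float, alpha: float = 0.99) -> float:
    """Hypothetical wrapper objective (lower is better).

    Balances the classifier's error rate against the fraction of
    features kept; alpha is an assumed weighting, not a reported value.
    """
    selected_ratio = sum(mask) / len(mask)
    return alpha * error_rate + (1.0 - alpha) * selected_ratio


def reduction(mask: Sequence[int]) -> float:
    """Fraction of the original features that the mask discards."""
    return 1.0 - sum(mask) / len(mask)


if __name__ == "__main__":
    rng = random.Random(0)
    position = [rng.uniform(-4.0, 4.0) for _ in range(20)]
    mask = binarize(position, rng)
    # 0.10 stands in for an error rate measured by a real classifier.
    print(mask, fitness(mask, error_rate=0.10), reduction(mask))
```

Under these assumptions, a mask that keeps 12 of 100 features corresponds to a reduction of 0.88, which is the kind of quantity the reported 88% figure refers to.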

    Citation: Amir Yasseen Mahdi, Siti Sophiayati Yuhaniz. Optimal feature selection using novel flamingo search algorithm for classification of COVID-19 patients from clinical text[J]. Mathematical Biosciences and Engineering, 2023, 20(3): 5268-5297. doi: 10.3934/mbe.2023244



  • © 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)

Figures: 11. Tables: 18.
