Research article

Feature selection based on fuzzy joint mutual information maximization

  • Received: 06 July 2020 · Accepted: 09 November 2020 · Published: 30 November 2020
  • Nowadays, real-world applications handle huge amounts of data, often with high-dimensional feature spaces. Such datasets pose a significant challenge for classification systems. Unfortunately, many of the features they contain are irrelevant or redundant, making these systems inefficient and inaccurate. For this reason, many feature selection (FS) methods based on information theory have been introduced to improve classification performance. However, current methods have limitations such as handling continuous features, estimating redundancy relations, and taking outer-class information into account. To overcome these limitations, this paper presents a new FS method called Fuzzy Joint Mutual Information Maximization (FJMIM). The effectiveness of the proposed method is verified through an experimental comparison with nine conventional and state-of-the-art feature selection methods. On 13 benchmark datasets, the results confirm that the proposed method yields promising improvements in classification performance and feature selection stability.

    Citation: Omar A. M. Salem, Feng Liu, Ahmed Sobhy Sherif, Wen Zhang, Xi Chen. Feature selection based on fuzzy joint mutual information maximization. Mathematical Biosciences and Engineering, 2021, 18(1): 305-327. doi: 10.3934/mbe.2021016
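
The selection rule the method's name alludes to can be made concrete with a small sketch. The Python snippet below is a minimal illustration, not the authors' implementation: it implements the crisp Joint Mutual Information Maximization ("maximum of the minimum") forward-selection rule on integer-coded, pre-discretized features, whereas FJMIM, as the abstract indicates, replaces the underlying discrete probability estimates with fuzzy ones so that continuous features can be handled directly. The names mutual_info, pair_code, and jmim_select are illustrative and do not come from the paper.

    import numpy as np

    def mutual_info(x, y):
        # I(X;Y) in bits for two integer-coded 1-D arrays.
        joint = np.zeros((x.max() + 1, y.max() + 1))
        for xi, yi in zip(x, y):
            joint[xi, yi] += 1.0
        joint /= joint.sum()
        px = joint.sum(axis=1, keepdims=True)  # marginal P(X)
        py = joint.sum(axis=0, keepdims=True)  # marginal P(Y)
        nz = joint > 0                         # skip zero cells: 0*log 0 := 0
        return float((joint[nz] * np.log2(joint[nz] / (px @ py)[nz])).sum())

    def pair_code(a, b):
        # Encode two discrete features as one variable, so the joint mutual
        # information I(F_j, F_s; C) reduces to a two-variable computation.
        return a * (b.max() + 1) + b

    def jmim_select(X, y, k):
        # Greedy forward selection: start with the single most relevant
        # feature, then repeatedly add the candidate f that maximizes
        # min over already-selected s of I(f, s; C).
        n = X.shape[1]
        selected = [int(np.argmax([mutual_info(X[:, j], y) for j in range(n)]))]
        while len(selected) < k:
            candidates = [j for j in range(n) if j not in selected]
            scores = [min(mutual_info(pair_code(X[:, j], X[:, s]), y)
                          for s in selected)
                      for j in candidates]
            selected.append(candidates[int(np.argmax(scores))])
        return selected

For example, jmim_select(X, y, k=10) returns the indices of ten selected columns of an integer-coded feature matrix X. Taking the minimum over the already-selected features penalizes candidates that are redundant with any of them, which is the behavior the fuzzy variant builds on.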

  • © 2021 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)