
The class imbalance problem arises when the number of samples in one class is much larger or smaller than in the other classes and the costs of misclassifying the different classes are unequal, which causes standard classifiers to fail. Class imbalanced datasets are therefore characterized by two imbalances: an imbalance in the number of samples per class and an imbalance in the cost of misclassification (Li et al., 2019). Class imbalanced learning methods are the techniques developed to address this problem, and they are widely used in fields such as bioinformatics (Blagus and Lusa, 2013), software defect monitoring (Lin and Lu, 2021), text classification (Ogura et al., 2011), and computer vision (Pouyanfar and Chen, 2015). These broad applications make the study of class imbalanced learning methods highly valuable.
Standard classifiers such as logistic regression (LR), the support vector machine (SVM) and the decision tree (DT) are designed for balanced training sets; in imbalanced scenarios they often produce suboptimal results (Ye et al., 2019). For example, a Bayesian classifier may perform poorly on imbalanced datasets when the classes overlap in the sample space (Domingos and Pazzani, 1997). Similarly, when an SVM is trained on an imbalanced dataset, the optimal hyperplane shifts towards the core region of the majority class. In particular, when the dataset is highly imbalanced (Jiang et al., 2019) or exhibits interclass aggregation (Zhai et al., 2010), entire sub-clusters of the minority class may be misclassified.
Class imbalance therefore degrades the performance of standard classifiers (Yu et al., 2019). The class imbalance ratio (IR), defined as the ratio of the majority class size to the minority class size, measures the degree of imbalance in a dataset. The literature generally indicates that the impact of class imbalance on standard classifiers grows with IR (Cmv and Jie, 2018). However, class imbalance does not always lead to poor classification results. The following factors also affect the results of standard classifiers:
● The size of the overlapping region, i.e., the extent to which samples of different classes lack a clear boundary in the sample space.
● The number of noise samples, i.e., samples of one class that lie far from the core region of that class (López et al., 2015).
● The number of training samples available to the model (Yu et al., 2016).
● The degree of interclass aggregation, i.e., whether the samples of one class form two or more clusters of different sizes in the sample space (Japkowicz et al., 2002).
● The dimensionality of the dataset, i.e., the number of features.
These factors lead to suboptimal results, and when they appear in imbalanced datasets the degradation is worse than in the balanced case. We generated a series of synthetic datasets to verify the influence of these factors on standard classifiers (a sketch of such a generator is given below); detailed results are reported in the appendix.
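The following sketch (not the authors' exact generator, and with illustrative parameter values) shows how such synthetic datasets can be produced with scikit-learn, varying the imbalance ratio, the class overlap, the label noise and the sample size, and measuring their effect on a decision tree through the minority-class F1-score.

```python
# Hypothetical generator for synthetic imbalanced datasets; make_classification's
# weights, class_sep and flip_y control the imbalance ratio, overlap and noise.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import f1_score

def run_case(ir=9.0, class_sep=1.0, flip_y=0.0, n_samples=1000, seed=0):
    w_min = 1.0 / (1.0 + ir)                       # minority proportion for the given IR
    X, y = make_classification(
        n_samples=n_samples, n_features=5, n_informative=3, n_redundant=0,
        weights=[1.0 - w_min, w_min], class_sep=class_sep, flip_y=flip_y,
        random_state=seed)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed)
    clf = DecisionTreeClassifier(random_state=seed).fit(X_tr, y_tr)
    # Minority-class F1 reflects how much the chosen factor hurts the classifier.
    return f1_score(y_te, clf.predict(X_te), pos_label=1)

for sep in (2.0, 1.0, 0.5):                        # smaller class_sep = more overlap
    print(f"class_sep={sep}: minority F1 = {run_case(class_sep=sep):.3f}")
```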
In this research, we aim to provide an overview of class imbalanced learning methods. The rest of the paper is organized as follows. Section 2 introduces both data-driven and algorithm-driven approaches to class imbalanced datasets. Section 3 reviews performance measures for class imbalanced classifiers. Section 4 discusses the challenges and future directions drawn from our analysis of the literature. Finally, Section 5 presents the conclusions of this study.
Research on class imbalance learning originated in the late 1990s, and numerous methods have been developed since then. This study discusses the key methods for handling class imbalance from two perspectives: data driven (Liu et al., 2019) and algorithm driven (Wu et al., 2019).
Data-driven methods, also known as data-level or resampling methods, counteract the quantitative imbalance between classes by randomly generating cases of the minority class (random oversampling, ROS) or removing cases of the majority class (random undersampling, RUS). Resampling can be regarded as a data preprocessing step; it is independent of classifier training and is therefore compatible with any standard classifier (Maurya and Toshniwal, 2018; Wang and Minku, 2015).
The development of data-driven methods can be summarized as follows.
First, researchers showed that random resampling, the simplest data-driven method, can improve the classification accuracy of the minority class on imbalanced datasets. However, simple random resampling has shortcomings: oversampling replicates samples and therefore increases training time, memory consumption and the risk of overfitting, while undersampling discards samples and therefore loses information and degrades classification performance. Second, as these drawbacks became apparent, better methods were developed, such as the synthetic minority oversampling technique (SMOTE) (Chawla et al., 2011) and Borderline-SMOTE (Hui et al., 2005). SMOTE, proposed by Chawla et al. (2002), is an oversampling algorithm that uses the k-nearest neighbors (KNN) of minority samples to synthesize new virtual minority cases; compared with ROS, it generalizes better and reduces overfitting to some extent. Borderline-SMOTE builds on SMOTE but synthesizes minority cases mainly near the class boundary, so it outperforms SMOTE on datasets that contain few noise samples. In recent years, with continuing advances in computing, more sophisticated methods have been proposed, such as the cleaning resampling method (Koziarski et al., 2020) and radial-based undersampling (Krawczyk et al., 2020). Analyzing data-driven methods, Yu argued that they have gone through three stages: random sampling, informed (manual) sampling and complex algorithms (Yu, 2016), as shown in Figure 1. A sketch of the basic resampling operations is given below.
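As a minimal sketch, these basic resampling operations can be carried out with the third-party imbalanced-learn package (assumed installed; the class names follow its public API), all applied to the same synthetic imbalanced dataset.

```python
# Resampling sketch using imbalanced-learn (imblearn); fit_resample returns a
# rebalanced copy of the data, leaving the original dataset untouched.
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import RandomOverSampler, SMOTE, BorderlineSMOTE
from imblearn.under_sampling import RandomUnderSampler

X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
print("original:", Counter(y))

samplers = [
    ("ROS", RandomOverSampler(random_state=0)),            # duplicate minority cases
    ("SMOTE", SMOTE(random_state=0)),                       # synthesize along k-NN lines
    ("Borderline-SMOTE", BorderlineSMOTE(random_state=0)),  # synthesize near the boundary
    ("RUS", RandomUnderSampler(random_state=0)),            # drop majority cases
]
for name, sampler in samplers:
    X_res, y_res = sampler.fit_resample(X, y)
    print(f"{name}: {Counter(y_res)}")
```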
Table 1 summarizes the data-driven methods and our analysis of them. Several observations can be drawn: class overlap is an important factor in how imbalance affects standard classifiers; cases in different positions of the sample space have different impacts on classification; and researchers have introduced concepts such as "entropy" (Li L et al., 2020) to provide additional information for resampling.
| Category | Methods and illustrations |
| --- | --- |
| Over-sampling | ROS: randomly generates cases of the minority class |
| | SMOTE: generates minority-class cases using KNN |
| | Borderline-SMOTE: generates minority-class cases with SMOTE in the overlapping region |
| | EOS: generates minority-class cases guided by "entropy" information |
| | RBO: generates minority-class cases guided by "radial" information |
| Under-sampling | RUS: randomly removes cases of the majority class |
| | SMOTE + ENN (Tao et al., 2019): removes majority-class cases using KNN |
| | SMOTE + Tomek (Wang et al., 2019): removes majority-class cases by deleting Tomek links |
| | OSS (Rodriguez et al., 2013): removes only the majority-class case of each Tomek link |
| | SBC (Xiao and Gao, 2019): removes majority-class cases using clustering |
| | EUS: randomly removes majority-class cases guided by "entropy" information |
| Hybrid sampling | EHS: entropy-based hybrid resampling |
| | CCR: hybrid resampling based on synthesizing and cleaning |
Data-driven methods are independent of the classifier, whereas algorithm-driven methods are classifier-dependent: they modify standard classifiers, mainly through cost-sensitive learning and threshold (decision) adjustment. In algorithm-driven methods the core technique is cost-sensitive learning, supported by four auxiliary learning technologies: active learning, decision-compensation learning, feature-extraction learning and ensemble learning.

Cost-sensitive learning is one of the most frequently used techniques for class imbalance (Wan and Yang, 2020); its goal is to minimize the overall misclassification cost. During model training, different penalty costs are assigned to different classes according to the practical problem. The core of cost-sensitive learning is the design of the cost matrix, which is combined with a standard classifier to improve the classification result (Zhou and Liu, 2010). For instance, fusing the posterior probability of a Bayesian classifier with the cost matrix yields a posterior that is better suited to class imbalance problems (Kuang et al., 2019), and a DT classifier can integrate the cost matrix into attribute selection and pruning to optimize the classification result (Ping et al., 2020).

The above analysis shows that this technique depends strongly on the cost matrix. The main design approaches are as follows (a minimal sketch of the simplest, empirically weighted variant follows the list):
● Empirical weighting, in which all samples of the same class share the same cost coefficient (Zong et al., 2013).
● Fuzzy weighting, in which the cost coefficients of samples of the same class differ according to their position in the sample space (Dai, 2015).
● Adaptive weighting, which iteratively and dynamically adjusts the costs, converging towards the global optimum (Sun et al., 2007).
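A minimal sketch of the empirically weighted variant, assuming that the cost matrix reduces to per-class misclassification costs passed through scikit-learn's class_weight parameter (the cost values are illustrative, not taken from the cited works):

```python
# Empirically weighted cost-sensitive decision tree: misclassifying a minority
# sample costs IR times more than misclassifying a majority sample.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report

X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

ir = (y_tr == 0).sum() / (y_tr == 1).sum()          # empirical imbalance ratio
clf = DecisionTreeClassifier(class_weight={0: 1.0, 1: ir}, random_state=0)
clf.fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te), digits=4))
```

Setting class_weight='balanced' would assign the same kind of weights automatically from the class frequencies.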
The core idea of active learning is to train the model on the cases whose class is hardest to determine. The procedure is as follows: first, experts manually label an initial training set and a classifier is learned from it; second, a query algorithm selects the samples that the current classifier cannot distinguish from the other classes, and these samples are labeled by the experts to expand the training set; third, the newly labeled samples are added and a new classifier is trained. Steps two and three are repeated until a satisfactory classifier is obtained. The merit of active learning is that it reduces the size of the training set while keeping the essential information and reducing manual labeling effort (Attenberg and Ertekin, 2013). A sketch of uncertainty-based querying is given below.
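The sketch below illustrates pool-based uncertainty sampling, under the simplifying assumption that the "expert" labels are already stored in the pool and are merely revealed on request; it is an illustration of the idea, not a specific published algorithm.

```python
# Pool-based active learning: repeatedly query the samples the current model
# is least certain about, add their labels, and retrain.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X_pool, y_pool = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)

# Small initial labeled set containing both classes (the "expert"-labeled seed).
labeled = list(np.where(y_pool == 0)[0][:15]) + list(np.where(y_pool == 1)[0][:5])

for _ in range(10):                                   # query rounds
    clf = LogisticRegression(max_iter=1000).fit(X_pool[labeled], y_pool[labeled])
    proba = clf.predict_proba(X_pool)[:, 1]
    uncertainty = np.abs(proba - 0.5)                 # samples near the boundary are hardest
    ranked = np.argsort(uncertainty)                  # most uncertain first
    new = [i for i in ranked if i not in labeled][:10]
    labeled.extend(new)                               # "expert" labels the queried samples

print("labeled pool size:", len(labeled))
```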
Decision-adjustment learning modifies the decision threshold, directly compensating the decision to correct an otherwise unsatisfactory result. It is essentially an adjustment strategy that shifts the classification results towards the core region of the minority class (Gao et al., 2020), as illustrated below.
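A minimal sketch of threshold moving: the default 0.5 cut-off on the predicted probability is lowered so that more borderline samples are assigned to the minority class (the threshold values are illustrative).

```python
# Decision-threshold adjustment on top of an already trained classifier.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score

X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)
proba = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

for t in (0.5, 0.3, 0.1):                 # moving the decision towards the minority class
    pred = (proba >= t).astype(int)
    print(f"threshold={t}: recall={recall_score(y_te, pred):.3f}, "
          f"precision={precision_score(y_te, pred, zero_division=0):.3f}")
```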
Class imbalanced learning driven by feature selection preserves the key features, which increases the discrimination between the minority and majority classes and improves the accuracy of the minority class, and potentially of every class. Feature extraction techniques mainly include convolutional neural networks (CNN) and recurrent neural networks (RNN) (Hua and Xiang, 2018). According to whether the evaluation criterion used for feature selection is tied to the classifier, three models have been developed: filter, wrapper and embedded (Bibi and Banu, 2015). Building on these ideas, a series of feature-driven algorithms have been proposed (Shen et al., 2017; Xu et al., 2020) and applied to high-dimensional data such as software defect prediction (He et al., 2019), bioinformatics (Sunny et al., 2020), natural language processing (Wang et al., 2020) and network public opinion analysis (Luo and Wu, 2020). A sketch of a filter-type selection step is given below.
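As a minimal sketch of a filter-type selection step (the scoring function and the number of retained features are illustrative choices, not prescriptions from the cited works):

```python
# Filter-style feature selection before classification on high-dimensional,
# imbalanced data: rank features by mutual information and keep the top k.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=200, n_informative=10,
                           weights=[0.9, 0.1], random_state=0)
model = make_pipeline(
    SelectKBest(score_func=mutual_info_classif, k=10),   # filter: keep 10 features
    DecisionTreeClassifier(random_state=0))
print("minority F1:", cross_val_score(model, X, y, cv=5, scoring="f1").mean().round(3))
```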
Ensemble learning can be traced back to the cascaded multi-classifier integration system described by Sebestyen. It is one of the important technologies of machine learning: by strategically building multiple base algorithms and combining them to complete the classification task, it overcomes the limitations of single algorithms. A weak classifier that is only slightly better than random guessing can be boosted into a strong classifier by ensemble learning (Witten et al., 2017; Schapire, 1990). There are two leading ensemble frameworks: the Bagging framework (Breiman, 1996), whose representative algorithm is the random forest (Verikas et al., 2011), and the Boosting framework (Ling and Wang, 2014; Li et al., 2013), whose representative algorithm is AdaBoost (Schapire, 2013).
Resampling-based ensemble learning combines resampling with ensemble learning. The simplicity of the Bagging paradigm was noticed first, and many algorithms based on it have been developed, such as AsBagging (Tao et al., 2006) and UnderOverBagging (Wang and Yao, 2009). AsBagging combines RUS with Bagging: it retains all majority-class cases across the ensemble while reducing overfitting to the minority class, and the combination of random resampling and ensembling makes the classification result more stable. Nevertheless, its results can fluctuate on datasets with many noise samples, because Bootstrap sampling is used to create the training sets of the base learners. The AsBagging_FSS algorithm (Yu and Ni, 2014) was therefore proposed, which adds a random feature-subspace generation strategy (FSS); since FSS reduces the impact of noise samples on the base learners, AsBagging_FSS outperforms AsBagging on imbalanced datasets with noise. Besides combinations with the Bagging framework, researchers have also combined resampling with the Boosting framework, producing algorithms such as SMOTEBoost (Chawla et al., 2003) and RUSBoost (Seiffert, 2010), and with hybrid frameworks (Galar, 2012) that fuse Bagging and Boosting, from which the EasyEnsemble and BalanceCascade algorithms were proposed by Liu et al. (2009). EasyEnsemble is a Bagging-based AdaBoost ensemble that uses AdaBoost as the base classifier and RUS to generate balanced training sets for the base learners; it lowers both the variance and the bias of the classification result, making it more stable and improving generalization. BalanceCascade improves on EasyEnsemble: correctly classified samples are successively removed from the training sets of the base classifiers, so the ensemble repeatedly learns from misclassified samples. Consequently, the base classifiers of EasyEnsemble are generated in parallel, whereas those of BalanceCascade are generated serially. Representative algorithms are listed in Table 2; a simplified EasyEnsemble-style sketch follows the table.
| Category | Algorithms | Ensemble frameworks | Combined strategies |
| --- | --- | --- | --- |
| Data driven | AsBagging | Bagging | Bootstrap |
| | UnderBagging | Bagging | Undersampling |
| | OverBagging | Bagging | Oversampling |
| | SMOTEBagging | Bagging | SMOTE |
| | SMOTEBoost | Boosting | SMOTE |
| | RUSBoost | Boosting | Undersampling |
| | EasyEnsemble | Hybrid | Undersampling |
| | BalanceCascade | Hybrid | Undersampling |
| Cost-sensitive | CS-SemiBagging | Bagging | Fuzzy cost matrix |
| | AdaCX | Boosting | Empirical cost matrix |
| | AdaCost | Boosting | Fuzzy cost matrix |
| | DE-CStacking | Stacking | Adaptive cost matrix |
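The sketch below is a simplified, EasyEnsemble-style illustration (not the original algorithm): each base learner sees all minority samples plus a random, balanced subset of the majority class, and the base learners' probabilities are averaged in a parallel, Bagging-like fashion.

```python
# Simplified undersampling + ensembling: AdaBoost base learners trained on
# balanced random subsets, combined by averaging predicted probabilities.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import f1_score

X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

rng = np.random.default_rng(0)
min_idx = np.where(y_tr == 1)[0]
maj_idx = np.where(y_tr == 0)[0]

members = []
for _ in range(10):                                             # 10 balanced base learners
    sub = rng.choice(maj_idx, size=len(min_idx), replace=False) # RUS of the majority
    idx = np.concatenate([min_idx, sub])
    members.append(AdaBoostClassifier(random_state=0).fit(X_tr[idx], y_tr[idx]))

proba = np.mean([m.predict_proba(X_te)[:, 1] for m in members], axis=0)
print("minority F1:", round(f1_score(y_te, (proba >= 0.5).astype(int)), 3))
```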
Cost-sensitive ensemble algorithms combine cost-sensitive learning with ensemble learning. For example, the AdaCX algorithms (Sun et al., 2007) combine cost-sensitive learning with AdaBoost, giving a larger weight to the minority class. Their core idea is that the weight update differs between classes, which amplifies the effect of the cost sensitivity; AdaC1, AdaC2 and AdaC3 were developed from different update rules. Similar algorithms include AdaCost (Zhang, 1999) and the CBS1 and CBS2 algorithms (Ling, 2007). Algorithms based on other frameworks have also been developed, such as CS-SemiBagging (Ren et al., 2018), which is based on the Bagging framework, and DE-CStacking (Gao et al., 2019), which is based on the Stacking framework (Wolpert, 1992). A simplified cost-weighted boosting sketch is given below.
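The following is only a rough sketch of the cost-sensitive boosting idea: here the cost enters solely through class-dependent initial sample weights passed to standard AdaBoost, which simplifies away the modified weight-update rules that distinguish AdaC1, AdaC2 and AdaC3; the cost values are illustrative.

```python
# Cost-weighted boosting baseline: minority samples start with a higher weight.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import recall_score

X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

cost = np.where(y_tr == 1, 5.0, 1.0)       # illustrative per-class misclassification costs
clf = AdaBoostClassifier(n_estimators=100, random_state=0)
clf.fit(X_tr, y_tr, sample_weight=cost / cost.sum())
print("minority recall:", round(recall_score(y_te, clf.predict(X_te)), 3))
```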
Ensemble learning based on decision-adjustment learning has a classical representative in the EnSVM-OTHR algorithm (Yu et al., 2015), which uses SVM-OTHR as the base classifier and Bagging as the learning framework. EnSVM-OTHR uses Bootstrap sampling and random perturbation to enhance the diversity of the base classifiers.
From the above analysis we conclude that ensemble learning is well suited to the class imbalance problem; for linearly inseparable data in particular, it yields better classification results, and ensemble-based class imbalanced learning will remain one of the main research directions (Tsai and Liu, 2021). However, ensemble learning has the disadvantages of long training time and high computational complexity, and handling high-dimensional, large-scale data remains a bottleneck. Ensemble learning therefore faces new challenges and opportunities in the era of big data. To address this, ensemble learning can be combined with feature extraction to reduce the data dimension, or the computational burden can be handled with distributed computing (Yang, 1997; Guo et al., 2018).
To sum up, class imbalanced learning methods have been analyzed from two different motivations. Although the methods start from different ideas, they pursue the same goal: both data-driven and algorithm-driven methods seek maximum accuracy over all classes. Data-driven methods are thus essentially equivalent to the cost-sensitive techniques among the algorithm-driven methods. For example, random oversampling, by generating minority-class cases until the class sizes are balanced, is for some classifiers equivalent to assigning the minority class a misclassification cost of IR times that of the majority class. Likewise, informed (manual) resampling methods resemble fuzzy cost-sensitive algorithms, since both use prior information about the samples to generate minority cases or to derive the cost matrix.
Based on the above analysis, the following experiments were designed:
● Experimental environment: Python 3.8.5 (64-bit), the scikit-learn (sklearn) module, a decision tree classifier (DT) with default parameters.
● Datasets from the KEEL repository, listed in Table 3; the ratio of the training set to the test set is 7:3, denoted "Tra:Tes".
Data sets | Variates | Size | IR | Tra: Tes |
yeast | 8 | 514 | 9.08 | 7:3 |
glass | 9 | 314 | 3.20 | 7:3 |
cleveland | 13 | 177 | 16.62 | 7:3 |
vehicle | 18 | 846 | 3.25 | 7:3 |
We ran the oversampling experiment 10 times for each dataset, recorded the results of four randomly chosen runs and their average, numbered "1", "2", "3", "4" and "average". As a contrast, we also ran an empirically weighted cost-sensitive experiment, numbered "cost-sen". The results, shown in Table 4, indicate that the cost-sensitive experiment can achieve classification results similar to those of the oversampling experiments. A simplified version of this comparison is sketched after Table 4.
| Dataset | Item | Precision | Recall | F1-Score | Accuracy |
| --- | --- | --- | --- | --- | --- |
| yeast | 1 | 0.9231 | 0.5714 | 0.7059 | 0.9355 |
| | 2 | 0.8462 | 0.6471 | 0.7333 | 0.9484 |
| | 3 | 0.9231 | 0.5714 | 0.7059 | 0.9355 |
| | 4 | 0.8462 | 0.5500 | 0.6667 | 0.9290 |
| | average | 0.8847 | 0.5850 | 0.7030 | 0.9371 |
| | cost-sen | 0.9231 | 0.5714 | 0.7059 | 0.9355 |
| glass | 1 | 0.8125 | 1.0000 | 0.8966 | 0.9538 |
| | 2 | 0.8750 | 1.0000 | 0.9333 | 0.8769 |
| | 3 | 0.9375 | 1.0000 | 0.9677 | 0.9846 |
| | 4 | 0.9375 | 1.0000 | 0.9677 | 0.9846 |
| | average | 0.8906 | 1.0000 | 0.9413 | 0.9500 |
| | cost-sen | 0.8750 | 1.0000 | 0.9333 | 0.8769 |
| cleveland | 1 | 0.6667 | 0.3333 | 0.4444 | 0.9038 |
| | 2 | 0.3333 | 0.2500 | 0.2857 | 0.9038 |
| | 3 | 0.3333 | 0.2000 | 0.2500 | 0.8846 |
| | 4 | 0.3333 | 0.2500 | 0.2857 | 0.9038 |
| | average | 0.4167 | 0.2583 | 0.3165 | 0.8990 |
| | cost-sen | 0.3333 | 0.2500 | 0.2857 | 0.9038 |
| vehicle | 1 | 0.8788 | 0.9062 | 0.8923 | 0.9449 |
| | 2 | 0.8333 | 0.9016 | 0.8661 | 0.9331 |
| | 3 | 0.8636 | 0.9194 | 0.8906 | 0.9449 |
| | 4 | 0.8485 | 0.9180 | 0.8819 | 0.9409 |
| | average | 0.8561 | 0.9113 | 0.8827 | 0.9410 |
| | cost-sen | 0.8182 | 0.9153 | 0.8640 | 0.9331 |
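A simplified version of the comparison behind Table 4 is sketched below; it is not the authors' exact script, a synthetic dataset stands in for the KEEL datasets of Table 3, and the imbalanced-learn package is assumed for random oversampling.

```python
# One oversampling run vs. the empirically weighted cost-sensitive run ("cost-sen").
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import precision_score, recall_score, f1_score, accuracy_score
from imblearn.over_sampling import RandomOverSampler

X, y = make_classification(n_samples=514, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

def report(tag, clf, X_fit, y_fit):
    pred = clf.fit(X_fit, y_fit).predict(X_te)
    scores = [round(m(y_te, pred), 4)
              for m in (precision_score, recall_score, f1_score, accuracy_score)]
    print(tag, scores)

# Oversampling run (one of the repeated trials).
X_os, y_os = RandomOverSampler(random_state=1).fit_resample(X_tr, y_tr)
report("oversampling:", DecisionTreeClassifier(random_state=1), X_os, y_os)

# Cost-sensitive run with empirically weighted per-class costs.
ir = (y_tr == 0).sum() / (y_tr == 1).sum()
report("cost-sen:", DecisionTreeClassifier(class_weight={0: 1, 1: ir}, random_state=1),
       X_tr, y_tr)
```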
After this analysis, we can summarize a general procedure for handling imbalanced datasets, shown in Figure 2.

We can thus draw the following conclusions. If a dataset is class imbalanced but the classes do not overlap, standard classifiers may be unaffected. When the classes overlap in the sample space, the samples in the overlapping region are hard to categorize, and the decision rule based on posterior (inverse) probability favours the majority class; in this case the data-driven or algorithm-driven methods discussed above can be employed. When the dataset exhibits interclass aggregation, a single classifier has difficulty distinguishing the samples in the minor sub-clusters of the minority class, so ensemble-based class imbalanced learning can be used, which may improve the accuracy of every class compared with a single classifier. In addition, once a classifier has been obtained, the decision threshold can be adjusted empirically through decision-adjustment learning, which may yield further improvement. The whole process is illustrated in Figure 2.
For evaluating the results of different classifiers, a range of indices, threshold-based, probability-based and rank-based, can be found in the literature (Luque et al., 2019). However, some indices used for standard classifiers are unsuitable for class imbalanced classifiers. Robust indices such as the F-measure, the G-mean, the MCC and the AUC are usually used instead; these indices are derived from the confusion matrix (Table 5).

Explanation of Table 5: TP (TN) is the number of samples that belong to the positive (negative) class and are classified as positive (negative), i.e., the number of correctly classified samples; FP (FN) is the number of samples that belong to the negative (positive) class but are classified as positive (negative), i.e., the number of misclassified samples.
| | Prediction positive | Prediction negative |
| --- | --- | --- |
| Positive class | True positives (TP) | False negatives (FN) |
| Negative class | False positives (FP) | True negatives (TN) |
Researchers have constructed a series of measures from these counts, such as Precision, Recall and TNR:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$

$$\text{Precision} = \frac{TP}{TP + FP} \tag{2}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{3}$$

$$\text{TNR} = \frac{TN}{TN + FP} \tag{4}$$

$$\text{G-Mean} = \sqrt{\text{TPR} \times \text{TNR}} \tag{5}$$

$$F\text{-Measure} = \frac{(1 + \beta^{2}) \times \text{Precision} \times \text{Recall}}{\beta^{2} \times \text{Precision} + \text{Recall}} \tag{6}$$

$$F_{1}\text{-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{7}$$

$$\text{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{8}$$
The Recall in Eq. (3) is also called the true positive rate (TPR).

The G-mean is the geometric mean of the accuracies of the positive and the negative class, and it reaches its optimum when the two class-wise accuracies are balanced. The F-Measure follows a similar principle: when Precision and Recall are roughly equal, the F-Measure also reaches its optimum. The MCC measures the correlation between the true and the predicted labels and is not affected by class imbalance; as a correlation coefficient, its value lies between -1 and 1. The AUC is the area under the ROC curve, which plots TPR against FPR. A sketch of computing these measures is given below.
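The sketch below shows how the confusion-matrix based measures of Eqs. (1)-(8) and the AUC can be computed with scikit-learn and NumPy on a small, made-up set of labels and scores.

```python
# Computing the imbalance-aware evaluation measures from a confusion matrix.
import numpy as np
from sklearn.metrics import confusion_matrix, matthews_corrcoef, roc_auc_score

y_true  = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0])          # toy labels
y_score = np.array([0.9, 0.8, 0.3, 0.4, 0.2, 0.1, 0.2, 0.3, 0.1, 0.6])
y_pred  = (y_score >= 0.5).astype(int)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
precision = tp / (tp + fp)
recall    = tp / (tp + fn)                                   # TPR, Eq. (3)
tnr       = tn / (tn + fp)                                   # Eq. (4)
g_mean    = np.sqrt(recall * tnr)                            # Eq. (5)
f1        = 2 * precision * recall / (precision + recall)    # Eq. (7)
mcc       = matthews_corrcoef(y_true, y_pred)                # Eq. (8)
auc       = roc_auc_score(y_true, y_score)                   # area under the ROC curve
print(dict(precision=round(precision, 3), recall=round(recall, 3),
           g_mean=round(g_mean, 3), f1=round(f1, 3),
           mcc=round(mcc, 3), auc=round(auc, 3)))
```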
Class imbalanced learning has produced many mature methods for binary data, and many algorithms and tools are used in a variety of applications. In the era of big data, however, class imbalanced learning methods face new challenges (Leevy et al., 2018; Chandresh et al., 2016):
● Large-scale data processing: overcoming the increasing computational complexity and memory consumption.
● High-dimensional data processing: handling sparse data.
● Data stream processing: developing scalable online algorithms.
● Missing-label data processing: developing semi-supervised algorithms.
● Multi-class imbalance processing: new definitions of the degree of class imbalance.
● Highly imbalanced data processing: developing algorithms that accurately discriminate the minority samples.
The class imbalance problem remains a research hotspot. The future research prospects are as follows:
● Strengthen theoretical research and enhance the interpretability of the algorithms. So far there is little theoretical research on classification with class imbalanced models; some methods are difficult to interpret and their evaluation is empirical.
● Adapt to current data and keep pace with technological development. Complex data cause many traditional methods to fail, so auxiliary technologies such as feature creation, feature extraction and active learning will be applied more widely in the study of complex data.
In this research, we have attempted to provide a review of methods for the class imbalance problem. Unlike other reviews published in the imbalanced-learning field, methods are reviewed here both from the core technologies, which include resampling and cost-sensitive learning, and from the supporting technologies, which include active learning and others. From our analysis we draw the following conclusions:
● Classifier-oriented data resampling is generally used in the biomedical field, because biomedical data usually have a fixed structure and admit a variety of similarity measures between samples. Cost-sensitive learning is generally used in the operational research field, because its goal is to minimize cost. With the development of data technology, sensors produce high-dimensional, large-scale data; feature-extraction learning reduces the complexity of algorithms by reducing the dimension of high-dimensional data, and distributed computing relieves the memory limitations of single-machine models on large-scale data.
● The class imbalance ratio is not by itself the decisive factor in the performance of a standard classifier: a standard model trained on data whose classes do not overlap in the sample space can still achieve excellent results. Faced with different datasets, researchers choose the processing method according to the data characteristics. For instance, for datasets with interclass aggregation, they often choose ensemble learning and complex classifiers able to distinguish the minor sub-clusters within a class; for datasets with few labels, they choose semi-supervised learning, active learning and other supporting technologies.
● The main challenge in building valid classifiers for class imbalanced datasets is the increasing complexity of data. For example, unstructured data such as speech, text and web pages usually require data cleaning and feature representation, while streaming data generated by sensors require dynamic learning algorithms with strong scalability and non-traditional memory management.
Finally, the future research directions identified in this review will also be the focus of our future work.
This work was supported by NSSF of China (18BTJ029).
All authors declare no conflicts of interest in this paper.