1. Introduction
Cardiovascular disease has long been regarded as one of the most serious and fatal diseases worldwide. Its rising morbidity and mortality place an enormous burden on healthcare systems around the world [1]. Despite the efforts of medical staff to prevent, diagnose, and treat the different types of cardiovascular disease, the number of deaths from cardiovascular disease continues to increase every year, and by 2019 it had reached 18.6 million [2]. The World Health Organization estimates that deaths from cardiovascular disease in 2020 accounted for approximately 32% of all deaths worldwide [3]. According to the NHANES report, the prevalence of cardiovascular disease among adults over 20 years old from 2013 to 2016 was 48%, and the prevalence increased with age [4].
Coronary artery disease (CAD) is one of the most common cardiovascular diseases in clinical practice. Clinically, Coronary Angiography (CAG) is the main technique used to determine the location and extent of arterial stenosis. CAG obtains X-ray images of the coronary arteries after contrast medium is injected through a catheter, which is typically introduced via the femoral artery. The examination is time-consuming, expensive, and invasive, and it demands a high level of technical skill and equipment [5]. Coronary Computed Tomography Angiography (CTA) is an emerging, non-invasive examination technique that obtains clear images of the coronary arteries through intravenous injection of contrast medium followed by spiral CT scanning and computer reconstruction. However, the patient's respiration, heart rate, cardiac function, and other factors can affect the imaging quality, which makes this method less effective than CAG [6]. To avoid the harm that CAG can cause to patients and the sensitivity of CTA to patient factors, researchers have widely applied machine learning and data mining techniques to diagnose CAD [7].
This research builds models for diagnosing stenosis of the left anterior descending (LAD) artery, the left circumflex (LCX) artery, and the right coronary artery (RCA) based on a swarm intelligence optimization algorithm and machine learning techniques. All experiments in this paper are carried out on the extension of the Z-Alizadeh Sani dataset, which is obtained from Mendeley Data [8]. Diagnosing the stenosis of each individual artery indicates the severity of the patient's condition, which can help physicians choose treatments appropriate to the degree of disease. Feature selection based on meta-heuristic optimizers has been proposed in recent years; it can effectively solve global optimization problems and helps avoid falling into local optima [9]. For feature selection, we first apply a filtering method to delete the features whose variance is 0. Then we use the Whale Optimization Algorithm (WOA) with a k-nearest neighbors (KNN) wrapper to select the feature subsets, and use the classification accuracy of KNN together with the number of selected features to assess the quality of each subset [10]. A two-layer stacking model is then established to blend the results of individual and ensemble classifiers. The four best-performing classifiers are selected from KNN, Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), Gradient Boosting Decision Tree (GBDT), Extreme Gradient Boosting (XGBoost), and Adaptive Boosting (AdaBoost) as primary learners, and Logistic Regression (LR) is applied as the secondary learner to reduce the complexity of the model [11]. We adopt accuracy, recall, precision, F1-score, and AUC value as the evaluation metrics of the model.
The rest of the paper is organized as follows. Section 2 provides an overview of relevant studies in previous literature; Section 3 introduces the dataset, the feature selection methods, and the machine learning algorithms; Section 4 reports the results of feature selection and the classification performance of the proposed method; Section 5 summarizes the paper and outlines future work.
2. Relevant studies
In this section, we briefly discuss some relevant studies that use machine learning algorithms, data mining techniques, and improved methods to diagnose and predict diseases.
2.1. Feature selection methods
The methods and results of feature selection determine the classification performance of the model to some extent. In [12,13,17,18,19,20], Roohallah Alizadehsani, Zeinab Arabasadi, et al. adopted the weight-by-SVM method for feature selection, which uses the normal vector coefficients of a linear support vector machine as feature weights. In [14,15], Roohallah Alizadehsani et al. used information gain to measure the importance of features and selected the features whose information gain exceeded a certain value as the feature subset. In [16], Roohallah Alizadehsani et al. measured feature importance by calculating the Gini index of each feature. In [21], Roohallah Alizadehsani et al. proposed the assurance feature selection method, which measures the importance of a feature as the ratio of the number of patients associated with that feature to the total number of patients. In [22], Roohallah Alizadehsani et al. applied the Gini index and principal component analysis (PCA) to calculate feature weights and determined the weight threshold for feature selection through experiments. In all of these studies, feature selection was implemented by calculating and evaluating the importance of single features. Such methods are computationally fast and easy to implement, and they focus on selecting features that have a great impact on disease classification.
Metaheuristic optimization algorithms (MOAs) mainly simulate natural and human intelligence to search for optimal solutions [23]. MOAs can be divided into four main categories: evolutionary, swarm intelligence, human-based, and physics- and chemistry-based algorithms. Among them, Genetic Algorithm (GA) [24], Particle Swarm Optimization (PSO) [25], Sine Cosine Algorithm (SCA) [26], Moth-flame Optimization (MFO) [27], WOA [28], Grey Wolf Optimizer (GWO) [29], and other algorithms are widely used for feature selection.
Moloud Abdar et al. [30] used three types of SVM to establish models for CAD diagnosis and compared the performance of GA and PSO for feature selection and model parameter optimization in parallel. This method could simultaneously select the optimal feature subset and parameter combination of the model. Bayu Adhi Tama et al. [5] combined Correlation-based Feature Selection (CFS) with a Credal Decision Tree (CDT)-based binary PSO (BPSO) to identify important features; CFS identifies the features that are unimportant or unnecessary for classification. Shafaq Abbas et al. [31] applied an Extremely Randomized Tree (ERT)-based WOA to conduct feature selection and classification on a breast cancer dataset. Hoda Zamani et al. [32] proposed the FSWOA algorithm for feature selection, which achieved effective dimensionality reduction on medical datasets. In these studies, researchers combined MOAs with different machine learning algorithms as wrappers to achieve feature selection for disease diagnosis problems. Preliminarily filtering out redundant features in the dataset can improve the initial population of the MOA and the efficiency of feature selection.
Since most metaheuristic algorithms are designed for continuous problems, researchers have used transfer functions to convert each dimension of the solution vectors into binary form for feature selection, especially on medical data [33]. E. Emary et al. [34] proposed two binary-conversion methods for the GWO algorithm, the second of which uses an S-shaped transfer function to convert the updated gray wolf position vector into binary form. Shokooh Taghian et al. [35] proposed the SBSCA and VBSCA methods for feature selection, which use S-shaped and V-shaped transfer functions, respectively, to binarize SCA. Mohammad H. Nadimi-Shahraki et al. [23] improved the MFO algorithm using S-shaped, V-shaped, and U-shaped transfer functions for feature selection on medical datasets.
2.2. Machine learning methods
In previous studies, researchers have applied a variety of basic and ensemble classification algorithms to establish CAD diagnostic models. In [12,13], Roohallah Alizadehsani et al. introduced a cost-sensitive algorithm into model construction and used 10-fold cross-validation (cv) to evaluate the classification performance of the Sequential Minimal Optimization (SMO) algorithm for CAD. In [15], Roohallah Alizadehsani et al. used an ensemble learning method to combine the classification results of the SMO algorithm and Naïve Bayes (NB) to diagnose CAD. In [16], the researchers used a bagging algorithm to obtain high accuracy in the diagnosis of LAD stenosis. In [18,20,21], Roohallah Alizadehsani et al. used SVM to establish diagnostic models for CAD and main coronary artery stenosis. Zeinab Arabasadi et al. [19] applied a neural network to establish a CAD diagnostic model and adjusted the weights of the network through GA to improve classification performance. In [22], an improved SVM algorithm was used to diagnose LAD, LCX, and RCA stenosis; the researchers combined the distance between the sample and the separating hyperplane with the accuracy of the classifier to improve the model's performance. Md Mamun Ali et al. [36] applied KNN, DT, and RF to establish disease diagnosis models. Bayu Adhi Tama et al. [5] built a two-layer stacking model to diagnose CAD: RF, GBDT, and XGBoost produced classification results in the first layer, and a Generalized Linear Model (GLM) in the second layer generated the final predictions.
The above summary shows that few studies address the diagnosis of stenosis in each individual main coronary artery. In this study, we apply machine learning and data mining algorithms to diagnose stenosis of each main coronary artery. We divide the data into a training set and a test set at a ratio of 9:1, combine randomized search and 10-fold cv to train the model on the training set, and then use the trained model to make predictions on the test set. We apply filtering and the KNN-based WOA to select the optimal feature subset for each main coronary artery and then build a two-layer stacking model on the selected feature subset. Finally, we compare the performance of the proposed method with the classification performance reported in the existing literature.
3. Materials and methods
3.1. Dataset
The extension of the Z-Alizadeh Sani dataset contains the clinical records of 303 patients, each described by 55 features [8]. LAD, LCX, and RCA stenosis were present in 160, 109, and 101 patients, respectively, and a total of 216 patients were diagnosed with CAD. The features can be divided into four groups: demographic features, symptoms and physical examination, electrocardiography (ECG), and laboratory and echocardiography [13]. Table 1 shows the information of the features in the extension of the Z-Alizadeh Sani dataset.
3.2. Algorithm flow
The data mining process starts with preprocessing, continues with feature engineering, and ends with the use of machine learning algorithms to establish models. The algorithm flow of this research is shown in Figure 1.
3.3. Data preprocessing methods
In one-hot encoding, each category of features is represented by a vector whose length is equal to the number of categories. The i-th vector only takes the value of 1 at the i-th component, and the rest are all 0. One-hot encoding of features can extend the value of discrete features to Euclidean space, making the distance calculation between features more reasonable.
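As a minimal illustration of one-hot encoding, the following pure-Python sketch maps each category to an indicator vector. The helper name `one_hot` and the sample values are assumptions made for this example; they are not taken from the authors' implementation.

```python
def one_hot(values):
    """Return (categories, encoded rows) for a list of categorical values."""
    # Sort the distinct categories so that the column order is deterministic.
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    encoded = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1  # only the component of the matching category is 1
        encoded.append(row)
    return categories, encoded


# Hypothetical values of a three-category feature.
categories, rows = one_hot(["N", "LBBB", "N", "RBBB"])
print(categories)  # ['LBBB', 'N', 'RBBB']
print(rows)        # [[0, 1, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

Each encoded row contains exactly one 1, so any two different categories are the same distance apart.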
Standardization rescales the values of each numerical feature to mean 0 and variance 1. The standardization formula is shown in Eq (1).
\[ x' = \frac{x - \bar{x}}{\sigma} \tag{1} \]
Here, x is an instance in an n-dimensional space, n is the number of features, and $\bar{x}$ and $\sigma$ represent the mean and standard deviation of each feature [37].
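The standardization step can be sketched for a single feature column as follows. This is an illustration only: the function name `standardize` and the sample values are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator is used.

```python
import statistics


def standardize(column):
    """Scale a list of numbers to mean 0 and (population) variance 1."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)  # assumption: population standard deviation
    if std == 0:
        raise ValueError("a zero-variance feature cannot be standardized")
    return [(x - mean) / std for x in column]


ages = [50.0, 60.0, 70.0]  # made-up values for illustration
z = standardize(ages)
print(z)
```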
3.4. Performance evaluation measures
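In this research, we use accuracy, precision, recall, F1-score, and AUC value to evaluate the classification performance of the proposed model. The calculation formulas are shown in Eqs (2)–(5) [38].
\[ \mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{2} \]
\[ \mathrm{Precision} = \frac{TP}{TP + FP} \tag{3} \]
\[ \mathrm{Recall} = \frac{TP}{TP + FN} \tag{4} \]
\[ \mathrm{F1} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{5} \]
Here, TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.
AUC (area under the curve) is defined as the area under the ROC curve, which is usually greater than 0.5 and less than 1. The larger the AUC value, the better the classification performance of the classifier.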
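The evaluation measures can be computed from predicted labels and scores as in the sketch below. The function names and the toy labels are assumptions for illustration. The AUC is computed here through its rank interpretation, that is, the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, which equals the area under the ROC curve.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1-score for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # A tie between a positive and a negative score counts as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and scores, invented for this example.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]
metrics = classification_metrics(y_true, y_pred)
area = auc(y_true, scores)
print(metrics, area)
```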
3.5. Whale Optimization Algorithm (WOA)
WOA is a meta-heuristic global optimization algorithm based on swarm intelligence, proposed by Mirjalili and Lewis [28]. The algorithm is inspired by the bubble-net foraging behavior of humpback whales and searches for the optimal solution by simulating this behavior. Humpback whales hunt by continuously shrinking the encirclement around their prey, spirally updating their positions, and searching for prey randomly [39].
The algorithm mainly includes two stages: first, achieve the encirclement of the prey, and update the spiral position (also known as hunting behavior); second, search for the prey randomly [39]. Next, we will introduce each stage in detail:
1) Surround the prey. Humpback whales can identify the location of their prey and circle around it. In the initial stage of the algorithm, the location of the optimal solution in the search space is unknown, so WOA assumes that the best candidate solution obtained so far is the target solution or is close to the optimal solution. Once the best candidate solution is defined, the other whales attempt to move towards it and update their positions. This process is represented by Eq (6):
\[ \vec{D} = \left| \vec{C} \cdot \vec{P}^{*}(t) - \vec{P}(t) \right|, \qquad \vec{P}(t+1) = \vec{P}^{*}(t) - \vec{A} \cdot \vec{D} \tag{6} \]
Here, t is the current iteration, $\vec{P}^{*}(t)$ is the position vector of the best solution found so far, $\vec{P}(t)$ is the current position vector, $\vec{D}$ is the distance to the best solution, and $\vec{A}$ and $\vec{C}$ are coefficient vectors calculated by Eqs (7) and (8):
\[ \vec{A} = 2\vec{a} \cdot \vec{r} - \vec{a} \tag{7} \]
\[ \vec{C} = 2\vec{r} \tag{8} \]
In the above equations, $\vec{a}$ decreases linearly from 2 to 0 over the iterations, and $\vec{r}$ is a random vector in the range [0, 1].
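2) Hunting behavior. Humpback whales hunt by swimming towards their prey in a spiral motion. The mathematical model of hunting behavior is as follows:
\[ \vec{P}(t+1) = \vec{D}' \cdot e^{bl} \cdot \cos(2\pi l) + \vec{P}^{*}(t), \qquad \vec{D}' = \left| \vec{P}^{*}(t) - \vec{P}(t) \right| \tag{9} \]
Here, $\vec{D}'$ is the distance between the whale and the prey, b is a constant used to define the shape of the logarithmic spiral, and l is a random number in the range [-1, 1]. During hunting, each humpback whale randomly chooses either to shrink the encirclement or to spiral upward to chase the prey, each with a probability of 50%. This behavior is simulated by the following mathematical model:
\[ \vec{P}(t+1) = \begin{cases} \vec{P}^{*}(t) - \vec{A} \cdot \vec{D}, & p < 0.5 \\ \vec{D}' \cdot e^{bl} \cdot \cos(2\pi l) + \vec{P}^{*}(t), & p \geq 0.5 \end{cases} \tag{10} \]
where p is a random number in the range [0, 1].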
3) Search for prey. The algorithm searches for prey according to the value of $\vec{A}$. When $|\vec{A}| > 1$, the algorithm randomly selects a search individual and updates the positions of the other individuals according to its location. This forces the whales to move away from the current prey in order to find a more suitable one, which gives WOA its global search ability. When $|\vec{A}| < 1$, the whales attack the prey. The mathematical model is shown in Eq (11):
\[ \vec{D} = \left| \vec{C} \cdot \vec{P}_{rand} - \vec{P}(t) \right|, \qquad \vec{P}(t+1) = \vec{P}_{rand} - \vec{A} \cdot \vec{D} \tag{11} \]
Here, $\vec{P}_{rand}$ is the position vector of a whale randomly selected from the population.
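The three behaviors described above can be combined into a single position update, sketched below in pure Python. This is a simplified illustration under stated assumptions, not the authors' implementation: the function name `woa_step` is hypothetical, one scalar value of A and C is drawn per whale instead of a full random vector, b is fixed to 1, and positions are clipped to [0, 1] as a simple form of boundary handling.

```python
import math
import random


def woa_step(position, best, population, t, max_iter, rng, b=1.0):
    """Return the next position of one whale at iteration t."""
    a = 2.0 - 2.0 * t / max_iter        # decreases linearly from 2 to 0
    A = 2.0 * a * rng.random() - a      # assumption: scalar stand-in for vector A
    C = 2.0 * rng.random()              # assumption: scalar stand-in for vector C
    if rng.random() < 0.5:
        # |A| < 1: shrink towards the best solution (surround the prey);
        # otherwise move relative to a randomly chosen whale (search for prey).
        target = best if abs(A) < 1.0 else rng.choice(population)
        new = [tj - A * abs(C * tj - pj) for tj, pj in zip(target, position)]
    else:
        # Spiral update around the best solution (hunting behavior).
        l = rng.uniform(-1.0, 1.0)
        spiral = math.exp(b * l) * math.cos(2.0 * math.pi * l)
        new = [abs(bj - pj) * spiral + bj for bj, pj in zip(best, position)]
    # Assumption: clip every component to the [0, 1] bounds of the solutions.
    return [min(1.0, max(0.0, v)) for v in new]


rng = random.Random(42)
population = [[rng.random() for _ in range(5)] for _ in range(4)]
best = population[0]  # in a real run, the whale with the lowest fitness
moved = woa_step(population[1], best, population, t=10, max_iter=100, rng=rng)
print(moved)
```

In a full optimizer this step would be applied to every whale in every iteration, with the best solution re-evaluated after each pass.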
In this study, we apply the KNN-based WOA for feature selection. WOA adaptively searches for the feature subset that maximizes classification accuracy, and KNN is used to evaluate the quality of each candidate subset. In WOA, each whale starts from an arbitrary point in the search space and continuously moves towards the best candidate solution. Each solution obtained by the algorithm is a continuous vector of the same dimension, bounded in [0, 1] [39]. Feature selection is achieved by thresholding the solution vector to convert it into binary form. In this study, the threshold is set to 0.5: a component becomes 1 when it is greater than 0.5 and 0 when it is less than 0.5. Each binary solution has length M, where M is the total number of features; 1 means that the feature at the corresponding position is selected, and 0 means that the feature is discarded. Multiple solution vectors are obtained by changing the initial population size and the number of iterations. The quality of a solution is evaluated by the fitness function shown in Eq (12).
\[ f = \alpha E + (1 - \alpha) \frac{m}{M} \tag{12} \]
where f is the fitness of a given solution vector of size M, m is the number of selected features, E is the classification error rate of the classifier, and α is a constant that balances the error rate of the classifier against the number of selected features [10]. The smaller the fitness value, the better the feature subset and the closer it is to the optimal solution. In this research, E is the classification error rate of KNN, and α is set to 0.99. The pseudocode of the WOA feature selection algorithm is shown in Algorithm 1.
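The binary conversion and the fitness evaluation can be sketched as follows. The sketch assumes the commonly used wrapper fitness f = αE + (1 − α)m/M, which is consistent with the symbols defined above. The function names and the error rate of 0.10 are placeholders, since in the actual method E would come from a KNN classifier evaluated on the selected features.

```python
def binarize(solution, threshold=0.5):
    """Map a continuous solution in [0, 1] to a 0/1 feature mask."""
    # Assumption: a component exactly equal to the threshold is not selected.
    return [1 if value > threshold else 0 for value in solution]


def fitness(error_rate, mask, alpha=0.99):
    """Weighted sum of the error rate and the selected-feature ratio."""
    m = sum(mask)   # number of selected features
    M = len(mask)   # total number of features
    return alpha * error_rate + (1.0 - alpha) * m / M


solution = [0.91, 0.12, 0.55, 0.49, 0.78]   # made-up whale position
mask = binarize(solution)
score = fitness(error_rate=0.10, mask=mask)  # placeholder KNN error rate
print(mask)   # [1, 0, 1, 0, 1]
print(score)
```

With α = 0.99, the error rate dominates the fitness, and the number of features mainly breaks ties between subsets of similar accuracy.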
3.6. Machine learning algorithms
In this study, we construct a two-layer stacking model for diagnosing each main coronary artery stenosis. The primary learners in this model are selected from KNN, SVM, DT, RF, GBDT, XGBoost, and AdaBoost, and LR is used as a secondary learner to blend the classification results of multiple primary learners to obtain the final prediction results.
LR is a generalized linear model used to solve binary classification problems. The output of the linear model is passed through the sigmoid function and mapped into (0, 1) for binary classification [36]. KNN is a supervised algorithm that assigns a new sample to a category according to the categories of the k points closest to it [40]. SVM is a binary classification model with strong performance and flexibility, which can minimize both empirical and structural risks. For a sample set in a finite-dimensional space, SVM performs classification by mapping the sample set from the original feature space to a high-dimensional space [41]. DT is a non-parametric supervised learning method that builds a tree by repeatedly selecting the optimal feature to split the training set. ID3, C4.5, and CART are the three main DT algorithms [42]. RF is a special bagging method [43]. A decision tree is constructed for each bootstrap-sampled training set. When a node is split, a subset of features is randomly drawn from all the features, and the best split among the drawn features is applied to the node. GBDT uses an additive model and a forward stagewise algorithm for greedy learning, and in each iteration it learns a CART tree to fit the residuals between the predictions of the previous (t-1) trees and the true values of the training samples [44]. XGBoost is an optimized distributed gradient boosting library [45] that supports multiple types of base classifiers. When CART is used as the base classifier, XGBoost improves generalization by adding a regularization term that controls the complexity of the model. AdaBoost is an iterative algorithm that works by changing the weight distribution of the dataset. The weights of the training samples misclassified by the previous classifier are increased, and the weights of the correctly classified samples are decreased. The reweighted dataset is then passed to the next classifier for training. Finally, the algorithm combines the classifiers obtained in each round into the final decision classifier [46].
Stacking is an ensemble learning algorithm that learns how best to combine the predictions of multiple well-performing machine learning models. In a stacking model, the base learners are called primary learners, and the learner used for blending is called the secondary learner or meta-learner. Specifically, the original dataset is divided into several subsets, which are fed to each primary learner in the first layer to produce predictions by k-fold cv. The outputs of the primary learners in the first layer are then used as the inputs of the secondary learner in the second layer. The final prediction is obtained by applying the trained model to the test set [47]. The algorithm flow of the stacking model is shown in Figure 2.
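The two-layer procedure can be sketched in pure Python as follows. This is an illustration under strong simplifications rather than the authors' implementation: two toy single-feature learners stand in for the primary learners (KNN, SVM, and so on), the data are synthetic, all function names are hypothetical, and the logistic-regression meta-learner is trained by plain stochastic gradient descent.

```python
import math
import random


def sigmoid(v):
    v = max(-60.0, min(60.0, v))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-v))


def fit_centroid(X, y, feature):
    """Toy primary learner: score one feature relative to the class means."""
    pos = [row[feature] for row, label in zip(X, y) if label == 1]
    neg = [row[feature] for row, label in zip(X, y) if label == 0]
    mu_pos, mu_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    mid = (mu_pos + mu_neg) / 2.0
    sign = 1.0 if mu_pos >= mu_neg else -1.0
    return lambda row: sigmoid(sign * (row[feature] - mid))


def out_of_fold(fit, X, y, k):
    """Predict every training row with a model that never saw that row."""
    preds = [0.0] * len(X)
    for fold in range(k):
        kept = [i for i in range(len(X)) if i % k != fold]
        model = fit([X[i] for i in kept], [y[i] for i in kept])
        for i in range(fold, len(X), k):
            preds[i] = model(X[i])
    return preds


def fit_logistic(Z, y, lr=0.5, epochs=200):
    """Secondary learner: logistic regression trained by gradient descent."""
    w, bias = [0.0] * len(Z[0]), 0.0
    for _ in range(epochs):
        for z, label in zip(Z, y):
            err = sigmoid(bias + sum(wi * zi for wi, zi in zip(w, z))) - label
            w = [wi - lr * err * zi for wi, zi in zip(w, z)]
            bias -= lr * err
    return lambda z: sigmoid(bias + sum(wi * zi for wi, zi in zip(w, z)))


# Synthetic two-feature data, invented for this example.
rng = random.Random(0)
y = [i % 2 for i in range(40)]
X = [[label + rng.gauss(0, 0.4), 2 * label + rng.gauss(0, 0.8)] for label in y]
X_train, y_train, X_test, y_test = X[:30], y[:30], X[30:], y[30:]

learners = [lambda A, b: fit_centroid(A, b, 0),
            lambda A, b: fit_centroid(A, b, 1)]
# First layer: 5-fold out-of-fold predictions become the meta-features.
columns = [out_of_fold(fit, X_train, y_train, k=5) for fit in learners]
meta = fit_logistic([list(z) for z in zip(*columns)], y_train)
# Refit the primary learners on all training data, then predict the test set.
models = [fit(X_train, y_train) for fit in learners]
test_pred = [1 if meta([m(row) for m in models]) > 0.5 else 0 for row in X_test]
accuracy = sum(p == t for p, t in zip(test_pred, y_test)) / len(y_test)
print(accuracy)
```

Using out-of-fold predictions as meta-features keeps the secondary learner from being trained on outputs that the primary learners produced for rows they had already seen.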
4. Experimental results
In this section, we systematically describe the implementation of the proposed method on the extension of the Z-Alizadeh Sani dataset and report the results. All experiments in this research are carried out on a Windows machine with 8 GB of memory and an Intel(R) Core(TM) i5-7200U CPU @ 2.50 GHz. The entire experiment is implemented in Python 3.8 using Jupyter Notebook.
4.1. Data preprocessing
First, for the categorical features, one-hot encoding is performed on the feature Bundle Branch Block (BBB) to obtain three binary features: BBB_LBBB, BBB_RBBB, and BBB_N. For the feature Valvular Heart Disease (VHD), the values "Normal", "Mild", "Moderate", and "Severe" are encoded as 0, 1, 2, and 3, respectively. Second, the features Function Class and Region with Regional wall motion abnormality (Region RWMA) are processed according to the discretization ranges provided in the Braunwald heart book: a value of zero is recorded as "Normal", and a non-zero value is recorded as "High" [48]. Then, all the categorical features with two values are transformed into numerical values. Next, the dataset is divided at a ratio of 9:1 to obtain training data and test data for LAD, LCX, and RCA diagnosis, respectively. The training data are used to develop the model, while the test data are used to evaluate its classification performance. Finally, the training set is standardized, and the mean and variance of the training set are used to standardize the test set. In the three labels LAD, LCX, and RCA, "Stenotic" is marked as 1 and "Normal" as 0.
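The split and the reuse of training statistics can be sketched as follows. The helper names, the seed, and the placeholder values are assumptions for illustration, and the sketch uses a plain random split because the text does not state whether the split is stratified.

```python
import random
import statistics


def split_9_to_1(rows, seed=0):
    """Shuffle the row indices and hold out 10% of the rows as the test set."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = round(0.1 * len(rows))
    test = [rows[i] for i in indices[:n_test]]
    train = [rows[i] for i in indices[n_test:]]
    return train, test


def fit_scaler(train_column):
    """Estimate the mean and standard deviation on the training data only."""
    return statistics.fmean(train_column), statistics.pstdev(train_column)


def apply_scaler(column, mean, std):
    return [(x - mean) / std for x in column]


# Placeholder feature column with one value per patient (303 rows).
values = [float(v) for v in range(303)]
train, test = split_9_to_1(values)
mean, std = fit_scaler(train)
train_z = apply_scaler(train, mean, std)
test_z = apply_scaler(test, mean, std)  # the test set reuses training statistics
print(len(train), len(test))  # 273 30
```

Fitting the scaler on the training set alone keeps information about the test set out of model development.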
4.2. Results of feature selection
In this study, we first use the filtering method to delete the feature Exertional CP, whose variance is 0, and then use the KNN-based WOA to select the feature subsets for diagnosing each main coronary artery. The parameters of the feature selection algorithm are shown in Table 2.
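The zero-variance filter can be sketched as follows. The function name and the three example rows are made up for illustration; only the feature name Exertional CP comes from the text.

```python
import statistics


def drop_constant_features(rows, names):
    """Remove the columns whose variance is 0 (all rows hold the same value)."""
    keep = [j for j in range(len(names))
            if statistics.pvariance([row[j] for row in rows]) > 0]
    kept_names = [names[j] for j in keep]
    kept_rows = [[row[j] for j in keep] for row in rows]
    return kept_names, kept_rows


names = ["Age", "Exertional CP", "BMI"]
rows = [[53, 0, 29.4], [67, 0, 22.0], [54, 0, 25.1]]  # made-up records
kept_names, kept_rows = drop_constant_features(rows, names)
print(kept_names)  # ['Age', 'BMI']
```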
We first compare the feature selection results of the KNN-based WOA, BPSO, GA, and binary GWO (BGWO). The number of iterations and the initial population size of each algorithm are the same as those of WOA, as shown in Table 2, and each method is run 60 times. Table 3 shows the average fitness of the feature subsets obtained by these four algorithms, together with the average classification accuracy and the average AUC value on the validation set. Table 3 shows that the KNN-based WOA method has the best performance, and the Friedman test results show that there are significant differences in the performance of the different feature selection methods. In the following tables, bold font indicates the best result.
Next, we compare WOA feature selection methods based on different wrappers. We choose SVM, DT, and RF to compare with KNN, and run each method 60 times to generate multiple feature subsets. The average performance of these four wrappers on the validation set and the Friedman test results are given in Table 4. Table 4 indicates that the KNN-based WOA is the most suitable for the feature selection problem in this study: WOA has low complexity, fast convergence, and good optimization performance, while KNN has low computational complexity and is easy to repeat.
In the feature selection process, we obtain multiple feature subsets by setting different numbers of iterations and initial population sizes for WOA. We run the algorithm 20 times for each parameter combination to find the corresponding optimal feature subset, for a total of 120 runs over all parameter combinations. Table 5 gives the number of features, the classification accuracy, and the AUC value on the test sets for the optimal feature subset found under each parameter combination.
For the diagnosis of stenosis in the three main coronary arteries, the best-performing feature subsets are obtained under the parameter combinations (N = 30, T = 200), (N = 20, T = 400), and (N = 20, T = 300), respectively. The KNN-based WOA feature selection method yields 17 features for each of the LAD, LCX, and RCA diagnoses. The results of feature selection and the Pearson correlation coefficients between the features and the labels are shown in Table 6.
4.3. Classification results
In this section, we first train multiple basic classifiers, then evaluate and compare the performance of these classifiers. The parameters of each classifier are adjusted on the training set by randomized search combined with 10-fold cv, and the average AUC value is taken as the parameter tuning metric to obtain the optimal hyperparameter combination. We compare the KNN, SVM, DT, RF, GBDT, XGBoost, and AdaBoost algorithms to select four best classifiers as the primary learners to build stacking models. The performance of each algorithm on each main coronary artery test set is shown in Table 7.
According to the "strong-strong combination" strategy of the stacking model, we should choose classifiers with good performance while maintaining the diversity of the primary learners. Based on the performance indicators of each machine learning algorithm in the above table, we choose SVM, GBDT, XGBoost, and AdaBoost as primary learners for LAD diagnosis and KNN, SVM, GBDT, and AdaBoost for both LCX and RCA diagnosis, and we use LR as the secondary learner of all three models. For these three stacking models, we compare the performance of the stacking algorithm with 3-fold, 5-fold, and 8-fold cv to select the best number of cross-validation folds. Moreover, we generate three sets of 10 random numbers between 0 and 1000 to control the random state of the stacking algorithm in each model. We then calculate the average value of the model evaluation indicators over the 10 different random states to test the stability of the models.
Table 8 shows the performance of the model in the diagnosis of main coronary arteries when different cross-validation folds are set for the stacking algorithm. From the data in the table, we can conclude that the 5-fold cv of the stacking algorithm can make our model obtain the best classification performance.
In Tables 9–11, we compare the classification performance of the proposed stacking models with each primary learner on the test set. It can be seen that the stacking models have the highest accuracy and F1-score in all prediction results, and the stacking models can also stabilize the recall, precision, and AUC value in a good range.
Table 9 shows that, compared with other machine learning algorithms, the stacking model has the highest accuracy (89.68%), recall (100%), and F1-score (91.42%) in the diagnosis of LAD stenosis. The recall of 100% is especially notable: it shows that the model identifies all patients with artery stenosis in the LAD test set, which minimizes the possibility of missed diagnosis in patients with CAD. The high accuracy and F1-score indicate that the model also identifies patients without CAD well. In Table 10, compared with other algorithms, the stacking model achieves the highest accuracy (88.71%), precision (83.03%), F1-score (82.41%), and AUC value (0.9019) in the diagnosis of LCX stenosis, indicating that the model can accurately distinguish patients with LCX stenosis from those without. In Table 11, compared with other machine learning algorithms, the stacking model obtains the highest accuracy (85.81%) and F1-score (82.44%) in the diagnosis of RCA stenosis. Its AUC value (0.9252) remains at a relatively good level, which also shows that the model can accurately distinguish patients with RCA stenosis from those without and effectively reduce the possibility of missed diagnosis in patients with CAD. This analysis shows that the stacking model performs better than the individual classifiers and the ensemble classifiers based on bagging and boosting. The stacking model combines the advantages of the primary learners to improve prediction performance.
We compare the classification performance of the model proposed in this paper with the performance of the model established by using the recursive feature elimination cross-validation (RFE-CV) method based on SVM for feature selection [49]. The results are shown in Table 12. It can be seen that the proposed method has better classification performance.
Figure 3 shows the comparison diagrams of the classification performance between the stacking model and each primary learner in diagnosing LAD, LCX and RCA stenosis.
Figure 4 shows the ROC curves of the stacking model and other machine learning algorithms in diagnosing LAD, LCX and RCA stenosis, respectively.
We compare the number of features and classification accuracy of the proposed model in this study with previous studies and show the differences between the methods and results in Table 13. It can be seen that the feature selection method used in this paper selects fewer features, and the classification accuracy of the proposed model is significantly better than other methods in diagnosing individual LAD, LCX and RCA stenosis.
4.4. Application of the proposed model on Cleveland dataset
We apply the proposed model to the well-known Cleveland dataset to diagnose heart disease [51]. The dataset contains 303 patient records, each of which has 13 features for diagnosing heart disease [52]. In this study, we select the 297 records with no missing values to build the model, of which 137 patients were diagnosed with heart disease. We preprocess the Cleveland dataset in the same way as the Z-Alizadeh Sani dataset and apply the KNN-based WOA method for feature selection. We run the algorithm 20 times for each parameter combination and obtain the optimal feature subset under the parameter combination (T = 20, N = 300). The feature subset contains 6 features: resting electrocardiographic results (restecg), maximum heart rate achieved (thalach), exercise induced angina (exang), the slope of the peak exercise ST segment (slope), number of major vessels colored by fluoroscopy (ca), and thallium scan (thal). Then, we build a stacking model based on the selected features and choose KNN, SVM, GBDT, and AdaBoost as primary learners. The proposed method achieves a classification accuracy of 89.67% and an AUC value of 0.9129 on the Cleveland dataset. We compare the classification performance of the stacking model with each primary learner and show the results in Table 14.
Table 15 shows the comparison of the performance of the proposed model and the models in previous studies on the Cleveland dataset. It can be seen from the comparison results that our proposed method selects fewer features on the Cleveland dataset, and the model can achieve high precision and AUC value in the diagnosis of heart disease.
5. Discussion
This paper proposes new models for diagnosing stenosis of the main coronary arteries. We use the KNN-based WOA for feature selection on the extension of the Z-Alizadeh Sani dataset and apply stacking models to diagnose LAD, LCX, and RCA stenosis.
In the feature selection process, first, we delete the feature with zero variance in the dataset. Then, by comparing the feature selection results of multiple meta-heuristic optimization algorithms and different wrappers, the KNN-based WOA method is used to select the optimal feature subsets. By using this method, we obtain three optimal feature subsets for diagnosing each main coronary artery, each of which contains 17 features. According to the results of feature selection, it can be seen that the features Age, BMI, FBS, and HTN appear in the feature subsets of each main coronary artery, which indicates that these features are important indicators affecting CAD.
In the stacking model, we choose different primary learners for the three coronary arteries and use LR as the secondary learner. The average classification performance of the stacking algorithm over multiple random states is calculated to obtain stable results. The diagnostic accuracy of the proposed method for LAD, LCX, and RCA stenosis is 89.68%, 88.71%, and 85.81%, respectively. The classification performance of the stacking model is more stable than that of the other machine learning algorithms. Compared with previous studies, we select relatively fewer features in this study, and the diagnostic accuracy of the proposed model is also significantly improved. Our results show that the proposed method can be applied well to CAD datasets and can provide a reliable and robust model for clinical diagnosis.
In future work, we intend to use an improved WOA to select fewer features and to make accurate predictions of stenosis in each main coronary artery on larger CAD datasets.
Acknowledgments
Thanks to our families and colleagues who supported us morally.
Conflict of interest
All authors declare no conflicts of interest in this paper.