Citation: Peng Zhang, Mingfeng Jiang, Yang Li, Ling Xia, Zhefeng Wang, Yongquan Wu, Yaming Wang, Huaxiong Zhang. An efficient ECG denoising method by fusing ECA-Net and CycleGAN[J]. Mathematical Biosciences and Engineering, 2023, 20(7): 13415-13433. doi: 10.3934/mbe.2023598
Abstract
During wearable electrocardiogram (ECG) acquisition, motion artifacts and other noises are easily introduced. In this paper, a novel end-to-end ECG denoising method is proposed, implemented by fusing the Efficient Channel Attention (ECA-Net) and cycle-consistent generative adversarial network (CycleGAN) methods. The proposed denoising model is optimized by using ECA-Net to highlight the key features and by introducing a new loss function to further extract the global and local ECG features. The original ECG signals come from the MIT-BIH Arrhythmia Database. The noise signals used in this method consist of Gaussian white noise and noises sourced from the MIT-BIH Noise Stress Test Database, including EM (electrode motion artifact), BW (baseline wander) and MA (muscle artifact), as well as mixed noises composed of EM+BW, EM+MA, BW+MA and EM+BW+MA. Corrupted ECG signals were generated by adding different levels of single and mixed noises to clean ECG signals. The experimental results show that the proposed method achieves better denoising performance and generalization ability, with higher signal-to-noise ratio improvement (SNRimp) as well as lower root-mean-square error (RMSE) and percentage-root-mean-square difference (PRD).
1. Introduction
Cardiovascular disease has long been regarded as the most serious and fatal disease threatening humans worldwide. The increase in its morbidity and mortality has brought enormous risks and burdens to healthcare systems around the world [1]. Despite the efforts of medical staff to prevent, diagnose, and treat different types of cardiovascular disease, the number of deaths from cardiovascular disease continues to increase every year, reaching 18.6 million by 2019 [2]. The World Health Organization estimates that deaths from cardiovascular disease in 2020 accounted for approximately 32% of total deaths worldwide [3]. According to the NHANES report, the prevalence of cardiovascular disease among adults over 20 years old from 2013 to 2016 was 48%, and the prevalence increased with age [4].
Coronary artery disease (CAD) is one of the most common clinical cardiovascular diseases. Clinically, Coronary Angiography (CAG) is mainly used to determine the location and extent of arterial stenosis. CAG obtains coronary artery images by X-ray after direct injection of contrast medium into the femoral artery. The examination is time-consuming, expensive, and traumatic, and it has high technical and equipment requirements [5]. Coronary Computed Tomography Angiography (CTA) is an emerging, non-invasive examination technique that obtains accurate and clear images of the coronary arteries by intravenous injection of contrast medium followed by computer-reconstructed spiral CT scanning. However, the patient's respiration, heart rate, cardiac function, and other factors can affect the imaging quality, which makes this method inferior to CAG [6]. To avoid the harm of CAG to patients and the limitations imposed by patient factors on CTA, researchers have widely applied machine learning and data mining techniques to diagnose CAD [7].
This research builds models for diagnosing stenosis of the left anterior descending (LAD), left circumflex (LCX), and right coronary artery (RCA) based on a swarm intelligence optimization algorithm and machine learning techniques. All the research in this paper is done on the extension of the Z-Alizadeh Sani dataset, which is derived from Mendeley Data [8]. Diagnosing the stenosis of individual arteries reveals the severity of the patient's condition, which can assist physicians in choosing treatments appropriate to different degrees of disease. Feature selection based on meta-heuristic optimizers is a class of algorithms proposed in recent years that can effectively solve global optimization problems and avoid falling into local optima [9]. For feature selection, we first apply a filtering method to delete the features with a variance of 0. Then we use the K-Nearest Neighbors (KNN)-based Whale Optimization Algorithm (WOA) to select feature subsets, using the classification accuracy of KNN and the number of features to guarantee the quality of the selected subsets [10]. In this study, a two-layer stacking model is established to blend the results of individual and ensemble classifiers. The four best-performing classifiers are selected from the KNN, Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), Gradient Boosting Decision Tree (GBDT), Extreme Gradient Boosting (XGBoost), and Adaptive Boosting (AdaBoost) algorithms as primary learners, and Logistic Regression (LR) is applied as the secondary learner to reduce the complexity of the model [11]. We adopt accuracy, recall, precision, F1-score, and AUC value as the evaluation metrics of the model.
The rest of the paper consists of the following contents: Section 2 provides an overview of relevant studies in previous literature; Section 3 introduces the feature selection methods and machine learning algorithms. Section 4 discusses the results of feature selection and the classification performance of the proposed method. Section 5 summarizes this paper and looks forward to future study.
2. Relevant studies
In this section, we briefly discuss some relevant studies that use machine learning algorithms, data mining techniques, and improved methods to diagnose and predict diseases.
2.1. Feature selection methods
The methods and results of feature selection determine the classification performance of the model to some extent. In [12,13,17,18,19,20], Roohallah Alizadehsani and Zeinab Arabasadi et al. adopted the weight by SVM method to complete feature selection. This method uses the normal vector coefficients of linear support vector machine as feature weights. In [14,15], Roohallah Alizadehsani et al. used information gain to measure the importance of features, and selected features with information gain higher than a certain value as feature subset. In [16], Roohallah Alizadehsani et al. measured feature importance by calculating the Gini index of each feature. In [21], Roohallah Alizadehsani et al. proposed the assurance feature selection method. This method measured the importance of a feature by calculating the ratio of the number of patients associated with a feature to the total number of patients. In [22], Roohallah Alizadehsani et al. applied Gini index and principal component analysis (PCA) to calculate the weights of features, and determined the threshold of the weights for feature selection through experiments. In the above studies, researchers implemented feature selection by calculating and evaluating the importance of single features. These methods are computationally fast and easy to implement, focusing on the ability to select features that have a great impact on disease classification.
Metaheuristic optimization algorithms (MOAs) mainly simulate natural and human intelligence to find optimal solutions [23]. MOAs can be divided into four main categories: evolutionary, swarm intelligence, human-based, and physics- and chemistry-based algorithms. Among them, the Genetic Algorithm (GA) [24], Particle Swarm Optimization (PSO) [25], Sine Cosine Algorithm (SCA) [26], Moth-Flame Optimization (MFO) [27], WOA [28], Grey Wolf Optimizer (GWO) [29] and other algorithms are widely used for feature selection.
Moloud Abdar et al. [30] used three types of SVMs to establish models for CAD diagnosis, and compared the performance of GA and PSO for feature selection and model parameter optimization in parallel. This method can simultaneously select the optimal feature subset and the model's parameter combination. Bayu Adhi Tama et al. [5] combined Correlation-based Feature Selection (CFS) and Credal Decision Tree (CDT)-based BPSO to identify important features; CFS can identify features that are unimportant or unnecessary for classification. Shafaq Abbas et al. [31] applied Extremely Randomized Tree (ERT)-based WOA to conduct feature selection and classification on a breast cancer dataset. Hoda Zamani et al. [32] proposed the FSWOA algorithm for feature selection, which achieved effective dimensionality reduction on medical datasets. In the above studies, researchers used different machine learning algorithms as wrappers combined with MOAs to achieve feature selection for disease diagnosis problems. By preliminarily filtering redundant features in the dataset, the initial population of the MOA can be optimized and the efficiency of feature selection improved.
Since most metaheuristic algorithms were proposed to solve continuous problems, researchers have used transfer functions to convert each dimension of the solution vector into binary form for feature selection, especially for medical data [33]. E. Emary et al. [34] proposed two binary-conversion methods for the GWO algorithm; the second uses an S-shaped transfer function to convert the updated grey wolf position vector into binary form. Shokooh Taghian et al. [35] proposed the SBSCA and VBSCA methods for feature selection, using S-shaped and V-shaped transfer functions to binarize SCA. Mohammad H. Nadimi-Shahraki et al. [23] improved the MFO algorithm using S-shaped, V-shaped and U-shaped transfer functions for feature selection on medical datasets.
2.2. Machine learning methods
In previous studies, researchers have applied a variety of basic and ensemble classification algorithms to establish CAD diagnostic models. In [12,13], Roohallah Alizadehsani et al. introduced a cost-sensitive algorithm into model construction, and combined the 10-fold cross-validation (cv) method to evaluate the classification performance of the Sequential Minimal Optimization (SMO) algorithm for CAD. In [15], Roohallah Alizadehsani et al. used an ensemble learning method to combine the classification results of the SMO algorithm and Naïve Bayes (NB) to diagnose CAD. In [16], the researchers used the bagging algorithm to obtain high accuracy in the diagnosis of LAD. In [18,20,21], Roohallah Alizadehsani et al. used SVM to establish diagnostic models for CAD and main coronary artery stenosis. Zeinab Arabasadi et al. [19] applied a neural network to establish a CAD diagnostic model, and adjusted its weights through GA to obtain ideal classification performance. In [22], an improved SVM algorithm was used to diagnose LAD, LCX and RCA stenosis; the researchers combined the distance between the sample and the separating hyperplane with the accuracy of the classifier to improve the model's performance. Md Mamun Ali et al. [36] applied KNN, DT, and RF to establish disease diagnosis models. Bayu Adhi Tama et al. [5] built a two-layer stacking model to diagnose CAD: RF, GBDT, and XGBoost were used to obtain classification results in the first layer, and a Generalized Linear Model (GLM) in the second layer generated the final predictions.
According to the above summary, we can find that there are few studies on the diagnosis of each main coronary artery stenosis. In this study, we will apply machine learning and data mining algorithms to diagnose stenosis of each main coronary artery. We will divide the training set and test set at a ratio of 9: 1, combine randomized search and 10-fold cv to train the model on the training set, and then use the trained model to make predictions on the test set. We apply filtering and KNN-based WOA to select the optimal feature subset for each main coronary artery and then build a two-layer stacking model based on the selected feature subset. At last, we compare the performance achieved by the proposed method with the classification performance obtained in the existing literature.
3. Materials and methods
3.1. Dataset
The extension of the Z-Alizadeh Sani dataset contains the clinical records of 303 patients, each with 55 features [8]. The numbers of patients with LAD, LCX, and RCA stenosis were 160, 109, and 101, respectively, and a total of 216 people were diagnosed with CAD. The features can be divided into four fields: demographic features, symptoms and physical examination, electrocardiography (ECG), and laboratory and echocardiography [13]. Table 1 shows the features in the extension of the Z-Alizadeh Sani dataset.
Table 1.
Features and types of the extension of Z-Alizadeh Sani dataset.
Feature type | Feature names | Count
Continuous | Age, Weight, Height, Body Mass Index (BMI), Blood Pressure (BP), Pulse Rate (PR), Fasting Blood Sugar (FBS), Creatine (Cr), Triglyceride (TG), Low Density Lipoprotein (LDL), High Density Lipoprotein (HDL), Blood Urea Nitrogen (BUN), Erythrocyte Sedimentation Rate (ESR), Hemoglobin (HB), Potassium (K), Sodium (Na), White Blood Cell (WBC), Lymphocyte (Lymph), Neutrophil (Neut), Platelet (PLT), Ejection Fraction (EF-TTE) | 21
Binary | Sex, Diabetes Mellitus (DM), Hyper Tension (HTN), Current Smoker, EX-Smoker, Family History (FH), Obesity, Chronic Renal Failure (CRF), Cerebrovascular Accident (CVA), Airway disease, Thyroid Disease, Congestive Heart Failure (CHF), Dyslipidemia (DLP), Edema, Weak Peripheral Pulse, Lung Rales, Systolic Murmur, Diastolic Murmur, Typical Chest Pain (Typical CP), Dyspnea, Atypical, Nonanginal Chest Pain (Nonanginal CP), Exertional Chest Pain (Exertional CP), Low Threshold Angina (LowTH Ang), Q Wave, St Elevation, St Depression, T inversion, Left Ventricular Hypertrophy (LVH), Poor R Progression | 30
Categorical | Function Class, Bundle Branch Block (BBB), Region with Regional wall motion abnormality (Region RWMA), Valvular Heart Disease (VHD) | 4
The data mining process starts from the preprocessing stage, followed by feature engineering, and finally uses machine learning algorithms to establish models. The algorithm flow of this research is shown in Figure 1.
In one-hot encoding, each category of features is represented by a vector whose length is equal to the number of categories. The i-th vector only takes the value of 1 at the i-th component, and the rest are all 0. One-hot encoding of features can extend the value of discrete features to Euclidean space, making the distance calculation between features more reasonable.
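As a small illustration of this scheme (using pandas and a toy column named after the dataset's BBB feature; the rows are invented, not taken from the dataset), one-hot encoding expands a categorical column into one binary column per category:

```python
import pandas as pd

# Toy frame standing in for the categorical feature BBB; the values are
# illustrative placeholders, not rows of the actual dataset.
df = pd.DataFrame({"BBB": ["N", "LBBB", "RBBB", "N"]})

# One-hot encode: one binary column per category, with a 1 only at the
# component matching that row's category.
encoded = pd.get_dummies(df, columns=["BBB"], prefix="BBB")
print(sorted(encoded.columns))  # ['BBB_LBBB', 'BBB_N', 'BBB_RBBB']
```

Each resulting row contains exactly one 1, so the encoded categories are equidistant in Euclidean space.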
Standardization scales the values of a numerical feature column to zero mean and unit variance. The standardization formula is shown in Eq (1).
S(x) = (x − x̄) / σ
(1)
Among them, x is an instance in the n-dimensional feature space, n is the number of features, and x̄ and σ represent the mean and standard deviation of each feature [37].
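A minimal NumPy sketch of Eq (1); note that the training-set mean and standard deviation are reused to standardize the test split, as described later for the experiments (the arrays here are toy values):

```python
import numpy as np

def standardize(train, test):
    """Scale each column to zero mean / unit variance using *training*
    statistics, as in Eq (1); the same mean/std are reused for the test split."""
    mu = train.mean(axis=0)
    sigma = train.std(axis=0)
    return (train - mu) / sigma, (test - mu) / sigma

train = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
test = np.array([[3.0, 20.0]])
train_s, test_s = standardize(train, test)
print(train_s.mean(axis=0))  # ~[0, 0]
```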
3.4. Performance evaluation measures
In this research, we use accuracy, precision, recall, F1-score, and AUC value to evaluate the classification performance of proposed model. The calculation formulas are shown in Eqs (2)–(5) [38].
Accuracy = (TP + TN) / (TP + FN + FP + TN)
(2)
Precision = TP / (TP + FP)
(3)
Recall = TP / (TP + FN)
(4)
F1-score = 2 × Precision × Recall / (Precision + Recall)
(5)
AUC (area under the curve) is defined as the area under the ROC curve; it is at most 1, and for a useful classifier it is usually greater than 0.5. The larger the AUC value, the better the classification performance of the classifier.
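These metrics map directly onto scikit-learn's implementations; the labels, predictions, and scores below are toy values chosen only to illustrate the formulas in Eqs (2)–(5):

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

# Toy ground truth, hard predictions, and probability scores.
y_true  = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred  = [1, 0, 1, 0, 0, 1, 1, 0]
y_score = [0.9, 0.2, 0.8, 0.4, 0.1, 0.6, 0.7, 0.3]

print(accuracy_score(y_true, y_pred))   # Eq (2): (TP+TN)/(TP+FN+FP+TN) = 0.75
print(precision_score(y_true, y_pred))  # Eq (3): TP/(TP+FP) = 0.75
print(recall_score(y_true, y_pred))     # Eq (4): TP/(TP+FN) = 0.75
print(f1_score(y_true, y_pred))         # Eq (5): 0.75
print(roc_auc_score(y_true, y_score))   # area under the ROC curve = 0.9375
```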
3.5. Whale Optimization Algorithm (WOA)
WOA is a swarm intelligence meta-heuristic global optimization algorithm proposed by Mirjalili and Lewis [28]. The algorithm is inspired by the bubble-net foraging behavior of humpback whales and finds the optimal solution by simulating this unique behavior. Humpback whales hunt for food by continuously shrinking the encirclement, spirally updating their positions, and hunting randomly [39].
The algorithm mainly includes two stages: first, achieve the encirclement of the prey, and update the spiral position (also known as hunting behavior); second, search for the prey randomly [39]. Next, we will introduce each stage in detail:
1) Surround the prey. Humpback whales can identify the location of their prey and circle around them. In the initial stage of the algorithm, since we don't know the location of the optimal solution in the search space, the WOA will assume that the best candidate solution currently obtained is the target solution or is close to the optimal solution. After defining the best candidate solution, the whales will attempt to move from other candidate positions to the best position and update their positions. This process is represented by Eq (6):
→P(t+1)=→P∗(t)−→A⋅|→C⋅→P∗(t)−→P(t)|
(6)
Among them, t is the current iteration number, →P∗(t) is the position vector of the best solution obtained so far, →P(t) is the current position vector, and →A and →C are coefficient vectors calculated by Eqs (7) and (8):
→A=2→a⋅→r−→a
(7)
→C=2⋅→r
(8)
In the above equations, →a is decreased linearly from 2 to 0 in the iterative process, and →r is a random vector in the range of [0, 1].
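A quick NumPy sketch of how the coefficient vectors of Eqs (7) and (8) behave over the iterations; the dimension and iteration count are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
dim, T = 5, 100  # search-space dimension and iteration budget (arbitrary)

for t in range(T):
    a = 2 - 2 * t / T          # a decreases linearly from 2 toward 0
    r = rng.random(dim)        # random vector in [0, 1)
    A = 2 * a * r - a          # Eq (7): each component lies in [-a, a)
    C = 2 * rng.random(dim)    # Eq (8): each component lies in [0, 2)
    assert np.all(np.abs(A) <= a + 1e-12)

# As a shrinks, |A| shrinks with it, shifting WOA from exploration (|A| >= 1)
# toward exploitation (|A| < 1).
print(A.shape, C.shape)
```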
2) Hunting behavior. Humpback whales hunt by swimming towards their prey in a spiral motion. The mathematical model of hunting behavior is as follows:
→P(t+1)=|→P∗(t)−→P(t)|⋅e^(bl)⋅cos(2πl)+→P∗(t)
(9)
Among them, b is a constant that defines the shape of the logarithmic spiral, and l is a random number in the range [-1, 1]. During the hunting process, each humpback whale randomly chooses either to shrink the encirclement of the prey or to spiral toward it, with a probability of 50% for each behavior. The researchers simulated this choice through the following mathematical model:
→P(t+1) = →P∗(t)−→A⋅|→C⋅→P∗(t)−→P(t)|, if p < 0.5; →P(t+1) = |→P∗(t)−→P(t)|⋅e^(bl)⋅cos(2πl)+→P∗(t), if p ≥ 0.5
(10)
where p is a random number in the range [0, 1].
3) Search for prey. The algorithm searches for prey according to the value of →A. When |→A|>1, the algorithm randomly selects a search individual and updates the positions of the other individuals according to its location, forcing the whales to deviate from the prey; this allows more suitable prey to be found and gives the WOA its global search ability. When |→A|<1, the whales attack the prey. The mathematical model is shown in Eq (11):
→P(t+1)=→Prand−→A⋅|→C⋅→Prand−→P(t)|
(11)
Among them, →Prand is the position vector of the whale randomly selected from the population.
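The three update rules of Eqs (6), (9) and (11) can be sketched for a single whale as follows. This is a simplified componentwise reading of the vector equations (the |A| test is applied over the whole vector), not the authors' implementation:

```python
import numpy as np

def update_position(P, P_best, population, a, b=1.0, rng=np.random.default_rng()):
    """One WOA position update for a single whale, following Eqs (6), (9), (11):
    with probability 0.5 spiral toward the best solution (Eq 9); otherwise
    encircle the best (Eq 6) if |A| < 1, or a random whale (Eq 11) otherwise."""
    r = rng.random(P.shape)
    A = 2 * a * r - a                  # Eq (7)
    C = 2 * rng.random(P.shape)        # Eq (8)
    p = rng.random()
    if p >= 0.5:                       # hunting: logarithmic spiral, Eq (9)
        l = rng.uniform(-1, 1)
        D = np.abs(P_best - P)
        return D * np.exp(b * l) * np.cos(2 * np.pi * l) + P_best
    if np.all(np.abs(A) < 1):          # exploitation: encircle the best, Eq (6)
        return P_best - A * np.abs(C * P_best - P)
    P_rand = population[rng.integers(len(population))]
    return P_rand - A * np.abs(C * P_rand - P)   # exploration, Eq (11)

rng = np.random.default_rng(0)
population = rng.random((6, 4))        # 6 whales in a 4-dimensional space
new_pos = update_position(population[0], population[1], population, a=1.5, rng=rng)
print(new_pos.shape)                   # (4,)
```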
In this study, we apply the KNN-based WOA for feature selection. WOA is used to adaptively search for the optimal feature subset to maximize classification accuracy, and KNN is used to ensure the quality of the selected feature subset. In WOA, the whale takes any point in space as a starting point, and continuously adjusts its position to the best candidate solution. Each solution obtained by this algorithm is a continuous vector of the same dimension, bounded in [0, 1] [39]. The function of feature selection can be achieved by setting a threshold to perform binary-conversion on the solution vector. In this study, we set the threshold to 0.5, and the value in the solution is 1 when it is greater than 0.5, and 0 when it is less than 0.5. The length of each solution is M and consists of 0 and 1, where M is the total number of features, 1 means that the feature at the corresponding position is selected, and 0 means that the feature is abandoned. Multiple solution vectors are obtained by changing the initial population size and the number of iterations. The quality of the solution is evaluated by the fitness function. Eq (12) shows the fitness function used in this paper.
f = α⋅E + (1−α)⋅m/M
(12)
where f is the fitness of a given solution vector of size M, m is the number of selected features, E is the classification error rate of the classifier, and α is a constant that balances the error rate of the classifier with the number of selected features [10]. The smaller the fitness value, the better the performance of the feature, and the closer to the optimal solution. In this research, E is the classification error rate of KNN, and α is set to 0.99. The pseudocode of WOA feature selection algorithm is shown in Algorithm 1.
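A minimal sketch of the 0.5-threshold binary conversion and the fitness function of Eq (12); the example solution vector and error rate are invented:

```python
import numpy as np

def binarize(solution, threshold=0.5):
    """Map a continuous WOA solution vector in [0, 1] to a 0/1 feature mask."""
    return (np.asarray(solution) > threshold).astype(int)

def fitness(error_rate, mask, alpha=0.99):
    """Eq (12): f = alpha * E + (1 - alpha) * m / M, trading off the classifier
    error rate E against the fraction of selected features m / M (lower is better)."""
    m, M = mask.sum(), mask.size
    return alpha * error_rate + (1 - alpha) * m / M

mask = binarize([0.7, 0.2, 0.9, 0.4])
print(mask)                   # [1 0 1 0]
print(fitness(0.1, mask))     # 0.99*0.1 + 0.01*(2/4) = 0.104
```

In the paper, E would be the KNN classification error rate on the validation data and alpha = 0.99, so accuracy dominates the trade-off.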
Algorithm 1. Pseudocode of WOA feature selection algorithm.
Input: Number of iterations (T), size of initial population (N).
Output: The global optimal position vector P*.
1  Initialize the population Pi (i = 1, 2, 3, …, N).
2  Binary-conversion.
3  Compute the fitness of each solution.
4  Set P* as the best solution.
5  while (t < T) do
6    for (each solution) do
7      Update a, A, C, l and p.
8      if (p < 0.5) then
9        if (|A| < 1) then
10         Update the whale position by Eq (6).
11       else
12         if (|A| ≥ 1) then
13           Select a random whale Prand.
14           Update the whale position by Eq (11).
15         end
16       end
17     else
18       if (p ≥ 0.5) then
19         Update the whale position by Eq (9).
20       end
21     end
22   end
23   Check if any search agent goes beyond the search space and amend it.
24   Binary-conversion.
25   Compute the fitness of each search agent.
26   Update P* if there is a better solution.
27   t = t + 1
28 end
29 return P*
In this study, we construct a two-layer stacking model for diagnosing each main coronary artery stenosis. The primary learners in this model are selected from KNN, SVM, DT, RF, GBDT, XGBoost, and AdaBoost, and LR is used as a secondary learner to blend the classification results of multiple primary learners to obtain the final prediction results.
LR is a generalized linear model used to solve binary classification problems. The output value of the linear model is processed by the sigmoid function and positioned between (0, 1) for the task of binary classification [36]. KNN is a supervised algorithm, and the principle of KNN is that when predicting a new sample, it can judge which category the sample belongs to according to the category of the k points closest to it [40]. SVM is a two-class model with superior performance and flexibility, which can minimize both empirical and structural risks. For a sample set in a finite-dimensional space, SVM performs classification by mapping the sample set from the original feature space to a high-dimensional space [41]. DT is a non-parametric supervised learning method, and the generation of DT is to continuously select the optimal features to divide the training set. ID3, C4.5, and CART are the three main DT algorithms [42]. RF is a special bagging method [43]. For each training set, a decision tree is constructed. When nodes are selected for feature splitting, some features are randomly selected from all the features, and the optimal solution is found from the selected features and applied to node splitting. GBDT uses the addition model and forward stepwise algorithm for greedy learning, and learns a CART tree in each iteration to fit the residuals between the predicted results of the previous (t-1) trees and the real values of the training samples [44]. XGBoost is an optimized distributed gradient boosting library [45]. XGBoost supports multiple types of base classifiers. When using CART as a base classifier, XGBoost improves its generalization ability by adding a regular term to control the complexity of the model. AdaBoost is an iterative algorithm implemented by changing the distribution of dataset. 
The weight of the sample incorrectly classified by the previous classifier in the training set will increase, and the weight of the sample correctly classified will decrease. Then the new dataset with modified weight is passed to the next classifier for training. At last, the algorithm combines the classifiers obtained each time as the final decision classifier [46].
Stacking is an ensemble learning algorithm that learns how to best combine the prediction results from multiple well-performing machine learning models. In the stacking model, we call the base learners the primary learners and the learner used for blending is called the secondary learner or meta-learner. Specifically, the original dataset is divided into several subsets, which are input into each primary learner of the first layer and predicted by k-fold cv. Then, the output of each primary learner in the first layer is taken as the input value of the secondary learner in the second layer. The final prediction result is obtained by fitting the trained model to the test set [47]. The algorithm flow of the stacking model is shown in Figure 2.
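As a rough sketch, this two-layer scheme matches scikit-learn's StackingClassifier. The synthetic data, learner settings, and 5-fold cv below are illustrative stand-ins, not the paper's tuned configuration:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (StackingClassifier, GradientBoostingClassifier,
                              AdaBoostClassifier)
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic data standing in for the 17 selected coronary-artery features.
X, y = make_classification(n_samples=300, n_features=17, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.1, random_state=0)

# Two-layer stacking: four primary learners whose k-fold cv predictions become
# the inputs of the LR secondary (meta) learner.
stack = StackingClassifier(
    estimators=[("knn", KNeighborsClassifier()),
                ("svm", SVC(probability=True)),
                ("gbdt", GradientBoostingClassifier()),
                ("ada", AdaBoostClassifier())],
    final_estimator=LogisticRegression(),
    cv=5)
stack.fit(X_tr, y_tr)
print(stack.score(X_te, y_te))
```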
4. Experimental results

In this section, we systematically describe the implementation of the proposed method on the extension of the Z-Alizadeh Sani dataset and report the results. All experiments in this research are carried out on a Windows machine with 8 GB of memory and an Intel(R) Core(TM) i5-7200U CPU @ 2.50 GHz. Python 3.8 is used in the Jupyter Notebook IDE to implement the entire experiment.
4.1. Data preprocessing
First, for the processing of categorical features, one-hot encoding is performed on the feature Bundle Branch Block (BBB) to obtain three binary features: BBB_LBBB, BBB_RBBB and BBB_N. For the feature Valvular Heart Disease (VHD), its values "Normal", "Mild", "Moderate", and "Severe" are denoted as 0, 1, 2 and 3, respectively. Second, the features Function Class and Region with Regional wall motion abnormality (Region RWMA) are discretized according to the ranges provided in the Braunwald heart book: when the value of the feature is zero it is recorded as "Normal", and otherwise as "High" [48]. Then, all categorical features with two values are transformed into numerical values. Next, the dataset is divided in a ratio of 9:1 to obtain training and test data for LAD, LCX and RCA diagnosis, respectively. The training data is used to develop the model, while the test data is used to evaluate its classification performance. Finally, the training set is standardized, and the mean and variance of the training set are used to standardize the test set. In the three labels of LAD, LCX and RCA, "Stenotic" is marked as 1 and "Normal" as 0.
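A compact sketch of these encoding steps using pandas; the rows are toy values (the real dataset has 303 patients and 55 features), and only three of the categorical features are shown:

```python
import pandas as pd

# Illustrative rows standing in for three of the dataset's categorical features.
df = pd.DataFrame({
    "BBB": ["N", "LBBB", "RBBB"],
    "VHD": ["Normal", "Mild", "Severe"],
    "Function Class": [0, 2, 0],
})

df = pd.get_dummies(df, columns=["BBB"], prefix="BBB")          # one-hot BBB
df["VHD"] = df["VHD"].map({"Normal": 0, "Mild": 1,
                           "Moderate": 2, "Severe": 3})         # ordinal VHD
df["Function Class"] = (df["Function Class"] != 0).astype(int)  # Normal=0 / High=1
print(df)
```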
4.2. Results of feature selection
In this study, we first use the filtering method to delete the feature Exertional CP with the variance of 0 and then use KNN-based WOA to select the feature subsets for diagnosing each main coronary artery. The parameters of feature selection algorithm are shown in Table 2.
Table 2.
Parameters of WOA feature selection algorithm.
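The zero-variance filtering step applied before WOA can be sketched with scikit-learn's VarianceThreshold; the toy matrix's constant column stands in for the dropped feature Exertional CP:

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

# Third column is constant (variance 0), standing in for Exertional CP.
X = np.array([[1.0, 0.2, 5.0],
              [2.0, 0.4, 5.0],
              [3.0, 0.1, 5.0]])

selector = VarianceThreshold(threshold=0.0)  # drops zero-variance features
X_reduced = selector.fit_transform(X)
print(X_reduced.shape)  # (3, 2)
```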
We first compare the results of feature selection by KNN-based WOA, BPSO, GA and BGWO; the number of iterations and the size of the initial population of each algorithm are the same as for WOA, as shown in Table 2. Each method is run 60 times. Table 3 shows the average fitness of the feature subsets obtained by these four algorithms, as well as the average classification accuracy and the average AUC value on the validation set. It can be seen from Table 3 that the KNN-based WOA method has the best performance, and the Friedman test results show that there are significant differences among the feature selection methods. Bold font indicates the best result in the following tables.
Table 3.
Comparison of KNN-based WOA, BPSO, GA, and BGWO feature selection methods.
Next, we compare WOA feature selection methods based on different wrappers. We choose SVM, DT, and RF to compare with KNN, and run each method 60 times to generate multiple feature subsets. The average performance of these four algorithms on the validation set and the Friedman test results are given in Table 4. We can conclude from Table 4 that the KNN-based WOA is the most suitable for the feature selection problems in this study: the WOA has low complexity, fast convergence, and good optimization performance, and KNN has low computational complexity and is easy to repeat.
Table 4.
Comparison of WOA feature selection methods based on different wrappers.
In the feature selection process, we obtain multiple feature subsets by setting different numbers of iterations and initial population sizes for WOA. We run each parameter combination 20 times to find the corresponding optimal feature subset, so for all parameter combinations we run a total of 120 times. The number of features, classification accuracy, and AUC values on the test sets of the optimal feature subsets obtained under the different parameter combinations are given in Table 5.
Table 5.
Classification performance of feature subsets obtained under different parameter combinations.
For the diagnosis of the three main coronary artery stenoses, the best-performing feature subsets are obtained under the parameter combinations (N = 30, T = 200), (N = 20, T = 400), and (N = 20, T = 300), respectively. Through the KNN-based WOA feature selection method, 17 features are obtained for the diagnosis of LAD, LCX and RCA. The results of feature selection and the Pearson correlation coefficients between features and labels are shown in Table 6.
Table 6.
Feature selection results and Pearson correlation coefficient of LAD, LCX and RCA.
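The Pearson correlation coefficients reported in Table 6 can be computed directly from the feature columns and the stenosis labels. A sketch with toy data (the Age and BMI values below are made up for illustration):

```python
import numpy as np

def pearson_corr(x, y):
    """Pearson correlation coefficient between a feature column and labels."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xm, ym = x - x.mean(), y - y.mean()
    return float((xm * ym).sum() / np.sqrt((xm ** 2).sum() * (ym ** 2).sum()))

# Toy data: two feature columns vs. a binary stenosis label (illustrative).
age   = [45, 60, 52, 70, 38, 65]
bmi   = [22.0, 28.5, 24.1, 30.2, 21.5, 27.8]
label = [0, 1, 0, 1, 0, 1]

print(f"corr(Age, label) = {pearson_corr(age, label):.3f}")
print(f"corr(BMI, label) = {pearson_corr(bmi, label):.3f}")
```

With binary labels this reduces to the point-biserial correlation, which is why it is a reasonable relevance measure for the selected features.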
In this section, we first train multiple base classifiers, then evaluate and compare their performance. The hyperparameters of each classifier are tuned on the training set by randomized search combined with 10-fold cross-validation, using the average AUC value as the tuning metric to obtain the optimal hyperparameter combination. We compare the KNN, SVM, DT, RF, GBDT, XGBoost, and AdaBoost algorithms and select the four best classifiers as primary learners for the stacking models. The performance of each algorithm on the test set of each main coronary artery is shown in Table 7.
Table 7.
Performance of each algorithm in LAD, LCX, and RCA diagnosis.
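The tuning procedure described above (randomized search scored by average AUC over 10-fold cross-validation) maps directly onto scikit-learn. A minimal sketch using a synthetic dataset and an illustrative random forest grid; the actual parameter grids are not given in the text and are assumed here:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in for the 17-feature training set.
X, y = make_classification(n_samples=300, n_features=17, random_state=0)

# Randomized search over a hypothetical RF grid, scored by ROC-AUC
# averaged over 10-fold cross-validation, as described in the text.
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={"n_estimators": [50, 100, 200],
                         "max_depth": [3, 5, None]},
    n_iter=5, cv=10, scoring="roc_auc", random_state=0)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

The same pattern applies to each of the seven candidate classifiers; only the estimator and its parameter grid change.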
According to the "strong-strong combination" strategy of the stacking model, we choose classifiers with good performance while maintaining the diversity of the primary learners. Based on the performance indicators in Table 7, we choose SVM, GBDT, XGBoost, and AdaBoost as primary learners for LAD diagnosis and KNN, SVM, GBDT, and AdaBoost for both LCX and RCA diagnosis, with LR as the secondary learner in all three models. For these three stacking models, we compare the performance of the stacking algorithm with 3-fold, 5-fold, and 8-fold cross-validation to select the best number of folds. Moreover, we generate three sets of 10 random numbers between 0 and 1000 to control the random state of the stacking algorithm in each model, and average the evaluation indicators over the 10 random states to test the stability of the models.
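The stacking scheme for LCX and RCA (KNN, SVM, GBDT, and AdaBoost as primary learners, LR as the secondary learner, with internal cross-validation) can be sketched with scikit-learn's StackingClassifier. Default hyperparameters and the synthetic data are placeholders, not the paper's tuned settings:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier,
                              GradientBoostingClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=17, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Primary learners produce out-of-fold predictions (cv=5) that feed
# the LR meta-learner, matching the 5-fold setting chosen in Table 8.
stack = StackingClassifier(
    estimators=[("knn", KNeighborsClassifier()),
                ("svm", SVC(probability=True, random_state=0)),
                ("gbdt", GradientBoostingClassifier(random_state=0)),
                ("ada", AdaBoostClassifier(random_state=0))],
    final_estimator=LogisticRegression(),
    cv=5)
stack.fit(X_tr, y_tr)
print(f"test accuracy = {stack.score(X_te, y_te):.3f}")
```

Averaging over random states, as the text describes, amounts to refitting this model with different `random_state` values and averaging the resulting metrics.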
Table 8 shows the performance of the models in diagnosing the main coronary arteries when different numbers of cross-validation folds are set for the stacking algorithm. From the data in the table, we conclude that 5-fold cross-validation gives the stacking models the best classification performance.
Table 8.
The performance of the stacking model under different cross-validation folds.
In Tables 9–11, we compare the classification performance of the proposed stacking models with that of each primary learner on the test set. The stacking models achieve the highest accuracy and F1-score across all prediction results, and they also keep recall, precision, and AUC values within a good range.
Table 9.
Performance of stacking model and primary learners in LAD diagnosis.
Table 9 shows that, compared with the other machine learning algorithms, the stacking model achieves the highest accuracy (89.68%), recall (100%), and F1-score (91.42%) in the diagnosis of LAD stenosis. The recall of 100% means the model identifies all patients with artery stenosis in the LAD test set, minimizing the possibility of missed diagnosis in patients with CAD, while the high accuracy and F1-score indicate that the model also identifies patients without CAD well. In Table 10, the stacking model achieves the highest accuracy (88.71%), precision (83.03%), F1-score (82.41%), and AUC value (0.9019) in the diagnosis of LCX, indicating that it can accurately distinguish patients with LCX stenosis from those without. In Table 11, the stacking model obtains the highest accuracy (85.81%) and F1-score (82.44%) in the diagnosis of RCA, and its AUC value (0.9252) remains at a good level, again showing that the model can accurately distinguish patients with and without RCA stenosis and effectively reduce missed diagnoses in patients with CAD. From the above analysis, the classification performance of the stacking model is better than that of the individual classifiers and of the ensemble classifiers based on bagging and boosting; the stacking model combines the advantages of each primary learner to raise the prediction performance to the highest level.
We compare the classification performance of the proposed model with that of a model built using SVM-based recursive feature elimination with cross-validation (RFE-CV) for feature selection [49]. The results in Table 12 show that the proposed method has better classification performance.
Table 12.
Comparison of the proposed method with the SVM-based RFE-CV feature selection method.
Figure 3 shows the comparison diagrams of the classification performance between the stacking model and each primary learner in diagnosing LAD, LCX and RCA stenosis.
Figure 3.
Classification performance of LAD, LCX, and RCA diagnostics using stacking model versus primary learners.
We compare the number of features and the classification accuracy of the proposed model with those of previous studies; Table 13 summarizes the differences in methods and results. The feature selection method used in this paper selects fewer features, and the classification accuracy of the proposed model is significantly better than that of other methods in diagnosing individual LAD, LCX and RCA stenosis.
Table 13.
Comparison of the proposed method with previous studies for detecting main coronary artery stenosis.
4.4. Application of the proposed model on the Cleveland dataset
We apply the proposed model to the well-known Cleveland dataset to diagnose heart disease [51]. The dataset contains 303 patient records, each with 13 features for diagnosing heart disease [52]. We select the 297 records with no missing values to build the model, of which 137 patients were diagnosed with heart disease. We preprocess the Cleveland dataset with the same procedure used for the Z-Alizadeh Sani dataset and apply the KNN-based WOA method for feature selection. Running 20 times for each parameter combination, we obtain the optimal feature subset under the combination (N = 20, T = 300). This subset contains 6 features: resting electrocardiographic results (restecg), maximum heart rate achieved (thalach), exercise-induced angina (exang), the slope of the peak exercise ST segment (slope), number of major vessels colored by fluoroscopy (ca), and thallium scan (thal). We then build a stacking model on the selected features, choosing KNN, SVM, GBDT, and AdaBoost as primary learners. The proposed method achieves a classification accuracy of 89.67% and an AUC value of 0.9129 on the Cleveland dataset. Table 14 compares the classification performance of the stacking model with that of each primary learner.
Table 14.
Performance of stacking model and primary learners on the Cleveland dataset.
Table 15 compares the performance of the proposed model with that of models from previous studies on the Cleveland dataset. The comparison shows that our method selects fewer features on the Cleveland dataset while achieving high precision and AUC in the diagnosis of heart disease.
Table 15.
Comparison of proposed method with previous studies on the Cleveland dataset.
This paper proposes new models for the diagnosis of the main coronary arteries. We use the KNN-based WOA for feature selection on the extended Z-Alizadeh Sani dataset and apply stacking models to diagnose LAD, LCX and RCA stenosis.
In the feature selection process, we first delete features with zero variance from the dataset. Then, after comparing the feature selection results of multiple meta-heuristic optimization algorithms and different wrappers, we adopt the KNN-based WOA method to select the optimal feature subsets. Using this method, we obtain three optimal feature subsets, one for each main coronary artery, each containing 17 features. The feature selection results show that Age, BMI, FBS, and HTN appear in the feature subset of every main coronary artery, indicating that these features are important indicators of CAD.
In the stacking models, we choose different primary learners for the three coronary arteries and use LR as the secondary learner. We calculate the average classification performance of the stacking algorithm over multiple random states to obtain stable results. The diagnostic accuracy of the proposed method for LAD, LCX and RCA stenosis is 89.68%, 88.71% and 85.81%, respectively. The classification performance of the stacking models is more stable than that of other machine learning algorithms. Compared with previous studies, we select relatively fewer features, and the diagnostic accuracy of the proposed model is also significantly improved. Our results show that the proposed method applies well to CAD datasets and provides a reliable and robust model for clinical diagnosis.
In future work, we intend to use an improved WOA to select even fewer features and to make accurate predictions of stenosis in each main coronary artery on larger CAD datasets.
Acknowledgments
Thanks to our families and colleagues who supported us morally.
Conflict of interest
All authors declare no conflicts of interest in this paper.
Algorithm 1.
Pseudocode of the WOA feature selection algorithm.
Input: Number of iterations (T), size of initial population (N).
Output: The global optimal position vector P*.
1  Initialize the population Pi (i = 1, 2, 3, …, N).
2  Binary conversion.
3  Compute the fitness of each solution.
4  Set P* as the best solution.
5  while (t < T) do
6    for (each solution) do
7      Update a, A, C, l and p.
8      if (p < 0.5) then
9        if (|A| < 1) then
10         Update the whale position by Eq (6).
11       else
12         if (|A| ≥ 1) then
13           Select a random whale Prand.
14           Update the whale position by Eq (11).
15         end
16       end
17     else
18       if (p ≥ 0.5) then
19         Modify the whale position by Eq (9).
20       end
21     end
22   end
23   Check whether any search agent has gone beyond the search space and amend it.
24   Binary conversion.
25   Compute the fitness of each search agent.
26   Update P* if there is a better solution.
27   t = t + 1
28 end
29 return P*
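The fitness evaluated in steps 3 and 25 is the KNN wrapper criterion: a binary mask selects features, and the mask's quality combines the KNN classification error with the fraction of selected features. The weighting below is a common choice for wrapper fitness and is an assumption, not the paper's stated formula; the synthetic data is likewise a placeholder:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the preprocessed CAD feature matrix.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

def fitness(mask, alpha=0.99):
    """Wrapper fitness of a binary feature mask: weighted sum of the KNN
    cross-validated error and the selected-feature ratio (weights assumed)."""
    if not mask.any():
        return 1.0  # worst fitness when no feature is selected
    acc = cross_val_score(KNeighborsClassifier(), X[:, mask], y, cv=5).mean()
    return alpha * (1 - acc) + (1 - alpha) * mask.sum() / mask.size

# One candidate whale position after binary conversion (step 2).
rng = np.random.default_rng(0)
mask = rng.random(X.shape[1]) < 0.5
print(f"fitness = {fitness(mask):.4f}")
```

Lower fitness is better, so minimizing it simultaneously rewards accuracy and penalizes large feature subsets, which is consistent with the small 17-feature subsets reported in Table 6.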
Numerical (21 features): Age, Weight, Height, Body Mass Index (BMI), Blood Pressure (BP), Pulse Rate (PR), Fasting Blood Sugar (FBS), Creatine (Cr), Triglyceride (TG), Low Density Lipoprotein (LDL), High Density Lipoprotein (HDL), Blood Urea Nitrogen (BUN), Erythrocyte Sedimentation Rate (ESR), Hemoglobin (HB), Potassium (K), Sodium (Na), White Blood Cell (WBC), Lymphocyte (Lymph), Neutrophil (Neut), Platelet (PLT), Ejection Fraction (EF-TTE)
Binary (30 features): Sex, Diabetes Mellitus (DM), Hyper Tension (HTN), Current Smoker, EX-Smoker, Family History (FH), Obesity, Chronic Renal Failure (CRF), Cerebrovascular Accident (CVA), Airway disease, Thyroid Disease, Congestive Heart Failure (CHF), Dyslipidemia (DLP), Edema, Weak Peripheral Pulse, Lung Rales, Systolic Murmur, Diastolic Murmur, Typical Chest Pain (Typical CP), Dyspnea, Atypical, Nonanginal Chest Pain (Nonanginal CP), Exertional Chest Pain (Exertional CP), Low Threshold Angina (LowTH Ang), Q Wave, St Elevation, St Depression, T inversion, Left Ventricular Hypertrophy (LVH), Poor R Progression
Categorical (4 features): Function Class, Bundle Branch Block (BBB), Region with Regional wall motion abnormality (Region RWMA), Valvular Heart Disease (VHD)