
Cyberattacks occur frequently, causing serious disruption to people's daily lives. In 2017, the WannaCry ransomware outbreak spread globally, hitting at least 300,000 users and causing 8 billion USD in damage [1]. In 2020, a cyberattack on Venezuela's national grid trunk line caused widespread power outages across the country [2]. In 2021, the U.S. refined-products pipeline operator Colonial Pipeline was forced to shut down its fuel network in the eastern seaboard states due to a ransomware attack [3]. As cyberattacks become more frequent, existing defenses such as firewalls, data encryption, and authentication cannot meet security requirements [4]. Therefore, intrusion detection systems have gained the attention of researchers.
Intrusion detection systems play an important role in protecting critical information infrastructure [5]. According to their detection techniques, they are categorized into signature-based intrusion detection systems (SIDS) and anomaly-based intrusion detection systems (AIDS) [6,7]. SIDS maintains an attack library of historical attack records; if the current traffic matches a record in the library, the traffic is classified as an attack. AIDS analyzes historical traffic with statistical methods to learn a logical model; if the current traffic deviates from normal traffic, it is classified as an attack. SIDS offers fast detection and a low false alarm rate, but it cannot detect unknown attacks [8]. AIDS, by contrast, can detect unknown attacks and therefore has broad application prospects. Figure 1 shows the block diagram of an intrusion detection system [9]. It consists of the following key components: (1) Information collection: network data, application logs, audit records, and other relevant information are collected from the network or hosts; the collected information is used for intrusion analysis. (2) Analysis engine: modeling or behavior matching is performed on the collected network information, which in turn forms the corresponding knowledge base; the engine alerts the network administrator if an intrusion is found, and the intrusion itself also becomes part of the collected information. (3) Knowledge base: a list of historical behaviors or trained models is stored; the knowledge base is used to analyze current traffic, but it needs to be updated regularly.
Intrusion detection can be treated as a classification problem, which has prompted researchers to adopt machine learning techniques to improve the performance of intrusion detection systems. In recent years, machine learning techniques have been applied broadly in intrusion detection and have shown encouraging results in many studies [10]. Machine learning techniques can be classified as shallow learning and deep learning [11]. Shallow learning methods, such as K-nearest neighbors [12], decision trees [13], support vector machines [14] and random forests [15], are widely used because of their strong explainability. Among deep learning methods, autoencoders [16], deep belief networks [17] and convolutional neural networks [18] have achieved great success in intrusion detection owing to their ability to extract features. Consequently, finding suitable machine learning techniques to improve the performance of intrusion detection systems has become a hot topic for researchers.
Researchers have proposed many approaches to detect intrusions based on machine learning techniques. This paper reviews related work from the perspectives of anomaly analysis and feature analysis. In anomaly analysis, Chouhan et al. [19] developed an autoencoder-based residual learning technique to enhance the classification capability of convolutional neural networks. Andresini et al. [20] combined feature selection techniques and residual learning to improve the performance of intrusion detection systems. The residual thresholds in these approaches must be set manually, so Aygun et al. [21] developed a method to determine the thresholds adaptively. Yang et al. [22] developed a method that uses a modified conditional variational autoencoder to generate attack samples for balancing the data. Min et al. [23] developed a memory-enhanced autoencoder to improve the generalizability of the model. Autoencoders are also useful for nonlinear dimensionality reduction [24,25]. In addition, to improve the performance of intrusion detection systems, some researchers have developed two-stage decision methods. Belouch et al. [26] introduced a two-stage classification model: the first stage uses a RepTree classifier to classify the traffic into normal and abnormal, and the second stage classifies the anomalies detected in the first stage to identify the attack classes. Niyaz et al. [27] proposed a two-phase intrusion detection system: the first phase uses a sparse autoencoder to extract features from the original data, and the second phase feeds the processed features into softmax regression (SM) and self-taught learning (STL) classifiers, respectively. Zhang et al. [28] applied machine learning techniques to intrusion detection in in-vehicle networks and proposed a two-stage anomaly detection framework.
In feature analysis, Gu et al. [29] used the marginal density ratio method for data enhancement to improve the performance of intrusion detection. Ieracitano et al. [30] used statistical analysis techniques to identify outliers and redundant data and, thus, remove unnecessary features. Zhang et al. [31] developed a feature fusion technique to improve model classification performance. Tree-based methods are often used for feature selection: Kasongo et al. [32] used extreme gradient boosting trees for feature selection followed by shallow methods for classification. Megantara and Ahmad [33] developed a hybrid feature analysis method that first uses a decision tree to select the important features and then uses local outlier factors to exclude outlier and anomalous features. Rashid et al. [34] used univariate techniques for feature analysis and ensemble methods for classification. Bio-inspired heuristics, such as the widely used particle swarm optimization and genetic algorithms, have also been applied to feature selection [35,36,37]. In addition, deep learning methods are often used for nonlinear feature dimensionality reduction. To address the problem that isolated points and noisy data can degrade model performance, Seo et al. [38] used a restricted Boltzmann machine to remove isolated points and noisy data from the dataset. Wuke et al. [39] proposed a combination of multilayer extreme learning machines and autoencoders to reduce the dimensionality of the data, which is then trained by the extreme learning machine. Zhao et al. [40] proposed a method combining deep belief networks and least-squares support vector machines: the deep belief network performs dimensionality reduction, and a particle swarm algorithm optimizes the parameters of the least-squares support vector machine.
Although much research has focused on intrusion detection systems, some issues still need to be addressed. One important issue is the curse of dimensionality: high-dimensional data makes it difficult for intrusion detection systems to learn effective data representations, which reduces their detection efficiency. Another problem is the increasing number of zero-day attacks [6]. New attack methods keep emerging, leaving network administrators with ever shorter response times. To address these issues, a two-stage anomaly detection framework based on LightGBM and an autoencoder is proposed in this study. The framework can detect novel attacks while improving detection efficiency. LightGBM is an ensemble method that introduces an exclusive feature bundling algorithm and a gradient-based one-sided sampling algorithm. The exclusive feature bundling (EFB) algorithm bundles features that are rarely non-zero at the same time, and the gradient-based one-sided sampling (GOSS) method reduces the number of small-gradient samples during model training. As a result, the LightGBM algorithm has less time overhead. The focal loss function increases the weight of difficult samples, which helps the model learn attack samples that are hard to classify. The autoencoder learns an implicit representation of the data in its encoding layers and reconstructs the original data in its decoding layers; exploiting the reconstruction error, the autoencoder can enhance anomaly detection. Our main innovation is therefore to replace the default cross-entropy objective of LightGBM with the focal loss function to improve the detection of attack samples. In addition, the reconstruction error of the autoencoder is used to further catch misclassified samples. For data processing, we use recursive feature elimination, a wrapper feature selection method that selects the best features based on their feature scores. Differently from existing methods, we use a two-stage decision step based on the reinforced LightGBM and the autoencoder. To the best of our knowledge, this is the first time such a method has been proposed. The proposed method has less time overhead and improves the performance of the intrusion detection system.
In our previous work [41], we used an autoencoder to fit the sampled data and a LightGBM classifier for multiclass prediction. In this work, by contrast, we utilize the autoencoder and a modified LightGBM model for anomaly detection: we modify the objective function of LightGBM and design a two-stage decision step. The main contributions of this paper are as follows:
(1) To address the curse of dimensionality, we propose a recursive feature elimination method based on LightGBM to reduce the dimensionality of the original data, which improves the detection efficiency of the intrusion detection system.
(2) To address the problem that the standard LightGBM method cannot effectively detect difficult samples, the focal loss function is introduced into LightGBM. In addition, the improved LightGBM is combined with an autoencoder to effectively respond to zero-day attacks.
(3) Finally, we conducted experiments on the NSL-KDD and UNSWNB15 datasets, comparing not only classical methods but also current state-of-the-art methods.
The remainder of this paper is structured as follows: Section 2 introduces the relevant theories. Section 3 presents our method. Section 4 provides the experimental results and discussion. Section 5 presents the conclusions and future work of this paper.
In 2017, the Microsoft team proposed the LightGBM model [42]. LightGBM has less time overhead than extreme gradient boosting (Xgboost). Xgboost uses a pre-sorting algorithm when choosing the best split node of a tree [43]; since the pre-sorting algorithm must traverse all the features, it makes the algorithm inefficient. In general, the time complexity of the Xgboost algorithm is proportional to the data volume [44], meaning that the larger the data volume, the higher the computational overhead. The LightGBM algorithm bins continuous features, assigning different values to different bins, which reduces the computational overhead of the model; this is called the histogram algorithm. In addition, to further improve training efficiency, LightGBM introduces the gradient-based one-sided sampling method and the exclusive feature bundling algorithm. The details of LightGBM are described in Algorithm 1.
Gradient-based one-sided sampling method. The gradient is a vector that points in the direction of the greatest change of a function, and its magnitude is the rate of that change. In machine learning, the size of a sample's gradient during training indicates how much that sample contributes to the final model. A sample with a large gradient shows that the model still has room to improve on it, so it is valuable for training; a sample with a small gradient indicates that the sample is already well learned and contributes less to training. It is therefore possible to keep all large-gradient samples and reduce the number of small-gradient samples. This process is called the gradient-based one-sided sampling method. Specifically, the gradient of each sample is calculated, the gradients of all samples are sorted in descending order of absolute value, the samples with large gradients are retained, and some samples with small gradients are randomly excluded.
Assume the training set has n samples, denoted as {x1, …, xn}. At each iteration, the negative gradients of the loss function with respect to the model output are denoted as {g1, …, gn}. For the gradient boosting decision tree, the information gain is calculated as follows. Let O be the training set of a node of the decision tree; then the information gain of split feature j at point d for the node is calculated as:
$V_{j|O}(d)=\frac{1}{n_O}\left(\frac{\big(\sum_{\{x_i\in O:\,x_{ij}\le d\}} g_i\big)^2}{n^j_{l|O}(d)}+\frac{\big(\sum_{\{x_i\in O:\,x_{ij}>d\}} g_i\big)^2}{n^j_{r|O}(d)}\right),$ | (1) |
where $n_O=\sum I[x_i\in O]$, $n^j_{l|O}(d)=\sum I[x_i\in O:x_{ij}\le d]$ and $n^j_{r|O}(d)=\sum I[x_i\in O:x_{ij}>d]$.
For the GOSS algorithm, let a and b be the sampling ratios of large-gradient and small-gradient instances, respectively. The instances are sorted by the absolute value of their gradients; the top a × 100% large-gradient instances are selected to form set A, and b × 100% small-gradient instances are then randomly selected from the remainder to form set B. To maintain the original sample distribution, all small-gradient samples in set B are multiplied by the coefficient (1 − a)/b. After many iterations, the final information gain is estimated over A ∪ B as:
$\tilde{V}_j(d)=\frac{1}{n}\left(\frac{\big(\sum_{x_i\in A_l} g_i+\frac{1-a}{b}\sum_{x_i\in B_l} g_i\big)^2}{n^j_l(d)}+\frac{\big(\sum_{x_i\in A_r} g_i+\frac{1-a}{b}\sum_{x_i\in B_r} g_i\big)^2}{n^j_r(d)}\right),$ | (2) |
where $A_l=\{x_i\in A:x_{ij}\le d\}$, $A_r=\{x_i\in A:x_{ij}>d\}$, $B_l=\{x_i\in B:x_{ij}\le d\}$ and $B_r=\{x_i\in B:x_{ij}>d\}$.
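To make the sampling concrete, here is a minimal NumPy sketch of the GOSS selection and the (1 − a)/b re-weighting used in Eq (2). This is our illustration rather than LightGBM's internal code; the function name `goss_sample` and its default ratios are hypothetical.

```python
import numpy as np

def goss_sample(g, a=0.2, b=0.1, seed=42):
    """Keep all large-gradient samples, subsample and re-weight the rest."""
    rng = np.random.default_rng(seed)
    n = len(g)
    top_n, rand_n = int(a * n), int(b * n)
    order = np.argsort(-np.abs(g))                 # indices by |gradient|, descending
    A = order[:top_n]                              # large-gradient set A (kept whole)
    B = rng.choice(order[top_n:], size=rand_n, replace=False)  # small-gradient set B
    weights = np.ones(n)
    weights[B] = (1 - a) / b                       # restore the original distribution
    idx = np.concatenate([A, B])
    return idx, weights[idx]

idx, w = goss_sample(np.random.randn(1000))        # toy per-sample gradients
```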
Exclusive feature bundling. GOSS reduces the number of samples, while EFB reduces the dimensionality of the features. Feature dimensionality is another important factor that affects the time overhead. EFB exploits the mutual exclusivity of features (features that are rarely non-zero at the same time) to reduce the dimensionality. Specifically, the EFB algorithm determines which features can be bundled by constructing a weighted graph whose nodes are the features of the samples and whose edge weights indicate the degree of conflict between features. Bundling is then treated as a graph coloring problem, which is solved with a greedy strategy.
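The following simplified Python sketch illustrates the bundling idea: each feature is greedily assigned to an existing bundle whenever it conflicts (is simultaneously non-zero) with that bundle in at most `max_conflict` rows. This is our illustration only and omits LightGBM's conflict-degree ordering; `greedy_feature_bundles` is a hypothetical helper.

```python
import numpy as np

def greedy_feature_bundles(X, max_conflict=0):
    """Group features that are (almost) never non-zero on the same row."""
    nonzero = X != 0
    bundles = []                                   # each bundle is a list of feature indices
    for j in range(X.shape[1]):
        for bundle in bundles:
            # rows where feature j and the bundle are both non-zero
            conflicts = np.sum(nonzero[:, j] & np.any(nonzero[:, bundle], axis=1))
            if conflicts <= max_conflict:
                bundle.append(j)
                break
        else:                                      # no compatible bundle found
            bundles.append([j])
    return bundles
```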
Algorithm 1: LightGBM |
Input: |
Training data: D={(x1,y1),(x2,y2),…,(xn,yn)},xi∈x,x⊆R,yi∈{−1,+1}; |
Loss function: L(y,θ(x)); // y is the true value and θ(x) is the predicted value |
Iterations: M; |
Big gradient data sampling ratio: a; |
Small gradient data sampling ratio: b; |
1: Use exclusive feature bundling (EFB) to combine mutually exclusive features of $x_i$, $i=\{1,\ldots,n\}$ (features that are never simultaneously non-zero);
2: Initialize the predicted values: $\theta_0(x)=\arg\min_c\sum_{i=1}^{n}L(y_i,c)$;
3: For m = 1 to M do:
4: Calculate gradient absolute values: $g_i=\left|\frac{\partial L(y_i,\theta(x_i))}{\partial\theta(x_i)}\right|_{\theta(x)=\theta_{m-1}(x)}$, $i=\{1,\ldots,n\}$;
5: Resample the dataset using gradient-based one-side sampling (GOSS): topN = a × len(D); randN = b × len(D); sorted = GetSortedIndices(abs(g)); A = sorted[1:topN]; B = RandomPick(sorted[topN:len(D)], randN); D′ = A + B;
6: Calculate the information gain $\tilde{V}_j(d)$ on D′ by Eq (2);
7: Fit a new decision tree $\theta_m(x)'$ on set D′;
8: Update $\theta_m(x)=\theta_{m-1}(x)+\theta_m(x)'$;
9: End for
10: Return $\tilde{\theta}(x)=\theta_M(x)$;
The focal loss function is derived from the cross-entropy loss function to improve the recognition of difficult samples [45,46]. The cross-entropy loss is a typical objective function that measures the closeness of the true and predicted distributions; a smaller cross-entropy indicates a better classification result. The binary cross-entropy (BCE) loss function is expressed as:
$\mathrm{BCE}=-y\log\tilde{y}-(1-y)\log(1-\tilde{y}),$ | (3) |
where $y$ and $\tilde{y}$ are the true label and the predicted label, respectively.
The focal loss function adds the modulation factors $(1-\tilde{y})^{\gamma}$ and $\tilde{y}^{\gamma}$ to the cross-entropy terms, which enables the model to assign greater learning weight to difficult samples. As such,
$\mathrm{FL}=-y(1-\tilde{y})^{\gamma}\log\tilde{y}-(1-y)\,\tilde{y}^{\gamma}\log(1-\tilde{y}),$ | (4) |
where $\gamma\in[0,5]$ is the focusing parameter; when $\gamma=0$, the focal loss reduces to the cross-entropy loss. For example, for a well-classified sample with $y=1$ and $\tilde{y}=0.9$, setting $\gamma=2$ scales the loss by $(1-0.9)^2=0.01$, down-weighting it a hundredfold relative to cross-entropy. The effect of the value of $\gamma$ on the loss is shown in Figure 2.
In addition, the focal loss function introduces a weighting factor $\alpha$, which is used to adjust the relative weight of the losses of different categories. The final focal loss function is expressed as:
$\mathrm{FL}=-\alpha\, y(1-\tilde{y})^{\gamma}\log\tilde{y}-(1-\alpha)(1-y)\,\tilde{y}^{\gamma}\log(1-\tilde{y}),$ | (5) |
where $\alpha\in[0,1]$.
When performing the binary classification task, the objective function of LightGBM defaults to the binary cross-entropy loss. As shown in Figure 2, the focal loss assigns smaller losses to well-classified samples than the binary cross-entropy loss does. In this paper, we adopt the focal loss function as the objective function of LightGBM to enhance the learning of difficult samples.
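A minimal sketch of this idea is shown below: the focal loss of Eq (5) is wrapped as a custom LightGBM objective, which must return the gradient and Hessian with respect to the raw (pre-sigmoid) scores. We use central differences for the derivatives for clarity (an analytic form would be faster); the wiring assumes the scikit-learn API of LightGBM, and `make_focal_loss` is a hypothetical helper, not the exact implementation used in our experiments.

```python
import numpy as np
from scipy.special import expit
from lightgbm import LGBMClassifier

def make_focal_loss(alpha=0.25, gamma=2.0):
    """Binary focal loss (Eq (5)) as a custom LightGBM objective."""
    def fl(y, z):
        p = np.clip(expit(z), 1e-9, 1 - 1e-9)    # sigmoid of raw score, clipped for log
        return (-alpha * y * (1 - p) ** gamma * np.log(p)
                - (1 - alpha) * (1 - y) * p ** gamma * np.log(1 - p))

    def objective(y_true, raw_score):
        h = 1e-3                                  # central-difference step
        grad = (fl(y_true, raw_score + h) - fl(y_true, raw_score - h)) / (2 * h)
        hess = (fl(y_true, raw_score + h) - 2 * fl(y_true, raw_score)
                + fl(y_true, raw_score - h)) / h ** 2
        return grad, np.maximum(hess, 1e-12)      # keep the Hessian positive
    return objective

# Settings used for NSL-KDD in this paper: alpha = 0.1, gamma = 0.9.
clf = LGBMClassifier(objective=make_focal_loss(alpha=0.1, gamma=0.9),
                     n_estimators=200, random_state=42)
```

Note that with a custom objective the model outputs raw scores, so predictions must be passed through the sigmoid before thresholding.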
Autoencoders are neural networks composed of multiple layers of neurons; essentially, an autoencoder is a multilayer perceptron that uses a feed-forward algorithm [19,47]. The difference is that an autoencoder has the same number of neurons in the input and output layers, which facilitates the reconstruction of the data. In general, an autoencoder consists of an input layer, an encoder, a middle layer, a decoder, and an output layer [24]; the encoder, middle layer, and decoder are also called hidden layers. Its structure is shown in Figure 3. The size of the input and output layers is determined by the dimensionality of the dataset. The encoder compresses the dataset, and the decoder reconstructs it. The middle layer is a compressed representation of the dataset, and its size is smaller than the dimensionality of the dataset. In the encoder module, an input x is mapped to a compressed representation y; its mathematical expression is shown in Eq (6). In the decoder module, the data x' is reconstructed using different weights w' and biases b'; this process is the opposite of the encoder, and its mathematical expression is shown in Eq (7). Usually, the activation function f is nonlinear, since nonlinear networks can fit arbitrary functions. In addition, the autoencoder needs an objective function to measure the similarity of x and x'; when x and x' are close, the autoencoder is well trained. In this study, we use the mean square error (MSE), one of the most commonly used loss functions, to define the loss of the autoencoder. As such,
y=f(wx+b), | (6) |
x'=f(w'y+b'), | (7) |
where w is the weight coefficient of the encoder layer and b is the bias vector. w' and b' are the weight coefficients and bias vectors of the decoder layer, respectively. These parameters are updated by the backpropagation of the network. Thus,
$\mathrm{MSE}=\frac{1}{m}\sum_{i=1}^{m}(x'_i-x_i)^2,$ | (8) |
where m denotes the number of samples.
To avoid overfitting, adding regularization to the objective function is a common strategy. In this paper, we use L1 regularization to impose restrictions on the weight coefficient to give them better generalization. Autoencoders that use regularization are called sparse autoencoders [48]. In addition, they can be further classified into shallow sparse autoencoders and deep sparse autoencoders, based on the number of hidden layers. The difference between them is shown in Figure 4. In the figure, x∈Rn is the input data. y∈Rm is the output of the middle layer. hl∈Rk is the vector of the lth hidden layer, and x'∈Rn is the output vector in the sparse autoencoder. A shallow sparse autoencoder consists of three layers, i.e., an input layer, a single hidden layer (middle layer) and an output layer [41]. The deep sparse autoencoder consists of multiple hidden layers stacked on top of each other. It can learn more important implicit information from the original data than the shallow sparse autoencoder. In this study, we use a deep sparse autoencoder for our work. As such,
$L_1=\alpha\lVert\omega\rVert_1,$ | (9) |
where $\lVert\omega\rVert_1$ denotes the L1 norm, i.e., the sum of the absolute values of all weight parameters $\omega$, and $\alpha$ is the penalty factor.
According to the above theory, the original data x and the reconstructed data x' are very similar when an autoencoder is trained successfully. Their difference is called the reconstruction error. In intrusion detection, normal samples and attack samples in the dataset differ substantially. When an autoencoder trained only on normal samples is used to reconstruct attack samples, the reconstruction error will be larger than for reconstructed normal samples. Therefore, we use the reconstruction error to perform anomaly detection. Suppose the normal sample is x⁺ and the attack sample is x⁻. We train the autoencoder using only the normal samples x⁺. Let the reconstructed normal sample be x'⁺ and the reconstructed attack sample be x'⁻; then ‖x⁺ − x'⁺‖ < ‖x⁻ − x'⁻‖. Let the current sample be x* and its reconstruction x'*, and assume that ‖x⁺ − x'⁺‖ is less than a certain threshold c. When c < ‖x* − x'*‖, the sample is judged to be an attack sample.
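A minimal sketch of this decision rule, assuming an autoencoder already trained on normal samples only and a tuned threshold `c`:

```python
import numpy as np

def is_attack(autoencoder, X, c):
    """Flag samples whose reconstruction error exceeds threshold c."""
    X_rec = autoencoder.predict(X)               # reconstruct the inputs
    errors = np.mean((X - X_rec) ** 2, axis=1)   # per-sample MSE, as in Eq (8)
    return errors > c                            # True -> judged an attack sample
```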
Figure 5 shows the flow chart of the proposed method. It consists of four parts: data preprocessing, feature selection, model training and classification decision. The details are described below.
Data pre-processing. Since the model cannot handle non-numerical features, the training and test sets are first converted to numerical form. Sparse coding helps enrich the data features, so apart from the class labels, the non-numerical features are sparsely encoded by the one-hot method in this paper. For example, the non-numeric feature "Protocol" has three values [TCP, UDP, ICMP], which are encoded as [1, 0, 0], [0, 1, 0] and [0, 0, 1], respectively. For the numerical features, the ranges of the values differ widely, which is not conducive to model training. Therefore, to reduce the convergence time of the model, normalization is needed. In this paper, the min-max normalization method is used to scale the values into the range [0, 1]. The min-max normalization method is expressed as follows:
$x_{normalized}=\frac{x-x_{min}}{x_{max}-x_{min}},$ | (10) |
where $x_{max}$ and $x_{min}$ denote the maximum and minimum values of feature x, respectively.
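As an illustration, the preprocessing described above can be assembled with scikit-learn as follows; the column names follow the NSL-KDD header, and the exact pipeline wiring is our assumption rather than the paper's original code. Fitting on the training set and reusing the fitted transformer on the test set avoids information leakage.

```python
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

categorical = ["protocol_type", "service", "flag"]   # the three symbolic features
preprocess = ColumnTransformer(
    [("onehot", OneHotEncoder(handle_unknown="ignore"), categorical)],
    remainder=MinMaxScaler(),                        # Eq (10) on the numeric columns
)
# X_train_processed = preprocess.fit_transform(train_df)
# X_test_processed = preprocess.transform(test_df)   # reuse the fitted scaler
```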
Feature selection. For feature selection, the recursive feature elimination (RFE) method is adopted. RFE is a wrapper method that selects features based on the performance of a classification algorithm; essentially, it is a greedy algorithm that recursively deletes features based on their ranking scores. The method iterates over all the features, removing those that have little impact on model performance, until the desired number of features remains.
Model training. For model training, we use the process described in Algorithm 1 to build the model. First, the number of iterations is set. According to this number, several decision trees are trained, each tree being built on the errors of the previous ones. After several iterations, an ensemble model consisting of several weak decision trees is obtained. In particular, we use the focal loss function instead of the default cross-entropy loss function when defining the objective.
Classification decision. The classification decision has two phases. In the first phase, the LightGBM with the focal loss function is used for pre-classification. In the second phase, the samples predicted as normal in the first phase are classified a second time using the sparse autoencoder. If such a sample is judged to be anomalous, it is finally predicted as an attack; otherwise, it is finally predicted as normal. The two-stage classification decision improves both the accuracy of the intrusion detection system and its ability to detect unknown attacks.
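A minimal sketch of the two-stage decision, assuming a LightGBM classifier `clf` trained with the custom focal loss (which outputs raw scores) and a trained autoencoder with threshold `c`; `two_stage_predict` is a hypothetical helper:

```python
import numpy as np
from scipy.special import expit

def two_stage_predict(clf, autoencoder, X, c):
    """Stage 1: FL-LightGBM; stage 2: autoencoder re-check of 'normal' samples."""
    raw = clf.predict(X, raw_score=True)         # custom objective -> raw scores
    pred = (expit(raw) > 0.5).astype(int)        # 1 = attack, 0 = normal

    normal_idx = np.where(pred == 0)[0]          # re-examine predicted normals
    X_rec = autoencoder.predict(X[normal_idx])
    errors = np.mean((X[normal_idx] - X_rec) ** 2, axis=1)
    pred[normal_idx[errors > c]] = 1             # large reconstruction error -> attack
    return pred
```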
Description. The NSL-KDD dataset is an improved version of the KDDCup99 dataset [49]. The KDDCup99 dataset is derived from the MIT Lincoln Laboratory's intrusion detection evaluation project, with data collected from nine weeks of network connections and system audits. According to Tavallaee et al. [49], the training and testing sets of the KDDCup99 dataset contain 78% and 75% redundant records, respectively. To address this redundancy, Tavallaee extracted the NSL-KDD dataset, free of redundant records, from the KDDCup99 dataset. The improved NSL-KDD dataset has the following advantages: (1) There are no duplicate records in the training and test sets, so the classifier is not biased by duplicate records. (2) The numbers of records in the training and test sets are reasonable, and experiments do not require high-performance hardware. As shown in Table 1, the NSL-KDD dataset consists of 42 features. Three features (protocol, service and flag) are non-numeric, and the rest are numeric. The NSL-KDD dataset contains four attack categories, namely DoS, Probe, user-to-root (U2R) and remote-to-local (R2L); all of these attack categories are considered anomalies. The size of the NSL-KDD dataset is shown in Figure 6. The training set and test set contain 125,973 and 22,544 records, respectively. In the training set, the proportions of normal and attack samples are 53.46% and 46.54%, respectively; in the test set, they are 43.08% and 56.92%. It is important to note that the attack samples comprise a variety of attack types, and the test set contains 18 additional attack types, meaning it exhibits attack patterns absent from the training set [21]. It can therefore be used to simulate the detection of zero-day attacks.
Dataset | Feature name | Size |
NSL-KDD | duration, protocols_types, services, flag, src_bytes, dst_bytes, land, wrong_fragment, urgent, hot num_failed_logins, logged_in, num_compromised, root_shell, su_attempted, num_root, num_shells, num_access_files, num_outbound_cmds, is_hot_login, Is_guest_login, count, srv_count, serror_rate, srv_serror_rate, rerror_rate, srv_rerror_rate, same_srv_rate, diff_srv_rate, srv_diff_host_rate, dst_host_count, dst_host_srv_count, dst_host_same_srv_rate, dst_host_diff_srv_rate, dst_host_same_src_port_rate, dst_host_srv_diff_host_rate, dst_host_serror_rate, dst_host_srv_serror_rate, dst_host_rerror_rate, dst_host_srv_rerror_rate, label. | 42 |
UNSWNB15 | srcip, sport, dstip, dsport, protocol, state, dur, sbytes, dbytes, sttl, dttl, sloss, dloss, service, sload, dload, spkts, dpkts, swin, dwin, stcpb, dtcpb, smeansz, dmeansz, trans_depth, res_bdy_len, sjit, djit, stime, ltime, sintpkt, dintpkt, tcprtt, synack, ackdat, is_sm_ips_ports, ct_state_ttl, ct_flw_http_mthd, is_ftp_login, ct_ftp_cmd, ct_srv_src, ct_srv_dst, ct_dst_ltm, ct_src_ltm, ct_src_dport_ltm, ct_dst_sport_ltm, ct_dst_src_ltm, attack_type, label. | 49
The UNSWNB15 dataset was created by the Australian Centre for Cyber Security in 2015 using the IXIA tool [50]. The dataset contains a total of 2 million records saved in four CSV files [51]. To facilitate its use, the UNSWNB15 dataset was divided into a training set and a test set, named UNSWNB15Train and UNSWNB15Test, respectively. As shown in Table 1, the dataset includes a total of 49 features; three of them (protocol, service and state) are non-numeric, and the rest are numeric. Unlike the NSL-KDD dataset, the UNSWNB15 dataset includes nine attack types: Analysis, Backdoor, DoS, Exploits, Fuzzers, Generic, Reconnaissance, Shellcode and Worms. In this paper, we used the UNSWNB15Train and UNSWNB15Test datasets for our experiments. The information on this dataset is shown in Figure 6. Specifically, the training set and test set contain 175,341 and 82,332 samples, respectively. In the training set, the proportions of normal and attack samples are 31.94% and 68.06%, respectively; in the test set, they are 44.94% and 55.06%.
Preprocessing. Each NSL-KDD sample contains 41-dimensional features. Because the model cannot handle symbolic data, the symbolic features must be converted into numeric ones, so the data are encoded with the one-hot method. Specifically, the protocol feature has three values, so it is represented by three 0/1 indicators; the service feature has 70 values, so it is represented by 70 indicators; and the flag feature has 11 values, so it is represented by 11 indicators. After one-hot encoding, the dimensionality of the NSL-KDD dataset expands to 122. The UNSWNB15 dataset is processed in the same way, and its dimensionality becomes 196.
In this study, the recursive feature elimination method was used to reduce the data dimensionality. The recursive feature elimination method selects features based on feature importance [52]. First, all the original features are trained with the LightGBM classifier to obtain the weight coefficient of each feature. Then, the feature with the smallest weight coefficient is removed from the current feature set to obtain a new feature subset. Finally, the new feature subset is trained again with the LightGBM classifier to obtain updated weight coefficients. This process is repeated until the required number of features is obtained; Algorithm 2 describes it. According to the results in Section 4.3.1, we select 40 and 60 features for the NSL-KDD and UNSWNB15 datasets, respectively.
The sparse autoencoder is used for the second-stage prediction. First, the autoencoder is trained on the normal-class samples from the training set; then, the trained model is used to reconstruct the test set. In this study, the sparse autoencoder is composed of five hidden layers: the encoder layers contain 64 and 32 nodes, the middle layer contains 16 nodes, and, since the encoder and decoder are symmetric, the decoder layers contain 32 and 64 nodes, respectively. The sizes of the input and output layers are 122 for the NSL-KDD dataset and 196 for the UNSWNB15 dataset. The ReLU function is used as the activation function between neurons.
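A minimal Keras sketch of the described architecture for NSL-KDD follows; the L1 penalty weight (1e-5) and the optimizer settings are assumed values, not ones reported in this paper.

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

def build_sparse_autoencoder(input_dim=122, l1=1e-5):
    """122-64-32-16-32-64-122 sparse autoencoder (l1 weight is an assumption)."""
    inputs = tf.keras.Input(shape=(input_dim,))
    x = layers.Dense(64, activation="relu",
                     kernel_regularizer=regularizers.l1(l1))(inputs)  # Eq (9) penalty
    x = layers.Dense(32, activation="relu")(x)
    code = layers.Dense(16, activation="relu")(x)      # middle (compressed) layer
    x = layers.Dense(32, activation="relu")(code)
    x = layers.Dense(64, activation="relu")(x)
    outputs = layers.Dense(input_dim, activation="sigmoid")(x)  # inputs scaled to [0, 1]
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mse")        # Eq (8) objective
    return model

# ae = build_sparse_autoencoder()
# ae.fit(X_normal, X_normal, epochs=50, batch_size=256)   # train on normal samples only
```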
Algorithm 2: RFE |
Input: Original feature set S = [1, 2, 3, …, D] // D denotes the number of features in the sample; Expected number of features: N
Output: Feature ordering set R=[] |
Start: Initialize feature weights wi=1(i=1,…,d) // d denotes the dimensionality of the features in the original dataset |
1: While len(S) ≠ N, do:
2: Train the current feature set S with the LightGBM classifier |
3: Calculate the feature weight coefficients in set S |
4: Find the feature with the smallest weight coefficient: r=argminj(wj)(j=1,…,d) |
5: Update feature ordering set: R= [r,R] |
6: Remove less important features: S=S−[r] |
7: d = d-1 |
8: End while
9: Return feature ordering set R
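For illustration, Algorithm 2 can be approximated with scikit-learn's RFE wrapper around LightGBM; `step=1` mirrors removing one feature per iteration, and the iteration count and seed follow the experimental settings below. This sketch is our assumption, not the paper's original code.

```python
from sklearn.feature_selection import RFE
from lightgbm import LGBMClassifier

selector = RFE(estimator=LGBMClassifier(n_estimators=200, random_state=42),
               n_features_to_select=40,   # this paper's choice for NSL-KDD
               step=1)                    # drop one feature per iteration
# X_selected = selector.fit_transform(X_train, y_train)
# ranking = selector.ranking_            # corresponds to the feature ordering set R
```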
The experiments were conducted on a Dell host configured as follows: 32 GB RAM, an Intel Core i7-9700 CPU, and a Radeon RX 550X GPU. To speed up training, we used the GPU of a Linux server to train the autoencoder. We used TensorFlow 2.2.0 as the backend, and scikit-learn and Keras were used to process the dataset. The native LightGBM library [53] was used to build the model. In the experiments, we set the number of iterations to 200 and the random seed to 42. In addition, for the NSL-KDD dataset we set α = 0.1 and γ = 0.9; for the UNSWNB15 dataset we set α = 0.2 and γ = 5.
In this paper, we used accuracy, precision, recall and F1 score to evaluate the performance of the model. The accuracy represents the proportion of instances that are correctly predicted to account for all instances. The precision represents the proportion of correctly predicted attack instances to all predicted attack instances. The recall represents the proportion of attack instances that were correctly predicted by the classifier. The F1 score is a metric of balancing precision and recall. The formula of each metric is determined by Eqs (11)–(14).
$\mathrm{Accuracy}=\frac{TP+TN}{TP+TN+FP+FN},$ | (11) |
$\mathrm{Precision}=\frac{TP}{TP+FP},$ | (12) |
$\mathrm{Recall}=\frac{TP}{TP+FN},$ | (13) |
$F1\,score=\frac{2}{\frac{1}{\mathrm{Precision}}+\frac{1}{\mathrm{Recall}}},$ | (14) |
where TP represents the number of attack instances that are correctly predicted. TN represents the number of correctly predicted normal instances. FP represents the number of normal instances that are mispredicted. FN represents the number of attack instances that are mispredicted.
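These metrics map directly onto scikit-learn, as in the toy example below (the labels shown are placeholders; 1 marks the attack class):

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_test = np.array([1, 0, 1, 1, 0])   # toy true labels: 1 = attack, 0 = normal
y_pred = np.array([1, 0, 1, 0, 0])   # toy predictions
print(accuracy_score(y_test, y_pred),    # Eq (11)
      precision_score(y_test, y_pred),   # Eq (12)
      recall_score(y_test, y_pred),      # Eq (13)
      f1_score(y_test, y_pred))          # Eq (14)
```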
To find the appropriate number of features, different numbers of features were compared. Figures 7 and 8 show the accuracy on the two datasets for different numbers of features. As can be seen from Figure 7, the highest accuracy on the NSL-KDD dataset is obtained when the number of features is 40; for the UNSWNB15 dataset (Figure 8), the best number is 60. Therefore, we set the number of features to 40 and 60 for the NSL-KDD and UNSWNB15 datasets, respectively.
Figure 9 shows the performance of the proposed method on each metric. On the NSL-KDD dataset, the proposed method achieves 92.57%, 89.93%, 97.91% and 93.75% in accuracy, precision, recall and F1 score, respectively. Recall is the highest of these, which indicates that the proposed method can detect almost all attack classes. The confusion matrix on the NSL-KDD dataset is given in Table 2; only 267 attack instances are not recognized by the proposed method. For the UNSWNB15 dataset, the proposed method achieves 92.71%, 93.43%, 93.32% and 93.38% in accuracy, precision, recall and F1 score, respectively, so the metrics on this dataset are more balanced. Table 3 shows the confusion matrix on the UNSWNB15 dataset.
NSL-KDD | Predicted label | ||
Normal | Attack | ||
True label | Normal | 8305 | 1406 |
Attack | 267 | 12,566 |
UNSWNB15 | Predicted label | ||
Normal | Attack | ||
True label | Normal | 34,029 | 2971 |
Attack | 3026 | 42,306 |
Table 4 shows the time overhead (the sum of training time and prediction time) of the proposed method. Without feature selection, the time overhead on the two datasets is 8.22 and 18.6 seconds, respectively; with the recursive feature elimination method, it is 5.9 and 17.25 seconds, respectively. This indicates that the decision efficiency of the model is improved by the recursive feature elimination method.
Dataset | Original | RFE |
NSL-KDD | 8.22 s | 5.9 s |
UNSWNB15 | 18.6 s | 17.25 s |
In this section, we perform an ablation analysis of the proposed method. The LightGBM model without the focal loss function is taken as the base model, denoted LGBM; the model with the focal loss function is the improved model, denoted FL_LGBM; and the full proposed model is FL_LGBM-AE. Figure 10 shows the performance of these three models in terms of accuracy and F1 score, where the F1 score reflects the harmonic mean of precision and recall. For both datasets, the FL_LGBM model shows a clear improvement over the base model, which confirms that the focal loss function introduced into LGBM is effective. Furthermore, on the NSL-KDD dataset, the FL_LGBM-AE model improves on FL_LGBM by 11.5% and 13.08% in accuracy and F1 score, respectively. The FL_LGBM-AE model also outperforms the FL_LGBM model on the UNSWNB15 dataset.
In the proposed method, the learning rate and the threshold are two important hyperparameters. Because the focal loss function makes the model concentrate on difficult samples, the learning rate is particularly important for the proposed model.
Figures 11 and 12 show the effect of different learning rates on the two datasets. On the NSL-KDD dataset, the model performs worst when the learning rate equals 0.003. When the learning rate equals 0.0035, the model without the autoencoder performs best, reaching an accuracy of 87.85%; as the learning rate increases further, its accuracy gradually decreases. Conversely, the accuracy of the model with the autoencoder increases as the learning rate increases. The reason is that, as the learning rate grows, the update step becomes larger and the model fails to converge to the global minimum, misclassifying most attack classes as normal. With the autoencoder, those attack samples misclassified as normal are accurately identified. The highest accuracy is obtained when the learning rate equals 0.03. Overall, most models using the autoencoder exceed 90% accuracy, higher than the models without it.
On the UNSWNB15 dataset, the performance of the models is stable whether or not the autoencoder is used. When the learning rate reaches 0.004, the accuracy of the model with the autoencoder is slightly higher than that of the model without it; beyond this point, it is slightly lower. The likely reason is that the accuracy of the model without the autoencoder already exceeds 90%, so the autoencoder yields only a slight improvement. As the learning rate increases, model performance deteriorates, and more so with the autoencoder. However, this does not show that the proposed model is ineffective: the autoencoder can still play an important role as long as a suitable learning rate is found. As can be seen in Figure 12, the best performance is obtained when the learning rate is 0.004.
When training with autoencoders, the range of reconstruction errors differs between datasets. Figure 13 presents the mean squared error for both datasets, and Figures 14 and 15 show the effect of different thresholds on the model. For the NSL-KDD dataset, the proposed method produces the best results when the threshold reaches 0.00095; after that, as the threshold increases, the accuracy gradually decreases, because a higher threshold causes the autoencoder to miss attack samples with large reconstruction errors. For the UNSWNB15 dataset, the proposed method achieves the best results at a threshold of 0.0095; as the threshold increases further, performance plateaus, since most attack samples with large reconstruction errors have already been identified and increasing the threshold has little further influence.
Tables 5 and 6 show the performance of different methods on the four evaluation metrics. For the NSL-KDD dataset, the proposed method achieves the best results in recall, F1 score and accuracy. Notably, the random forest (RF), gradient boosting decision tree (GBDT) and Xgboost models all exceed 90% accuracy. For the UNSWNB15 dataset, the proposed method achieves the best precision, F1 score and accuracy; none of the other methods exceeds 81% in precision or reaches 90% in accuracy. Although the other methods have higher recall, they perform poorly on the remaining metrics. In particular, the proposed method exceeds 90% on the F1 score, indicating that it balances precision and recall, whereas the other methods show a pronounced imbalance between the two. Overall, the proposed method performs better on both datasets, which demonstrates its effectiveness.
Method | Precision (%) | Recall (%) | F1 score (%) | Accuracy (%)
DT | 89.80 | 97.12 | 93.32 | 92.08 |
SVM | 89.89 | 94.45 | 92.12 | 90.80 |
RF | 90.05 | 96.89 | 93.35 | 92.14 |
GBDT | 90.00 | 97.09 | 93.41 | 92.21 |
Xgboost | 90.00 | 96.94 | 93.34 | 92.13 |
Adaboost | 89.68 | 97.15 | 93.27 | 92.02 |
Proposed method | 89.93 | 97.91 | 93.75 | 92.57 |
Method | Precision (%) | Recall (%) | F1 score (%) | Accuracy (%)
DT | 80.42 | 95.40 | 87.27 | 84.68 |
SVM | 75.03 | 99.58 | 85.58 | 81.52 |
RF | 77.51 | 99.32 | 87.07 | 83.76 |
GBDT | 76.04 | 99.51 | 86.20 | 82.46 |
Xgboost | 76.47 | 98.86 | 86.23 | 82.26 |
Adaboost | 76.57 | 98.85 | 86.30 | 82.72 |
Proposed method | 93.43 | 93.32 | 93.38 | 92.71 |
Figures 16 and 17 show the time overhead of the different methods. The support vector machine (SVM) has the highest time overhead on both datasets: after mapping the data to a nonlinear space, the SVM must compute the maximum-margin decision boundary, which increases the computational overhead, and this overhead grows with the number of samples. The SVM is therefore not suitable for large datasets. The decision tree (DT) has the lowest time overhead owing to its simple algorithm. The proposed method adds some time overhead due to the focal loss function; its time overheads are 5.9 and 17.25 seconds on the two datasets, respectively. Although the proposed method is not optimal in time overhead, it still costs less than the SVM, RF, GBDT and Adaboost models, so it retains an advantage in this respect.
Table 7 compares our method with existing methods, using the results reported in the corresponding publications. For the NSL-KDD dataset, our method performs best in accuracy, recall and F1 score, reaching 92.57%, 97.91% and 93.75%, respectively, while the method in [20] achieves the best precision. For the UNSWNB15 dataset, our method performs best in accuracy and F1 score, reaching 92.71% and 93.38%, respectively. In recall, the methods in [15] and [32] obtain 99.28% and 98.06%, respectively, the best among all methods; in precision, the method in [31] obtains the best result at 93.88%. The methods in [15] and [31] integrate several different classifiers, and [32] uses a deep learning approach based on artificial neural networks, yet none of them reaches 90% accuracy. In contrast, our method exceeds 90% on all metrics, showing that it is more effective, owing to the proposed two-stage decision step.
Dataset | Method | Accuracy (%) | Recall (%) | Precision (%) | F1 score (%)
NSL-KDDTest | RandomTree+NBtree [13] | 89.24 | N/A | N/A | N/A |
CBR-CNN [19] | 89.41 | N/A | N/A | N/A | |
AE-LSTM [24] | 89.00 | 88.00 | N/A | N/A | |
AIDA [20] | 92.41 | 92.00 | 94.52 | 93.24 | |
STL [27] | 88.39 | 95.95 | 85.44 | 90.40 | |
AE-IDS [30] | 84.21 | 80.37 | 87.00 | 81.98 | |
MFFSEM [31] | 84.33 | 96.43 | 74.61 | 84.13 | |
Our Method | 92.57 | 97.91 | 89.93 | 93.75 | |
UNSWNB15Test | Voting-CMN [15] | 89.29 | 99.28 | 82.37 | 90.04 |
RepTree [26] | 88.95 | N/A | N/A | N/A | |
ANN [32] | 86.71 | 98.06 | 81.54 | 89.04 | |
MFFSEM [31] | 88.85 | 80.44 | 93.88 | 86.64 | |
LOF [33] | 91.86 | N/A | N/A | N/A | |
GAA [54] | 91.80 | 91.00 | N/A | N/A | |
GBM [55] | 91.31 | N/A | N/A | N/A | |
Our Method | 92.71 | 93.32 | 93.43 | 93.38 |
In this work, we proposed a two-stage intrusion detection framework based on LightGBM and an autoencoder. Within this framework, the recursive feature elimination method was used for feature selection to counter the curse of dimensionality, and the focal loss function was introduced into LightGBM to enhance the learning of difficult samples. To improve the detection of zero-day attacks, the decision-making process was divided into two stages, thereby improving the performance of the intrusion detection system. Experiments on the NSL-KDD and UNSWNB15 datasets yielded accuracies of 92.57% and 92.71%, and recalls of 97.91% and 93.32%, respectively. Comparisons with both classical and state-of-the-art methods confirmed the effectiveness of the proposed method. We conclude that the proposed method can improve both the efficiency and the performance of intrusion detection systems.
Although the proposed method achieves a high recall on the NSL-KDD dataset, the recall on UNSWNB15 still needs improvement. In addition, the precision of our method on both datasets is not yet state-of-the-art, so the model requires further optimization. In future work, we will focus on two aspects: first, since the autoencoder threshold is a key factor affecting the model, we will develop a method to set the threshold automatically; second, we will segment the attack types and adopt a suitable sampling method to further improve model performance.
This work was supported by the National Natural Science Foundation of China under Grant 61862007, and the Guangxi Natural Science Foundation under Grant 2020GXNSFBA297103.
We declare that there are no conflicts of interest.
[1] |
Adusah-Poku F (2016) Carbon dioxide emissions, urbanization and population: empirical evidence from Sub-Saharan Africa. Energ Econ Lett 3: 1–16. https://doi.org/10.18488/journal.82/2016.3.1/82.1.1.16 doi: 10.18488/journal.82/2016.3.1/82.1.1.16
![]() |
[2] |
Ali HS, Law SH, Zannah TI (2016) Dynamic impact of urbanization, economic growth, energy consumption, and trade openness on CO2 emissions in Nigeria. Environ Sci Pollut R 23: 12435–12443. https://doi.org/10.1007/s11356-016-6437-3 doi: 10.1007/s11356-016-6437-3
![]() |
[3] |
Al-Mulali U, Ozturk I (2015) The effect of energy consumption, urbanization, trade openness, industrial output, and the political stability on the environmental degradation in the MENA (Middle East and North African) region. Energy 84: 382–389. https://doi.org/10.1016/j.energy.2015.03.004 doi: 10.1016/j.energy.2015.03.004
![]() |
[4] |
Al-Mulali U, Ozturk I (2016). The investigation of environmental Kuznets curve hypothesis in the advanced economies: the role of energy prices. Renew Sust Energ Rev 54: 1622–1631. https://doi.org/10.1016/j.rser.2015.10.131 doi: 10.1016/j.rser.2015.10.131
![]() |
[5] |
Anser MK (2019) Impact of energy consumption and human activities on carbon emissions in Pakistan: application of STIRPAT model. Environ Sci Pollut Res 26: 13453–13463. https://doi.org/10.1007/s11356-019-04859-y doi: 10.1007/s11356-019-04859-y
![]() |
[6] |
Anser MK, Alharthi M, Aziz B, et al. (2020) Impact of urbanization, economic growth, and population size on residential carbon emissions in the SAARC countries. Clean Technol Envir, 1–14. https://doi.org/10.1007/s10098-020-01833-y
[7] | Apergis N, Ozturk I (2015) Testing environmental Kuznets curve hypothesis in Asian countries. Ecol Indic 52: 16–22. https://doi.org/10.1016/j.ecolind.2014.11.026 |
[8] | Apergis N, Payne JE (2010) Renewable energy consumption and economic growth: evidence from a panel of OECD countries. Energ Policy 38: 656–660. https://doi.org/10.1016/j.enpol.2009.09.002 |
[9] | Apergis N, Payne JE, Menyah K, et al. (2010) On the causal dynamics between emissions, nuclear energy, renewable energy, and economic growth. Ecol Econ 69: 2255–2260. https://doi.org/10.1016/j.ecolecon.2010.06.014 |
[10] | Balsalobre-Lorente D, Shahbaz M, Roubaud D, et al. (2018) How economic growth, renewable electricity and natural resources contribute to CO2 emissions? Energ Policy 113: 356–367. https://doi.org/10.1016/j.enpol.2017.10.050 |
[11] | Behera SR, Dash D (2017) The effect of urbanization, energy consumption, and foreign direct investment on the carbon dioxide emission in the SSEA (South and Southeast Asian) region. Renew Sust Energ Rev 70: 96–106. https://doi.org/10.1016/j.rser.2016.11.201 |
[12] | Bekhet HA, Othman NS (2017) Impact of urbanization growth on Malaysia CO2 emissions: evidence from the dynamic relationship. J Clean Prod 154: 374–388. https://doi.org/10.1016/j.jclepro.2017.03.174 |
[13] | Bölük G, Mert M (2014) Fossil & renewable energy consumption, GHGs (greenhouse gases) and economic growth: Evidence from a panel of EU (European Union) countries. Energy 74: 439–446. https://doi.org/10.1016/j.energy.2014.07.008 |
[14] | Christmann P, Taylor G (2001) Globalization and the environment: Determinants of firm self-regulation in China. J Int Bus Stud 32: 439–458. https://doi.org/10.1057/palgrave.jibs.8490976 |
[15] | Çoban S, Topcu M (2013) The nexus between financial development and energy consumption in the EU: A dynamic panel data analysis. Energ Econ 39: 81–88. https://doi.org/10.1016/j.eneco.2013.04.001 |
[16] | Dauda L, Long X, Mensah CN, et al. (2019) The effects of economic growth and innovation on CO2 emissions in different regions. Environ Sci Pollut Res 26: 15028–15038. https://doi.org/10.1007/s11356-019-04891-y |
[17] | Diao XD, Zeng SX, Tam CM, et al. (2009) EKC analysis for studying economic growth and environmental quality: a case study in China. J Clean Prod 17: 541–548. https://doi.org/10.1016/j.jclepro.2008.09.007 |
[18] | Dietz T, Rosa EA, York R (2007) Driving the human ecological footprint. Front Ecol Environ, 13–18. https://doi.org/10.1890/1540-9295 |
[19] | Dogan E, Turkekul B (2016) CO2 emissions, real output, energy consumption, trade, urbanization and financial development: testing the EKC hypothesis for the USA. Environ Sci Pollut Res 23: 1203–1213. https://doi.org/10.1007/s11356-015-5323-8 |
[20] | Dogan E, Seker F (2016) Determinants of CO2 emissions in the European Union: The role of renewable and non-renewable energy. Renew Energ 94: 429–439. https://doi.org/10.1016/j.renene.2016.03.078 |
[21] | Dreher A (2006) Does globalization affect growth? Evidence from a new index of globalization. Appl Econ 38: 1091–1110. https://doi.org/10.1080/00036840500392078 |
[22] | Dreher A, Gaston N, Martens P (2008) The Measurement of Globalization. In: Measuring Globalization, Springer New York, 25–74. https://doi.org/10.1007/978-0-387-74069-0_3 |
[23] | Duh JD, Shandas V, Chang H, et al. (2008) Rates of urbanization and the resiliency of air and water quality. Sci Total Environ 400: 238–256. https://doi.org/10.1016/j.scitotenv.2008.05.002 |
[24] | Engle RF, Yoo BS (1987) Forecasting and testing in co-integrated systems. J Econometrics 35: 143–159. https://doi.org/10.1016/0304-4076(87)90085-6 |
[25] | Farhani S, Shahbaz M (2014) What role of renewable and non-renewable electricity consumption and output is needed to initially mitigate CO2 emissions in MENA region? Renew Sust Energ Rev 40: 80–90. https://doi.org/10.1016/j.rser.2014.07.170 |
[26] | Grossman G, Krueger A (1991) Environmental Impacts of a North American Free Trade Agreement. National Bureau of Economic Research Working Paper 3914. |
[27] | Gygli S, Haelg F, Potrafke N, et al. (2019) The KOF Globalization Index - Revisited. CESifo Working Paper No. 7430. Available from: https://ssrn.com/abstract=3338784 or http://dx.doi.org/10.2139/ssrn.3338784. |
[28] | Hanif I (2017) Economics-energy-environment nexus in Latin America and the Caribbean. Energy 141: 170–178. https://doi.org/10.1016/j.energy.2017.09.054 |
[29] | Hanif I, Raza SMF, Gago-de-Santos P, et al. (2019) Fossil fuels, foreign direct investment, and economic growth have triggered CO2 emissions in emerging Asian economies: some empirical evidence. Energy 171: 493–501. https://doi.org/10.1016/j.energy.2017.09.054 |
[30] | Im KS, Pesaran MH, Shin Y (2003) Testing for unit roots in heterogeneous panels. J Econometrics 115: 53–74. https://doi.org/10.1016/S0304-4076(03)00092-7 |
[31] | International Energy Agency (2019) Available from: https://www.iea.org/geco/emissions/ (accessed 28 March 2019). |
[32] | Jebli MB, Youssef SB, Ozturk I (2016) Testing environmental Kuznets curve hypothesis: The role of renewable and non-renewable energy consumption and trade in OECD countries. Ecol Indic 60: 824–831. https://doi.org/10.1016/j.ecolind.2015.08.031 |
[33] | Kahn ME, Schwartz J (2008) Urban air pollution progress despite sprawl: the "greening" of the vehicle fleet. J Urban Econ 63: 775–787. https://doi.org/10.1016/j.jue.2007.06.004 |
[34] | Kao C (1999) Spurious regression and residual-based tests for co-integration in panel data. J Econometrics 90: 1–44. https://doi.org/10.1016/S0304-4076(98)00023-2 |
[35] | Khan MK, Teng JZ, Khan MI, et al. (2019) Impact of globalization, economic factors and energy consumption on CO2 emissions in Pakistan. Sci Total Environ 688: 424–436. https://doi.org/10.1016/j.scitotenv.2019.06.065 |
[36] | Khalid K, Usman M, Mehdi MA (2021) The determinants of environmental quality in the SAARC region: a spatial heterogeneous panel data approach. Environ Sci Pollut Res 28: 6422–6436. https://doi.org/10.1007/s11356-020-10896-9 |
[37] | Kuznets S (1955) Economic growth and income inequality. Am Econ Rev 45: 1–28. https://www.jstor.org/stable/1811581 |
[38] | Lee KH, Min B (2014) Globalization and carbon constrained global economy: a fad or a trend? J Asia-Pac Bus 15: 105–121. https://doi.org/10.1080/10599231.2014.904181 |
[39] | Levin A, Lin CF, Chu C (2002) Unit root tests in panel data: asymptotic and finite-sample properties. J Econometrics 108: 1–24. https://doi.org/10.1016/S0304-4076(01)00098-7 |
[40] | Li B, Yao R (2009) Urbanization and its impact on building energy consumption and efficiency in China. Renew Energ 34: 1994–1998. https://doi.org/10.1016/j.renene.2009.02.015 |
[41] | López-Menéndez AJ, Pérez R, Moreno B (2014) Environmental costs and renewable energy: Re-visiting the Environmental Kuznets Curve. J Environ Manage 145: 368–373. https://doi.org/10.1016/j.jenvman.2014.07.017 |
[42] | Magazzino C (2016) The relationship between CO2 emissions, energy consumption and economic growth in Italy. Int J Sust Energ 35: 844–857. https://doi.org/10.1080/14786451.2014.953160 |
[43] | Mahalik MK, Mallick H (2014) Energy consumption, economic growth and financial development: exploring the empirical linkages for India. J Dev Areas, 139–159. https://www.jstor.org/stable/24241254 |
[44] | Martínez-Zarzoso I, Maruotti A (2011) The impact of urbanization on CO2 emissions: Evidence from developing countries. Ecol Econ 70: 1344–1353. https://doi.org/10.1016/j.ecolecon.2011.02.009 |
[45] | McCoskey S, Kao C (1999) A Monte Carlo comparison of tests for co-integration in panel data. Available from: https://ssrn.com/abstract=1807953 or http://dx.doi.org/10.2139/ssrn.1807953. |
[46] | Meadows DL, Meadows DH, Randers J, et al. (1972) The Limits to Growth: A Report for the Club of Rome's Project on the Predicament of Mankind. Universe Books, New York. Available from: http://www.clubofrome.org/docs/limits.rtf. |
[47] | Narayan PK, Smyth R, Prasad A (2007) Electricity consumption in G7 countries: A panel cointegration analysis of residential demand elasticities. Energ Policy 35: 4485–4494. https://doi.org/10.1016/j.enpol.2007.03.018 |
[48] | Nasreen S, Anwar S, Ozturk I (2017) Financial stability, energy consumption and environmental quality: Evidence from South Asian economies. Renew Sust Energ Rev 67: 1105–1122. https://doi.org/10.1016/j.rser.2016.09.021 |
[49] | Nejat P, Jomehzadeh F, Taheri MM, et al. (2015) A global review of energy consumption, CO2 emissions and policy in the residential sector (with an overview of the top ten CO2 emitting countries). Renew Sust Energ Rev 43: 843–862. https://doi.org/10.1016/j.rser.2014.11.066 |
[50] | Pedroni P (1999) Critical values for co-integration tests in heterogeneous panels with multiple regressors. Oxford Bull Econ Stat 61: 653–670. https://doi.org/10.1111/1468-0084.0610s1653 |
[51] | Pedroni P (2004) Panel co-integration: asymptotic and finite sample properties of pooled time series tests with an application to the PPP hypothesis. Econometric Theory 20: 597–625. https://doi.org/10.1017/S0266466604203073 |
[52] | Pesaran MH (2004) General diagnostic tests for cross section dependence in panels. University of Cambridge, England. |
[53] | Pesaran MH, Shin Y, Smith RP (1999) Pooled mean group estimation of dynamic heterogeneous panels. J Am Stat Assoc 94: 621–634. https://doi.org/10.1080/01621459.1999.10474156 |
[54] | Raza SA, Shah N, Khan KA (2020) Residential energy environmental Kuznets curve in emerging economies: the role of economic growth, renewable energy consumption, and financial development. Environ Sci Pollut Res 27: 5620–5629. https://doi.org/10.1007/s11356-019-06356-8 |
[55] | Sadorsky P (2014) The effect of urbanization on CO2 emissions in emerging economies. Energ Econ 41: 147–153. https://doi.org/10.1016/j.eneco.2013.11.007 |
[56] | Sadorsky P (2009) Renewable energy consumption, CO2 emissions and oil prices in the G7 countries. Energ Econ 31: 456–462. https://doi.org/10.1016/j.eneco.2008.12.010 |
[57] | Saint Akadiri S, Alola AA, Akadiri AC (2019) The role of globalization, real income, tourism in environmental sustainability target: Evidence from Turkey. Sci Total Environ 687: 423–432. https://doi.org/10.1016/j.scitotenv.2019.06.139 |
[58] | Sahoo M, Sethi N (2021) The dynamic impact of urbanization, structural transformation, and technological innovation on ecological footprint and PM2.5: evidence from newly industrialized countries. Environ Dev Sust, 1–34. https://doi.org/10.1007/s10668-021-01614-7 |
[59] | Shafiei S, Salim RA (2014) Non-renewable and renewable energy consumption and CO2 emissions in OECD countries: a comparative analysis. Energ Policy 66: 547–556. https://doi.org/10.1016/j.enpol.2013.10.064 |
[60] | Shahbaz M, Loganathan N, Muzaffar AT, et al. (2016) How urbanization affects CO2 emissions in Malaysia? The application of STIRPAT model. Renew Sust Energ Rev 57: 83–93. https://doi.org/10.1016/j.rser.2015.12.096 |
[61] | Shahbaz M, Balsalobre D, Shahzad SJH (2019) The influencing factors of CO2 emissions and the role of biomass energy consumption: statistical experience from G-7 countries. Environ Model Assess 24: 143–161. https://doi.org/10.1007/s10666-018-9620-8 |
[62] | Shahbaz M, Mahalik MK, Shahzad SJH, et al. (2019) Testing the globalization-driven carbon emissions hypothesis: international evidence. Int Econ 158: 25–38. https://doi.org/10.1016/j.inteco.2019.02.002 |
[63] | Sharif A, Raza SA, Ozturk I, et al. (2019) The dynamic relationship of renewable and nonrenewable energy consumption with carbon emission: A global study with the application of heterogeneous panel estimations. Renew Energ 133: 685–691. https://doi.org/10.1016/j.renene.2018.10.052 |
[64] | Sharma SS (2011) Determinants of carbon dioxide emissions: Empirical evidence from 69 countries. Appl Energ 88: 376–382. https://doi.org/10.1016/j.apenergy.2010.07.022 |
[65] | Soytas U, Sari R (2006) Energy consumption and income in G-7 countries. J Policy Model 28: 739–750. https://doi.org/10.1016/j.jpolmod.2006.02.003 |
[66] | Stock JH, Watson MW (1993) A simple estimator of co-integrating vectors in higher order integrated systems. Econometrica 61: 783–820. https://doi.org/10.2307/2951763 |
[67] | Ulucak R, Khan SUD (2020) Determinants of the ecological footprint: role of renewable energy, natural resources, and urbanization. Sust Cities Soc 54: 101996. https://doi.org/10.1016/j.scs.2019.101996 |
[68] | Wang Q, Zeng YE, Wu BW (2016) Exploring the relationship between urbanization, energy consumption, and CO2 emissions in different provinces of China. Renew Sust Energ Rev 54: 1563–1579. https://doi.org/10.1016/j.rser.2015.10.090 |
[69] | WDI (2019) World Development Indicators Database. Available from: https://data.worldbank.org. |
[70] | Xu B, Lin B (2015) How industrialization and urbanization process impacts on CO2 emissions in China: evidence from nonparametric additive regression models. Energ Econ 48: 188–202. https://doi.org/10.1016/j.eneco.2015.01.005 |
[71] | You W, Lv Z (2018) Spillover effects of economic globalization on CO2 emissions: a spatial panel approach. Energ Econ 73: 248–257. https://doi.org/10.1016/j.eneco.2018.05.016 |
[72] | Zaman K, Shahbaz M, Loganathan N, et al. (2016) Tourism development, energy consumption and Environmental Kuznets Curve: Trivariate analysis in the panel of developed and developing countries. Tour Manage 54: 275–283. https://doi.org/10.1016/j.tourman.2015.12.001 |
[73] | Zhang G, Zhang N, Liao W (2018) How do population and land urbanization affect CO2 emissions under gravity centre change? A spatial econometric analysis. J Clean Prod 202: 510–523. https://doi.org/10.1016/j.jclepro.2018.08.146 |