
In this work, we investigated the finite-time passivity problem of neutral-type complex-valued neural networks with time-varying delays. On the basis of the Lyapunov functional, Wirtinger-type inequality technique, and linear matrix inequalities (LMIs) approach, new sufficient conditions were derived to ensure the finite-time boundedness (FTB) and finite-time passivity (FTP) of the concerned network model. At last, two numerical examples with simulations were presented to demonstrate the validity of our criteria.
Citation: Haydar Akca, Chaouki Aouiti, Farid Touati, Changjin Xu. Finite-time passivity of neutral-type complex-valued neural networks with time-varying delays[J]. Mathematical Biosciences and Engineering, 2024, 21(5): 6097-6122. doi: 10.3934/mbe.2024268
[1] Lu Yuan, Yuming Ma, Yihui Liu. Protein secondary structure prediction based on Wasserstein generative adversarial networks and temporal convolutional networks with convolutional block attention modules. Mathematical Biosciences and Engineering, 2023, 20(2): 2203-2218. doi: 10.3934/mbe.2023102
[2] Xing Hu, Minghui Yao, Dawei Zhang. Road crack segmentation using an attention residual U-Net with generative adversarial learning. Mathematical Biosciences and Engineering, 2021, 18(6): 9669-9684. doi: 10.3934/mbe.2021473
[3] Boyang Wang, Wenyu Zhang. ACRnet: Adaptive Cross-transfer Residual neural network for chest X-ray images discrimination of the cardiothoracic diseases. Mathematical Biosciences and Engineering, 2022, 19(7): 6841-6859. doi: 10.3934/mbe.2022322
[4] Ying Xu, Jinyong Cheng. Secondary structure prediction of protein based on multi scale convolutional attention neural networks. Mathematical Biosciences and Engineering, 2021, 18(4): 3404-3422. doi: 10.3934/mbe.2021170
[5] Jiajia Jiao, Xiao Xiao, Zhiyu Li. dm-GAN: Distributed multi-latent code inversion enhanced GAN for fast and accurate breast X-ray image automatic generation. Mathematical Biosciences and Engineering, 2023, 20(11): 19485-19503. doi: 10.3934/mbe.2023863
[6] Shuaiyu Bu, Yuanyuan Li, Wenting Ren, Guoqiang Liu. ARU-DGAN: A dual generative adversarial network based on attention residual U-Net for magneto-acousto-electrical image denoising. Mathematical Biosciences and Engineering, 2023, 20(11): 19661-19685. doi: 10.3934/mbe.2023871
[7] Binjie Hou, Gang Chen. A new imbalanced data oversampling method based on Bootstrap method and Wasserstein Generative Adversarial Network. Mathematical Biosciences and Engineering, 2024, 21(3): 4309-4327. doi: 10.3934/mbe.2024190
[8] Jiyun Shen, Yiyi Xia, Yiming Lu, Weizhong Lu, Meiling Qian, Hongjie Wu, Qiming Fu, Jing Chen. Identification of membrane protein types via deep residual hypergraph neural network. Mathematical Biosciences and Engineering, 2023, 20(11): 20188-20212. doi: 10.3934/mbe.2023894
[9] Hui Yao, Yuhan Wu, Shuo Liu, Yanhao Liu, Hua Xie. A pavement crack synthesis method based on conditional generative adversarial networks. Mathematical Biosciences and Engineering, 2024, 21(1): 903-923. doi: 10.3934/mbe.2024038
[10] Hong'an Li, Man Liu, Jiangwen Fan, Qingfang Liu. Biomedical image segmentation algorithm based on dense atrous convolution. Mathematical Biosciences and Engineering, 2024, 21(3): 4351-4369. doi: 10.3934/mbe.2024192
Proteins play an extremely important role in our daily activities, with functions such as immunity and cell signaling. Their different functions are due to their different structures. Therefore, to fully understand the functions of proteins, it is necessary to predict their structure. The advent of AlphaFold2 [1] has changed the protein prediction landscape and achieved very reliable results for the prediction of protein tertiary structure [2]. Nevertheless, the prediction of secondary structure is still of great significance, because the secondary structure improves the alignment of the tertiary structure and thereby affects the spatial morphology of the protein. This paper therefore proposes a deep learning method to predict the secondary structure of proteins.
Protein secondary structure is the local spatial conformation of amino acid residues in protein polypeptide chains. It is commonly described with 3 states (helix (H), strand (E), coil (C)), which can be further divided into 8 states, namely α-helix (H), 3₁₀-helix (G), π-helix (I), β-bridge (B), β-sheet (E), bend (S), turn (T) and coil (C) [3,4,5]. This study is devoted to the 8-state prediction of proteins, which is more informative and more challenging.
In the 1990s, Burkhard Rost and Chris Sander first used neural networks to predict the secondary structure of proteins [6]. In addition to achieving excellent results, this method was pioneering in the field of protein structure prediction. Early protein secondary structure prediction used statistical methods and heuristic rules [7], and methods such as the support vector machine [8], the Bayesian classification algorithm, the Markov model [9], and the feedforward neural network [10,11] have also been applied to the prediction of protein secondary structure. With the advent of the post-genomic era, the amount of protein data has increased. Owing to the high cost and difficulty of experiments, traditional experimental determination methods have been unable to meet the growing demand for protein and structural data analyses. Therefore, methods for protein structure prediction have become a hot issue in bioinformatics. In the last few years, as deep learning has made tremendous progress in natural language processing, machine vision and speech recognition, bioinformatics has also begun to use deep learning methods extensively.
In recent years, many scholars and researchers have achieved excellent results in the field of 8-state research on protein secondary structure. Busia et al. proposed a protein sequence prediction technique, which combined the successful experience of using convolutional neural networks in the past and language modeling, and achieved good results [12]. Using the combined synergy of a convolutional neural network, residual network and bidirectional recurrent neural network prediction, Zhang et al. [13] designed a local block composed of convolutional filters and raw input to capture local Sequence Features. Krieger et al. determined estimated class membership probabilities of residues in proteins using the nearest neighbor search, which is then fed into another dynamic programming algorithm, showing good results on the CASP dataset [14]. Uddin et al. proposed to combine the self-attention mechanism with the Deep Inception-Inside-Inception (Deep3I) network to track residues between amino acids at different distances through interaction [15]. Kotowski et al. proposed a single-sequence-based method called ProteinUnet, which effectively shortens the inference time, and improves the training speed [16]. Sonsare and Gunavathi proposed a model consisting of a 1D-Convnet and an improved recurrent neural network with an improved sequential coin toss optimizer, achieving good prediction accuracy on CB513 and CullPDB [17].
This paper proposes an 8-state protein secondary structure prediction method named WG-ICRN, as it is based on WGAN and ResNet with Inception. WG-ICRN extracts the feature information of the protein using WGAN and then combines this information with the PSSM [18] to enhance the features; the combined feature matrix is named WG-data. The increased length and width of WG-data make its feature maps larger in area and richer in feature information. WG-data is then fed into the ICRN module as input data. ICRN is a modification of the residual network: Inception is introduced into the residual network to replace the convolution layer, and the width of the input feature map is increased through multi-scale convolution to further enrich the features.
The main contributions of this study are: (1) We use WGAN to extract protein information in sliding window processed PSSM, and combine PSSM to build a new feature set with rich protein features. (2) The ICRN model combines Inception and the residual network, increasing the width of the network through Inception, while the residual network guarantees the depth of the network, improving the performance of the network from two aspects. (3) ICRN reduces the number of training parameters by using multiple smaller filters to reduce the dimension of the data, so the training time is shorter than the residual network, saving system resources. (4) Experimental results show that WG-ICRN is superior to other popular models in prediction accuracy.
Generative adversarial network (GAN) [19] was proposed by Ian Goodfellow in 2014, and consists of two parts: a generator (G) and a discriminator (D). G generates realistic fake data by learning the distribution of the real data, while D judges and scores the authenticity of the data. GAN has been applied to image denoising and feature extraction [20,21,22], and has been shown to perform well. However, GAN is difficult to optimize, because tuning the parameters of G and D is tedious. In recent years, many optimization algorithms [23,24,25,26], such as the Aquila optimizer [27] and the Gazelle optimization algorithm [28], have provided a direction for solving this problem. The more critical issue for GAN is that, when D is close to optimal, the loss of G suffers from vanishing gradients. WGAN uses the Wasserstein distance, which alleviates this problem and has the advantage of reflecting the distance between two distributions even if they do not overlap [29].
The training process of WGAN is a constant game between G and D. When training D, the data generated by G in the previous round and the real data are spliced together as the input x, with labels y of 0 for the fake data and 1 for the real data. Feeding x into D produces a score (a number from 0 to 1), and gradient backpropagation is performed on the loss function composed of the score and y. The training process of D is shown in Figure 1(a). When training G, G and D are treated as a whole, named "D_on_G" (referred to as the DG system), whose output is still the score. Given a set of random vectors z, G generates a set of data, and D scores the generated data; this is the forward pass of the DG system. The training process is presented in Figure 1(b).
The GAN objective function is as formula (1), where, x and z represent the input real and random data, G(z) represents the data generated after G processes the random data z, and D(x) represents the probability that the data is the real data. In the most ideal case, G can generate data G(z) that is very similar to the real protein data, and it is difficult for D to judge the authenticity of these data, that is, D(G(z)) = 0.5.
$$
\min_G \max_D V(D,G) = \mathbb{E}_{x \sim P_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim P_z(z)}[\log(1 - D(G(z)))]
\tag{1}
$$
The objective function (1) optimized by the GAN can be divided into two parts. In Part 1, G is fixed and D is optimized, so (1) can be rewritten as formula (2), which is converted to the minimization form in formula (3). In Part 2, D is fixed and G is optimized, which is equivalent to minimizing formula (4). In WGAN, D is constrained so that its parameters do not exceed a fixed constant, and formula (5) is then maximized. Here, P_r and P_g denote the distributions of the real and generated data, respectively.
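$$
\max_D \; \mathbb{E}_{x \sim P_r}[\log D(x)] + \mathbb{E}_{x \sim P_g}[\log(1 - D(x))]
\tag{2}
$$
$$
\min_D \; -\mathbb{E}_{x \sim P_r}[\log D(x)] - \mathbb{E}_{x \sim P_g}[\log(1 - D(x))]
\tag{3}
$$
$$
\min_G \; \mathbb{E}_{x \sim P_g}[\log(1 - D(x))]
\tag{4}
$$
$$
L = \mathbb{E}_{x \sim P_r}[D(x)] - \mathbb{E}_{x \sim P_g}[\log(D(x))]
\tag{5}
$$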
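As a rough illustration of formulas (3) to (5), the following sketch evaluates the losses on small lists of discriminator scores. The function names are hypothetical and not taken from the authors' code. The third function uses the usual Wasserstein critic form, E[D(x_real)] - E[D(x_fake)], without a logarithm; reading formula (5) this way is an assumption.

```python
import math


def discriminator_loss(real_scores, fake_scores):
    """Formula (3): -E[log D(x)] over real data - E[log(1 - D(x))] over fake data."""
    real_term = sum(math.log(s) for s in real_scores) / len(real_scores)
    fake_term = sum(math.log(1.0 - s) for s in fake_scores) / len(fake_scores)
    return -real_term - fake_term


def generator_loss(fake_scores):
    """Formula (4): E[log(1 - D(G(z)))], the quantity G tries to minimize."""
    return sum(math.log(1.0 - s) for s in fake_scores) / len(fake_scores)


def wasserstein_critic_objective(real_scores, fake_scores):
    """Assumed standard WGAN critic objective: E[D(x_real)] - E[D(x_fake)].

    The critic maximizes this value; the scores are unbounded real numbers here,
    not probabilities.
    """
    real_mean = sum(real_scores) / len(real_scores)
    fake_mean = sum(fake_scores) / len(fake_scores)
    return real_mean - fake_mean


if __name__ == "__main__":
    # In the ideal case D(x) = D(G(z)) = 0.5, so formula (3) equals 2 * log(2).
    print(discriminator_loss([0.5, 0.5], [0.5, 0.5]))
    print(generator_loss([0.5, 0.5]))
    print(wasserstein_critic_objective([1.0, 3.0], [0.0, 2.0]))
```

When D cannot tell real from generated data (all scores 0.5), the discriminator loss settles at 2·log 2, which matches the ideal case D(G(z)) = 0.5 described above.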
In this experiment, we introduced CNN in WGAN to assist in feature extraction. Local receptive fields and weight sharing operations in CNN can realize displacement, scaling and distortion invariance. We use ReLU as the activation function of the CNN, which is calculated as Eq. (6).
$$
F_i^k = f\left(\sum_h P_{i-1}^h * W_i^k + b\right)
\tag{6}
$$
Here, f is ReLU, P_{i−1}^h represents the feature map obtained from the input protein data and the convolution kernel of the previous layer, W_i^k is the convolution kernel of layer i, k is the index of the convolution kernel, and b is the bias parameter. At the same time, we use a gradient penalty to improve the stability of the network during WGAN training. The network structure of WGAN used in the experiment is shown in Figure 2.
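A minimal pure-Python sketch of Eq. (6) for one output kernel k is given below. It assumes stride 1, no padding, and one kernel slice per input feature map h, and it computes cross-correlation as most deep learning frameworks do; none of these details are stated in the text, and the function names are illustrative.

```python
def relu(value):
    """ReLU activation f in Eq. (6)."""
    return max(0.0, value)


def conv2d_relu(feature_maps, kernels, bias):
    """Compute one output map: ReLU(sum_h P^h * W^h + b).

    feature_maps: list of 2D lists (one H x W map per input channel h).
    kernels: list of 2D lists (one kH x kW kernel slice per input channel h).
    bias: scalar bias b.
    """
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    k_h, k_w = len(kernels[0]), len(kernels[0][0])
    output = []
    for r in range(height - k_h + 1):
        row = []
        for c in range(width - k_w + 1):
            total = bias
            # Sum the responses over all input feature maps h.
            for fmap, kern in zip(feature_maps, kernels):
                for i in range(k_h):
                    for j in range(k_w):
                        total += fmap[r + i][c + j] * kern[i][j]
            row.append(relu(total))
        output.append(row)
    return output


if __name__ == "__main__":
    maps = [[[1.0, 2.0], [3.0, 4.0]]]
    kernel = [[[1.0, 0.0], [0.0, 1.0]]]
    print(conv2d_relu(maps, kernel, 0.5))    # positive response is kept
    print(conv2d_relu(maps, kernel, -10.0))  # negative response is clipped to 0
```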
The network depth is very important for the performance of the model, but, in fact, the deep network will face degradation problems, and the accuracy will also decrease. Studies have shown that this deep network has the problem of gradient explosion or disappearance, and Residual Networks (ResNet) [30] introduces the residual learning to alleviate this problem. Nowadays, ResNet is used in computer vision and medical analysis [31,32].
Specifically, for a block with input X, the features to be learned are denoted H(X). Instead of learning H(X) directly, the block learns the residual F(X) = H(X) - X, so that the original features become F(X) + X. Residual learning is easier than learning the original features directly: when the residual is 0, the block is only an identity mapping, which keeps the network performance from declining. In practice the residual is not 0, so the block learns new features on top of the input features and achieves better performance. Residual learning is implemented with shortcut connections, and is structured as shown in Figure 3.
The original residual block performs convolution and pooling before residual learning; this original connection method is shown in Figure 4(a). A more detailed experimental analysis of the residual structure in [33] obtained the optimal residual learning structure, in which batch normalization and ReLU are performed before the convolutional layers, as shown in Figure 4(b).
In 2014, Szegedy et al. proposed the Inception structure for the first time [34]. Inception applies convolution kernels of different sizes to the same feature map in parallel to obtain new feature maps, and then combines the feature maps from the different scales. It is worth noting that Inception does not change the size of the original features, but only enriches the characteristic information of the protein through different convolution kernels, making the characteristics diversified. The network structure of InceptionV2 is shown in Figure 5.
In this experiment, we use the improved Inception module instead of the first convolutional layer and max pooling layer in the ResNet model. The improved Inception module extracts features from WG-data at different scales through convolutional kernels of different sizes, which enriches the feature information and improves the accuracy of protein secondary structure prediction. The structure of the improved Inception module is shown in Figure 6.
In this paper, we use ICRN-N to represent the improved ResNet of different depths, where N refers to the number of network layers with weights, that is, convolutional and fully connected layers are counted and pooling layers are not. We use models with 10, 18 and 34 weighted layers in the experiments, and the structures of ICRN-10, ICRN-18, and ICRN-34 are shown in Table 1.
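The relation H(X) = F(X) + X can be sketched in a few lines. The example below is only an illustration on plain Python lists; `residual_block` and the toy residual functions are hypothetical names, and a real block would use convolutional layers for F.

```python
def residual_block(x, residual_fn):
    """Return H(x) = F(x) + x, where residual_fn computes the residual F(x)."""
    fx = residual_fn(x)
    if len(fx) != len(x):
        raise ValueError("F(x) must have the same shape as x for the shortcut")
    return [f + xi for f, xi in zip(fx, x)]


def zero_residual(x):
    """F(x) = 0: the block reduces to an identity mapping."""
    return [0.0 for _ in x]


def small_residual(x):
    """A toy non-zero residual that adds a small correction to the input."""
    return [0.5 * xi for xi in x]


if __name__ == "__main__":
    features = [1.0, -2.0, 4.0]
    print(residual_block(features, zero_residual))   # same as the input
    print(residual_block(features, small_residual))  # input plus learned residual
```

With a zero residual the block passes its input through unchanged, which is why adding such blocks should not make a deeper network worse than a shallower one.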
Layer name | ICRN-10 | ICRN-18 | ICRN-34 |
Inception Block | [1×1, 3×3, 3×3, 3×3], [1×1, 3×3, 3×3], [Maxpool, 1×1] | same as ICRN-10 | same as ICRN-10 |
Residuals-Block-1 | [3×3, 64; 3×3, 64] × 2 | [3×3, 64; 3×3, 64] × 2 | [3×3, 64; 3×3, 64] × 3 |
Residuals-Block-2 | [3×3, 128; 3×3, 128] × 2 | [3×3, 128; 3×3, 128] × 2 | [3×3, 128; 3×3, 128] × 4 |
Residuals-Block-3 | / | [3×3, 256; 3×3, 256] × 2 | [3×3, 256; 3×3, 256] × 6 |
Residuals-Block-4 | / | [3×3, 512; 3×3, 512] × 2 | [3×3, 512; 3×3, 512] × 3 |
Output layers | Average pool, fully connected, softmax | same as ICRN-10 | same as ICRN-10 |
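As a quick consistency check on Table 1, the depth N of each ICRN-N model can be recovered from the block counts. The sketch assumes that the Inception block and the fully connected layer each count as one weighted layer, as the first convolution and the fully connected layer do in the usual ResNet naming; this counting rule is an assumption and is not spelled out in the text.

```python
# Number of residual units per Residuals-Block, read from Table 1.
RESIDUAL_UNITS = {
    "ICRN-10": [2, 2],
    "ICRN-18": [2, 2, 2, 2],
    "ICRN-34": [3, 4, 6, 3],
}


def weighted_layers(units_per_block, convs_per_unit=2):
    """Count weighted layers: Inception block (1) + residual convs + fully connected (1)."""
    residual_convs = convs_per_unit * sum(units_per_block)
    return 1 + residual_convs + 1


if __name__ == "__main__":
    for name, units in RESIDUAL_UNITS.items():
        print(name, weighted_layers(units))
```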
The structure of WG-ICRN is shown in Figure 7, and it can be seen that our network model is mainly divided into the WGAN and ICRN modules. Firstly, a protein is processed into a PSSM of size 20 × L, where 20 is the feature dimension and L represents the protein length. Since the lengths of different proteins are different, sliding windows of length W are used to cut the PSSM. The processed data are used as the training data of WGAN, and key features are extracted through the confrontation of G and D. We use several convolutional layers to assist the G and D networks. The G network uses Leaky ReLU as the activation function and, because of the large number of iterations, uses Dropout to prevent overfitting. We concatenate the final data (Si-data) generated by D and the PSSM processed by the sliding window into a matrix of 40 × W, named WG-data.
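The windowing and concatenation steps can be sketched as follows. This is a minimal illustration, assuming one window centered on each residue, zero padding at the sequence ends, and an odd window length; these choices and the function names are assumptions, and the extracted feature block is replaced here by a placeholder of the same shape.

```python
def sliding_windows(pssm, window):
    """Cut a 20 x L PSSM into L windows of size 20 x W, one centered on each residue.

    pssm: list of 20 rows, each with L scores. Ends are zero-padded (assumption).
    """
    if window % 2 == 0:
        raise ValueError("window length must be odd so it can be centered")
    half = window // 2
    length = len(pssm[0])
    padded = [[0.0] * half + list(row) + [0.0] * half for row in pssm]
    return [[row[i:i + window] for row in padded] for i in range(length)]


def build_wg_data(pssm_window, extracted_window):
    """Stack a 20 x W PSSM window and a 20 x W extracted feature block into 40 x W."""
    return [list(row) for row in pssm_window] + [list(row) for row in extracted_window]


if __name__ == "__main__":
    toy_pssm = [[1.0, 2.0, 3.0, 4.0, 5.0] for _ in range(20)]  # 20 x L with L = 5
    windows = sliding_windows(toy_pssm, window=3)
    # Placeholder for the WGAN-extracted features: reuse the PSSM window itself.
    wg_data = build_wg_data(windows[2], windows[2])
    print(len(windows), len(wg_data), len(wg_data[0]))
```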
The ICRN module consists of two parts, namely Inception block and residual block. The improved Inception will replace the first convolution layer with a size of 7 × 7, and the max pooling layer with a size of 3 × 3 in this model. The improved Inception can achieve the same convolution effect by using three layers of 3 × 3 convolution layers with fewer training parameters, which will save training time. At the same time, the multi-scale convolution model in Inception can extract feature maps of different sizes, which, when combined together, will increase the richness of features to some extent.
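The parameter saving from replacing one 7 × 7 convolution with three stacked 3 × 3 convolutions can be checked with simple arithmetic. The sketch below assumes the same number of input and output channels (64 is chosen only for illustration) and ignores biases; it compares a plain stack of 3 × 3 layers with a single 7 × 7 layer and does not model the full multi-branch Inception block.

```python
def conv_params(kernel_size, in_channels, out_channels):
    """Number of weights in one square convolution layer (biases ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


def receptive_field(num_layers, kernel_size=3):
    """Receptive field of a stack of stride-1 convolutions with the same kernel size."""
    return 1 + num_layers * (kernel_size - 1)


if __name__ == "__main__":
    channels = 64  # illustrative channel count, not taken from the paper
    single_7x7 = conv_params(7, channels, channels)
    three_3x3 = 3 * conv_params(3, channels, channels)
    print(receptive_field(3))     # three 3x3 layers see a 7x7 region
    print(single_7x7, three_3x3)  # 49*C*C versus 27*C*C weights
```

For equal channel counts the stacked version needs 27/49 of the weights, roughly 55%, while covering the same 7 × 7 region.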
After two feature enhancements, the residual block in ICRN will conduct the final training on the data. We respectively use the residual network of different depths to test the data. At the end of the network, we adopt an average pooling layer to replace the flattening of the matrix features, which reduces the number of parameters.
Finally, in the output layer of the model, we use the fully connected layer and softmax layer to output the final prediction results and calculate the prediction accuracy through the evaluation criteria.
The main public datasets used in this study are the CullPDB [35] dataset and the CASP10, CASP11, CASP12, CASP13, CASP14 and CB513 datasets [36,37,38,39,40,41]. The CullPDB dataset contains 12,288 proteins. The sequence similarity within these data is less than 25%. In this study, CullPDB with repeated proteins removed was used as the training set, with a total of 11,650 proteins. The CASP10-14 and CB513 datasets contain 99, 81, 19, 22, 24 and 513 protein chains, respectively. The number of protein sequences in each dataset is listed in Table 2.
Datasets | Number of proteins |
CullPDB | 11650 |
CASP10 | 99 |
CASP11 | 81 |
CASP12 | 19 |
CASP13 | 22 |
CASP14 | 24 |
CB513 | 513 |
Position-Specific Scoring Matrix (PSSM) is rich in biological evolution information, which greatly improves the accuracy of protein secondary structure prediction, and it is a widely used input feature. The PSSM in this experiment was generated by multiple sequence alignment of proteins against the NR database using PSI-BLAST [42], with the threshold set to 0.001 and 3 iterations.
Q8 and SOV are evaluation criteria for evaluating protein prediction performance from different perspectives. Q8 is the ratio of the number of correctly predicted amino acids to all amino acids. It can be expressed by formula (7), and S is the total number of amino acids.
$$
Q_8 = \frac{S_H + S_E + S_G + S_B + S_I + S_S + S_T + S_C}{S} \times 100
\tag{7}
$$
Among them, S_H, S_E, S_G, S_B, S_I, S_S, S_T and S_C are the numbers of correctly predicted residues in α-helices, β-sheets, 3₁₀-helices, β-bridges, π-helices, bends, turns and random coils, respectively. The accuracy for a single state j is calculated as formula (8), where S_j is the number of correctly predicted residues in state j and N_j is the total number of residues in state j.
$$
Q_j = \frac{S_j}{N_j}, \quad j \in \{H, E, G, B, I, S, T, C\}
\tag{8}
$$
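Formulas (7) and (8) translate directly into code. The sketch below takes the observed and predicted structures as strings over the eight state letters; the function names are illustrative, and taking N_j as the number of observed residues in state j is an assumption. Following formula (8), the per-state accuracy is returned as a fraction rather than a percentage.

```python
STATES = "HEGBISTC"


def q8(observed, predicted):
    """Formula (7): percentage of residues whose state is predicted correctly."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(1 for o, p in zip(observed, predicted) if o == p)
    return 100.0 * correct / len(observed)


def per_state_accuracy(observed, predicted):
    """Formula (8): Q_j = S_j / N_j for every state j that occurs in `observed`."""
    result = {}
    for state in STATES:
        total = sum(1 for o in observed if o == state)
        if total == 0:
            continue  # state absent from the observed structure, Q_j undefined
        correct = sum(1 for o, p in zip(observed, predicted) if o == state and p == state)
        result[state] = correct / total
    return result


if __name__ == "__main__":
    obs = "HHHHEECC"
    pred = "HHHEEECC"
    print(q8(obs, pred))
    print(per_state_accuracy(obs, pred))
```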
SOV [43] is a measure based on the ratio of overlapping segments. Let S_ob denote an observed structural fragment, S_pr a predicted fragment, and S_o the set of overlapping pairs (S_ob, S_pr) that are in the same state. For any pair of fragments in S_o, minov(S_ob, S_pr) is the length of the actual overlap, and maxov(S_ob, S_pr) is the total extent over which at least one of the two fragments has a residue in that state. The SOV is calculated as formula (9).
$$
SOV = \frac{100}{N_{SOV}} \sum_{S_o} \left[ \frac{minov(S_{ob}, S_{pr}) + \sigma(S_{ob}, S_{pr})}{maxov(S_{ob}, S_{pr})} \times length(S_{ob}) \right]
\tag{9}
$$
Among them, σ(S_ob, S_pr) allows for variation of the observed fragment boundaries in the protein structure, and is defined by formula (10).
$$
\sigma(S_{ob}, S_{pr}) = \min \left\{
\begin{array}{l}
maxov(S_{ob}, S_{pr}) - minov(S_{ob}, S_{pr}) \\
minov(S_{ob}, S_{pr}) \\
\operatorname{int}[len(S_{ob})/2] \\
\operatorname{int}[len(S_{pr})/2]
\end{array}
\right\}
\tag{10}
$$
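The per-pair quantities in formulas (9) and (10) can be illustrated for a single pair of overlapping fragments. The sketch below represents each fragment as an inclusive (start, end) pair of residue indices, which is an assumed representation, and computes only the bracketed term of formula (9); a full SOV score would also need the enumeration of all fragment pairs and the normalization N_SOV.

```python
def overlap_lengths(observed, predicted):
    """Return (minov, maxov) for two fragments given as inclusive (start, end) pairs."""
    obs_start, obs_end = observed
    pred_start, pred_end = predicted
    minov = min(obs_end, pred_end) - max(obs_start, pred_start) + 1
    if minov <= 0:
        raise ValueError("fragments do not overlap")
    maxov = max(obs_end, pred_end) - min(obs_start, pred_start) + 1
    return minov, maxov


def sigma(observed, predicted):
    """Formula (10): allowance for variation at the fragment boundaries."""
    minov, maxov = overlap_lengths(observed, predicted)
    len_obs = observed[1] - observed[0] + 1
    len_pred = predicted[1] - predicted[0] + 1
    return min(maxov - minov, minov, len_obs // 2, len_pred // 2)


def pair_term(observed, predicted):
    """Bracketed term of formula (9) for one overlapping pair of fragments."""
    minov, maxov = overlap_lengths(observed, predicted)
    len_obs = observed[1] - observed[0] + 1
    return (minov + sigma(observed, predicted)) / maxov * len_obs


if __name__ == "__main__":
    # Observed helix at residues 0-9, predicted helix shifted by two residues.
    print(overlap_lengths((0, 9), (2, 11)))
    print(sigma((0, 9), (2, 11)))
    print(pair_term((0, 9), (2, 11)))
```

In the shifted example the allowance σ fully compensates the two-residue offset, so the pair contributes the full observed length of 10, which shows how SOV tolerates small boundary shifts that a per-residue score would penalize.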
The experiments in this paper were run on an Intel(R) Xeon(R) Gold 5118 processor with an RTX 2080 Ti graphics card under Linux. Firstly, we tested the influence of the number of CNN convolutional layers on the WGAN feature extraction ability. The size of the convolution kernel was set to 3 × 3 × 64, the number of convolutional layers was set to 1, 2, 3, 4 and 5, and the models were tested on CASP11-14. As can be seen from Table 3, when the number of convolutional layers is 3, the data generated by G are closest to the real data.
Layers | CASP11 | CASP12 | CASP13 | CASP14 |
1 | 68.26 | 68.85 | 67.21 | 68.36 |
2 | 71.27 | 70.63 | 67.76 | 69.41 |
3 | 72.55 | 71.81 | 69.88 | 70.29 |
4 | 71.71 | 71.23 | 68.73 | 68.22 |
5 | 70.33 | 69.46 | 67.24 | 67.61 |
Because the number of iterations of the generator and discriminator will also affect the feature extraction ability of the WGAN, this study tested the influence of different iterations on the experiment, in which 3 convolutional layers are set, and the parameters of the convolution kernel are set to 3 × 3 × 64, 3 × 3 × 128 and 3 × 3 × 256, and the experimental results under different iterations are shown in Figure 8.
As shown in Figure 8, the best effect is achieved when the number of iterations is 20,000, that is, the features extracted by G are the most realistic and effective. After more than 20,000 iterations, D's ability to judge the authenticity of the generated data decreases to the point where there is a large error between the simulated and real features.
To test the influence of the length of sliding window on the experimental results, we selected 13, 15, 17, 19 and 21 for Q8 prediction. The experimental results are shown in Table 4, which shows that when the sliding window is 19, the experimental results are the best.
Sliding window | CASP11 | CASP12 | CASP13 | CASP14 |
13 | 67.47 | 68.16 | 65.88 | 65.20 |
15 | 68.09 | 69.27 | 66.67 | 67.41 |
17 | 70.66 | 70.24 | 68.18 | 69.13 |
19 | 71.55 | 70.81 | 68.88 | 69.29 |
21 | 70.29 | 70.41 | 68.42 | 68.94 |
Using different depths of ResNet, we tested CASP11-14, and obtained the experimental results shown in Figure 9. It can be seen that WG-ICRN-18 has the highest accuracy, because the dimension of WG-data is not high, and when the number of layers is too deep, part of the data will be lost, which causes a decrease in accuracy. In addition, we calculate the SOV and Qj (j∈{H,E,G,B,I,S,T,C}) of each test set under the WG-ICRN method, and the results are shown in Table 5.
Dataset | CASP10 | CASP11 | CASP12 | CASP13 | CASP14 | CB513 |
SOV | 70.98 | 69.37 | 68.83 | 67.41 | 66.39 | 73.91 |
Q8 | 73.32 | 71.55 | 70.81 | 68.88 | 69.29 | 75.56 |
QG | 52.72 | 55.32 | 47.90 | 36.56 | 33.51 | 37.71 |
QH | 92.66 | 85.74 | 87.43 | 93.2 | 89.75 | 92.25 |
QI | 0 | 0 | 0 | 0 | 0 | 0 |
QT | 62.21 | 49.47 | 56.78 | 57.31 | 45.29 | 53.67 |
QB | 9.81 | 19.30 | 7.25 | 8.30 | 3.88 | 7.68 |
QE | 88.80 | 82.15 | 77.76 | 84.37 | 76.79 | 80.44 |
QS | 53.88 | 43.68 | 48.90 | 34.74 | 13.33 | 25.39 |
QC | 68.12 | 62.91 | 67.38 | 70.71 | 68.54 | 71.37 |
This paper divides CullPDB into five parts for five-fold cross-validation, with four parts as the training set and one as the test set in each fold, and the results of cross-validation are shown in Table 6.
Fold | 1 | 2 | 3 | 4 | 5 | Average |
Q8 | 72.72 | 73.18 | 73.43 | 72.61 | 71.36 | 72.66 |
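The five-fold protocol and the average in Table 6 can be sketched as follows. How the authors assigned proteins to folds is not stated; the interleaved assignment below is an assumption, and `k_fold_indices` is a hypothetical helper.

```python
def k_fold_indices(n_items, k=5):
    """Split indices 0..n_items-1 into k folds and return (train, test) index lists.

    Items are assigned to folds in an interleaved fashion (assumption).
    """
    folds = [list(range(i, n_items, k)) for i in range(k)]
    splits = []
    for test_id in range(k):
        test = folds[test_id]
        train = [idx for fold_id, fold in enumerate(folds) if fold_id != test_id for idx in fold]
        splits.append((train, test))
    return splits


if __name__ == "__main__":
    splits = k_fold_indices(11650, k=5)  # 11,650 training proteins from CullPDB
    print([len(test) for _, test in splits])

    # Q8 values of the five folds as reported in Table 6.
    fold_q8 = [72.72, 73.18, 73.43, 72.61, 71.36]
    print(round(sum(fold_q8) / len(fold_q8), 2))
```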
We conducted ablation experiments to demonstrate the importance of each component. We used five network models to test CASP11-14, and the experimental results are presented in Table 7. Here, WG-ICRN is the model proposed in this paper, WG-Res combines WGAN and ResNet, and WG-CNN combines WGAN and CNN; these three methods take WG-data as input. ResNet is a residual network model based on ResNet-18, which performed best, and CNN is a 3-layer convolutional neural network with the structure 3 × 3 × 64, 3 × 3 × 128 and 3 × 3 × 256, which is also the CNN structure used in WG-CNN. The input data for ResNet and CNN were PSSM. In addition, we calculated the average training time for each of the 11,650 proteins in the CullPDB dataset for the five methods. These results are also shown in Table 7.
Method | CASP11 | CASP12 | CASP13 | CASP14 | Training time (s) |
WG-ICRN | 71.55 | 70.81 | 68.88 | 69.29 | 21.9 |
WG-Res | 71.43 | 70.67 | 68.83 | 69.17 | 22.4 |
WG-CNN | 70.47 | 68.79 | 67.33 | 68.24 | 21.7 |
ResNet | 68.76 | 67.84 | 65.57 | 66.19 | 9.8 |
CNN | 66.62 | 65.29 | 63.69 | 64.71 | 9.6 |
By comparing the experimental results of the five methods in Table 7, it can be seen that, when the input data is the same PSSM, the prediction accuracy of ResNet is higher than that of CNN. The deeper network makes training more adequate; it also increases the training time, but only slightly (9.8 s versus 9.6 s). The features extracted by WGAN significantly improve the Q8 prediction accuracy, but greatly increase the training time because of the increased size of the training data. Our proposed ICRN model introduces Inception to reduce the time complexity and to fuse multi-scale features, which reduces the training time and improves the prediction accuracy compared with the ResNet backbone (WG-ICRN versus WG-Res).
Furthermore, we compared other models with our proposed method. Like WG-ICRN, these methods are improvements or hybrid models of CNN. Among them, DeepACLSTM [44] combines asymmetric convolution (ACNN) and a bidirectional long short-term memory neural network (BiLSTM); 1D-Inception [45] takes inspiration from InceptionV3 to extract features from 1D sequences using several parallel convolutions; DCRNN [46] uses an end-to-end model with multi-scale CNN and stacked bidirectional GRU; and CNN_BIGRU [47] uses CNN and bidirectional gated recurrent units for prediction. We re-ran the code of the above methods on the same computer, using the same training set as WG-Res, which has been filtered and contains a total of 11,650 proteins. The experimental results are shown in Table 8. By comparison, it can be seen that WG-ICRN performs well in predicting the 8-state secondary structure of proteins. This is due to the depth advantage of ResNet and, in addition, to the fact that the feature matrix contains richer information than a one-dimensional sequence, so using it as input data gives better experimental results.
Method | CASP10 | CASP11 | CASP12 | CASP13 | CASP14 | CB513 |
DeepACLSTM | 73.09 | 71.49 | 70.35 | 68.91 | 68.81 | 75.51 |
1D-Inception | 71.86 | 70.07 | 69.78 | 67.51 | 68.3 | 74.68 |
DCRNN | 72.11 | 70.50 | 69.41 | 68.05 | 68.87 | 74.85 |
CNN_BIGRU | 71.87 | 70.94 | 69.67 | 67.83 | 68.69 | 75.54 |
WG-ICRN | 73.32 | 71.55 | 70.81 | 68.88 | 69.29 | 75.56 |
The prediction of protein secondary structure is important work to comprehensively understand and explore the diverse functions and spatial structure of proteins. This paper combines WGAN and ICRN, for the first time, to propose a novel protein 8-state secondary structure prediction method, WG-ICRN. In WG-ICRN, WGAN extracts protein features in amino acid sequences, and then we combine PSSM with the extracted features into a new feature matrix, WG-data, that contains more protein feature information. We also use ICRN to further extract the residue interactions in WG-data and complete the prediction. We introduced the improved Inception module into ResNet and proposed the ICRN model, which can not only reduce parameter calculation and improve efficiency, but also increase network width to improve network performance. We evaluated the proposed model on six datasets, CASP10-14 and CB513. Experimental results show that the prediction performance of WG-ICRN is better than that of the four other popular methods on most of the test sets (Table 8). In addition, this paper also shows that WGAN has a powerful feature extraction ability, that the ICRN model can handle protein data more comprehensively, and that the combination of the two achieves good results. However, it is difficult for WGAN to achieve the balance between generator and discriminator, which also makes training more tedious and time-consuming. In addition, secondary structure prediction is also slightly affected by residues in the global range, but WG-ICRN mainly focuses on local features and ignores long-range features. In future work, we will continue to optimize the feature extraction technique and fully utilize different feature information of protein sequences to improve prediction performance.
The codes and datasets for this work are at https://github.com/ShunLi999/WG-ICRN.git
This work was supported by the National Natural Science Foundation of China (grant number 61375013) and the Natural Science Foundation of Shandong Province (grant number ZR2013FM020).
The authors declare there are no conflicts of interest.
[1] |
C. Aouiti, R. Sakthivel, F. Touati, Global dissipativity of fuzzy bidirectional associative memory neural networks with proportional delays, Iran. J. Fuzzy Syst., 18 (2021), 65–80. https://doi.org/10.22111/ijfs.2021.5914 doi: 10.22111/ijfs.2021.5914
![]() |
[2] |
F. Wu, T. Kang, Y. Shao, Q. Wang, Stability of Hopfield neural network with resistive and magnetic coupling, Chaos Solitons Fractals, 172 (2023), 113569. https://doi.org/10.1016/j.chaos.2023.113569 doi: 10.1016/j.chaos.2023.113569
![]() |
[3] |
B. B. He, H. C. Zhou, Asymptotic stability and synchronization of fractional order Hopfield neural networks with unbounded delay, Math. Methods Appl. Sci., 46 (2023), 3157–3175. https://doi.org/10.1002/mma.8000 doi: 10.1002/mma.8000
![]() |
[4] |
R. P. Agarwal, S. Hristova, Stability of delay Hopfield neural networks with generalized proportional Riemann-Liouville fractional derivative, AIMS Math., 8 (2023), 26801–26820. https://doi.org/10.3934/math.20231372 doi: 10.3934/math.20231372
![]() |
[5] |
P. Li, R. Gao, C. Xu, J. Shen, S. Ahmad, Y. Li, Exploring the impact of delay on Hopf bifurcation of a type of BAM neural network models concerning three nonidentical delays, Neural Process. Lett., 55 (2023), 1595–11635. https://doi.org/10.1007/s11063-023-11392-0 doi: 10.1007/s11063-023-11392-0
![]() |
[6] |
B. Zhou, Q. Song, Boundedness and complete stability of complex-valued neural networks with time delay, IEEE Trans. Neural Networks Learn. Syst., 24 (2013), 1227–1238. https://doi.org/10.1109/TNNLS.2013.2247626 doi: 10.1109/TNNLS.2013.2247626
![]() |
[7] |
Z. Zhang, X. Liu, J. Chen, R. Guo, S. Zhou, Further stability analysis for delayed complex-valued recurrent neural networks, Neurocomputing, 251 (2017), 81–89. https://doi.org/10.1016/j.neucom.2017.04.013 doi: 10.1016/j.neucom.2017.04.013
![]() |
[8] |
M. S. Ali, G. Narayanan, Z. Orman, V. Shekher, S. Arik, Finite time stability analysis of fractional-order complex-valued memristive neural networks with proportional delays, Neural Process. Lett., 51 (2020), 407–426. https://doi.org/10.1007/s11063-019-10097-7 doi: 10.1007/s11063-019-10097-7
![]() |
[9] |
Z. Zhang, J. Cao, Periodic solutions for complex-valued neural networks of neutral type by combining graph theory with coincidence degree theory, Adv. Differ. Equations, 2018 (2018), 1–23. https://doi.org/10.1186/s13662-018-1716-6
[10] |
Y. Shi, X. Chen, P. Zhu, Dissipativity for a class of quaternion-valued memristor-based neutral-type neural networks with time-varying delays, Math. Methods Appl. Sci., 46 (2023), 18166–18184. https://doi.org/10.1002/mma.9551
[11] |
N. Li, W. X. Zheng, Passivity analysis for quaternion-valued memristor-based neural networks with time-varying delay, IEEE Trans. Neural Networks Learn. Syst., 31 (2019), 639–650. https://doi.org/10.1109/TNNLS.2019.2908755
[12] |
C. Ge, J. H. Park, C. Hua, C. Shi, Robust passivity analysis for uncertain neural networks with discrete and distributed time-varying delays, Neurocomputing, 364 (2019), 330–337. https://doi.org/10.1016/j.neucom.2019.06.077
[13] |
S. Chandran, R. Ramachandran, J. Cao, R. P. Agarwal, G. Rajchakit, Passivity analysis for uncertain BAM neural networks with leakage, discrete and distributed delays using novel summation inequality, Int. J. Control Autom. Syst., 17 (2019), 2114–2124. https://doi.org/10.1007/s12555-018-0513-z
[14] |
M. V. Thuan, D. C. Huong, D. T. Hong, New results on robust finite-time passivity for fractional-order neural networks with uncertainties, Neural Process. Lett., 50 (2019), 1065–1078. https://doi.org/10.1007/s11063-018-9902-9
[15] |
J. Han, Finite-time passivity and synchronization for a class of fuzzy inertial complex-valued neural networks with time-varying delays, Axioms, 13 (2024), 39. https://doi.org/10.3390/axioms13010039
[16] |
C. Aouiti, F. Touati, Global dissipativity of Clifford-valued multidirectional associative memory neural networks with mixed delays, Comput. Appl. Math., 39 (2020), 1–21. https://doi.org/10.1007/s40314-020-01367-5
[17] |
S. Guo, B. Du, Global exponential stability of periodic solution for neutral-type complex-valued neural networks, Discrete Dyn. Nat. Soc., 2016 (2016). https://doi.org/10.1155/2016/1267954
[18] |
Y. P. Liu, L. H. Zhao, Y. W. Shi, S. Y. Ren, J. L. Wang, Finite-time passivity and synchronisation of complex networks with multiple output couplings, Int. J. Control, 96 (2023), 1470–1490. https://doi.org/10.1080/00207179.2022.2053208
[19] |
R. Wei, J. Cao, F. E. Alsaadi, Fixed-time passivity of coupled quaternion-valued neural networks with multiple delayed couplings, Soft Comput., 27 (2023), 8959–8970. https://doi.org/10.1007/s00500-022-07500-2
[20] |
X. Chen, Q. Song, Global stability of complex-valued neural networks with both leakage time delay and discrete time delay on time scales, Neurocomputing, 121 (2013), 254–264. https://doi.org/10.1016/j.neucom.2013.04.040
[21] |
Z. Wang, X. Liu, Exponential stability of impulsive complex-valued neural networks with time delay, Math. Comput. Simul., 156 (2019), 143–157. https://doi.org/10.1016/j.matcom.2018.07.006
[22] |
M. Chinnamuniyandi, S. Chandran, C. Xu, Fractional order uncertain BAM neural networks with mixed time delays: An existence and quasi-uniform stability analysis, J. Intell. Fuzzy Syst., 46 (2024), 4291–4313. https://doi.org/10.3233/JIFS-234744
[23] |
G. Velmurugan, R. Rakkiyappan, S. Lakshmanan, Passivity analysis of memristor-based complex-valued neural networks with time-varying delays, Neural Process. Lett., 42 (2015), 517–540. https://doi.org/10.1007/s11063-014-9371-8
[24] |
Z. Zhang, S. Yu, Global asymptotic stability for a class of complex-valued Cohen-Grossberg neural networks with time delays, Neurocomputing, 171 (2016), 1158–1166. https://doi.org/10.1016/j.neucom.2015.07.051
[25] |
H. Wang, S. Duan, T. Huang, L. Wang, C. Li, Exponential stability of complex-valued memristive recurrent neural networks, IEEE Trans. Neural Networks Learn. Syst., 28 (2016), 766–771. https://doi.org/10.1109/TNNLS.2015.2513001
[26] |
Z. Zhang, C. Lin, B. Chen, Global stability criterion for delayed complex-valued recurrent neural networks, IEEE Trans. Neural Networks Learn. Syst., 25 (2013), 1704–1708. https://doi.org/10.1109/TNNLS.2013.2288943
[27] |
J. Pan, X. Liu, W. Xie, Exponential stability of a class of complex-valued neural networks with time-varying delays, Neurocomputing, 164 (2015), 293–299. https://doi.org/10.1016/j.neucom.2015.02.024
[28] |
W. Qian, S. Cong, T. Li, S. Fei, Improved stability conditions for systems with interval time-varying delay, Int. J. Control Autom. Syst., 10 (2012), 1146–1152. https://doi.org/10.1007/s12555-012-0609-9
[29] |
J. Hu, J. Wang, Global stability of complex-valued recurrent neural networks with time-delays, IEEE Trans. Neural Networks Learn. Syst., 23 (2012), 853–865. https://doi.org/10.1109/TNNLS.2012.2195028
[30] |
T. Dong, X. Liao, A. Wang, Stability and Hopf bifurcation of a complex-valued neural network with two time delays, Nonlinear Dyn., 82 (2015), 173–184. https://doi.org/10.1007/s11071-015-2147-5
[31] |
X. Xu, J. Zhang, J. Shi, Dynamical behaviour analysis of delayed complex-valued neural networks with impulsive effect, Int. J. Syst. Sci., 48 (2017), 686–694. https://doi.org/10.1080/00207721.2016.1206988
[32] |
R. Samidurai, S. Rajavel, J. Cao, A. Alsaedi, F. Alsaadi, B. Ahmad, Delay-partitioning approach to stability analysis of state estimation for neutral-type neural networks with both time-varying delays and leakage term via sampled-data control, Int. J. Syst. Sci., 48 (2017), 1752–1765. https://doi.org/10.1080/00207721.2017.1282060
[33] |
M. S. Ali, S. Saravanan, Q. Zhu, Finite-time stability of neutral-type neural networks with random time-varying delays, Int. J. Syst. Sci., 48 (2017), 3279–3295. https://doi.org/10.1080/00207721.2017.1367434
[34] |
C. Hua, Y. Wang, S. Wu, Stability analysis of neural networks with time-varying delay using a new augmented Lyapunov–Krasovskii functional, Neurocomputing, 332 (2019), 1–9. https://doi.org/10.1016/j.neucom.2018.08.044
[35] |
I. Khonchaiyaphum, N. Samorn, T. Botmart, K. Mukdasai, Finite-time passivity analysis of neutral-type neural networks with mixed time-varying delays, Mathematics, 9 (2021), 3321. https://doi.org/10.3390/math9243321
[36] |
J. Xiao, Z. Zeng, Finite-time passivity of neural networks with time varying delay, J. Franklin Inst., 357 (2020), 2437–2456. https://doi.org/10.1016/j.jfranklin.2020.01.023
[37] |
S. Saravanan, M. S. Ali, A. Alsaedi, B. Ahmad, Finite-time passivity for neutral-type neural networks with time-varying delays – via auxiliary function-based integral inequalities, Nonlinear Anal. Modell. Control, 25 (2020), 206–224. https://doi.org/10.15388/namc.2020.25.16513
[38] |
A. Seuret, F. Gouaisbaut, On the use of the Wirtinger inequalities for time-delay systems, IFAC Proc. Vol., 45 (2012), 260–265. https://doi.org/10.3182/20120622-3-US-4021.00035
Layer name | ICRN-10 | ICRN-18 | ICRN-34 |
Inception Block | [1×1; 3×3; 3×3; 3×3] [1×1; 3×3; 3×3] [Maxpool; 1×1] | ||
Residuals-Block-1 | [3×3, 64; 3×3, 64]×2 | [3×3, 64; 3×3, 64]×2 | [3×3, 64; 3×3, 64]×3 |
Residuals-Block-2 | [3×3, 128; 3×3, 128]×2 | [3×3, 128; 3×3, 128]×2 | [3×3, 128; 3×3, 128]×4 |
Residuals-Block-3 | / | [3×3, 256; 3×3, 256]×2 | [3×3, 256; 3×3, 256]×6 |
Residuals-Block-4 | / | [3×3, 512; 3×3, 512]×2 | [3×3, 512; 3×3, 512]×3 |
Average pool, fully connected, softmax |
Datasets | Number of proteins |
CullPDB | 11650 |
CASP10 | 99 |
CASP11 | 81 |
CASP12 | 19 |
CASP13 | 22 |
CASP14 | 24 |
CB513 | 513 |
Layers | CASP11 | CASP12 | CASP13 | CASP14 |
1 | 68.26 | 68.85 | 67.21 | 68.36 |
2 | 71.27 | 70.63 | 67.76 | 69.41 |
3 | 72.55 | 71.81 | 69.88 | 70.29 |
4 | 71.71 | 71.23 | 68.73 | 68.22 |
5 | 70.33 | 69.46 | 67.24 | 67.61 |
Sliding window | CASP11 | CASP12 | CASP13 | CASP14 |
13 | 67.47 | 68.16 | 65.88 | 65.20 |
15 | 68.09 | 69.27 | 66.67 | 67.41 |
17 | 70.66 | 70.24 | 68.18 | 69.13 |
19 | 71.55 | 70.81 | 68.88 | 69.29 |
21 | 70.29 | 70.41 | 68.42 | 68.94 |
Dataset | CASP10 | CASP11 | CASP12 | CASP13 | CASP14 | CB513 |
SOV | 70.98 | 69.37 | 68.83 | 67.41 | 66.39 | 73.91 |
Q8 | 73.32 | 71.55 | 70.81 | 68.88 | 69.29 | 75.56 |
QG | 52.72 | 55.32 | 47.90 | 36.56 | 33.51 | 37.71 |
QH | 92.66 | 85.74 | 87.43 | 93.2 | 89.75 | 92.25 |
QI | 0 | 0 | 0 | 0 | 0 | 0 |
QT | 62.21 | 49.47 | 56.78 | 57.31 | 45.29 | 53.67 |
QB | 9.81 | 19.30 | 7.25 | 8.30 | 3.88 | 7.68 |
QE | 88.80 | 82.15 | 77.76 | 84.37 | 76.79 | 80.44 |
QS | 53.88 | 43.68 | 48.90 | 34.74 | 13.33 | 25.39 |
QC | 68.12 | 62.91 | 67.38 | 70.71 | 68.54 | 71.37 |
1 | 2 | 3 | 4 | 5 | Average | |
Q8 | 72.72 | 73.18 | 73.43 | 72.61 | 71.36 | 72.66 |
Method | CASP11 | CASP12 | CASP13 | CASP14 | Training time (s) |
WG-ICRN | 71.55 | 70.81 | 68.88 | 69.29 | 21.9 |
WG-Res | 71.43 | 70.67 | 68.83 | 69.17 | 22.4 |
WG-CNN | 70.47 | 68.79 | 67.33 | 68.24 | 21.7 |
ResNet | 68.76 | 67.84 | 65.57 | 66.19 | 9.8 |
CNN | 66.62 | 65.29 | 63.69 | 64.71 | 9.6 |
Method | CASP10 | CASP11 | CASP12 | CASP13 | CASP14 | CB513 |
DeepACLSTM | 73.09 | 71.49 | 70.35 | 68.91 | 68.81 | 75.51 |
1D-Inception | 71.86 | 70.07 | 69.78 | 67.51 | 68.3 | 74.68 |
DCRNN | 72.11 | 70.50 | 69.41 | 68.05 | 68.87 | 74.85 |
CNN_BIGRU | 71.87 | 70.94 | 69.67 | 67.83 | 68.69 | 75.54 |
WG-ICRN | 73.32 | 71.55 | 70.81 | 68.88 | 69.29 | 75.56 |