
Collapse is the most frequent and harmful geological hazard during the construction of shallow buried tunnels, seriously threatening the lives and property of construction personnel. To realize process control of collapse in tunnel construction, a three-stage risk evaluation method covering the whole construction process of shallow tunnels was put forward. Firstly, according to the engineering geology and hydrogeology information obtained in the prospecting stage, a fuzzy model for preliminary risk evaluation based on disaster-pregnant environment factors was proposed to provide a reference for optimizing construction and support schemes in the design stage. Secondly, the disaster-pregnant environment factors were corrected based on information such as advanced geological forecasts and geological sketches, and disaster-causing factors were introduced; an extension-theory model for secondary risk evaluation was established to guide reasonable excavation and primary support schemes. Finally, the disaster-pregnant and disaster-causing factors were corrected according to the excavation conditions, and an attribute model for final risk evaluation of collapse was constructed in combination with the mechanical response indexes of the surrounding rock. Meanwhile, risk acceptance criteria and a construction decision-making method for collapse in shallow buried tunnels were formulated to efficiently implement multi-level risk control of this hazard. The proposed method has been successfully applied to the Huangjiazhuang tunnel of the South Shandong High-Speed Railway. The comparison showed that the evaluation results are highly consistent with the practical situations, which verifies the application value of this study for guiding the safe construction of shallow buried tunnels.
Citation: Zhiqiang Li, Sheng Wang, Yupeng Cao, Ruosong Ding. Dynamic risk evaluation method of collapse in the whole construction of shallow buried tunnels and engineering application[J]. Mathematical Biosciences and Engineering, 2022, 19(4): 4300-4319. doi: 10.3934/mbe.2022199
Predicting and identifying drug-target interactions (DTI) is of great significance in medicine and biology [1,2,3]. Measuring the affinity between drugs and proteins in the wet lab is the most reliable approach, but these experiments are often expensive and time-consuming due to the complexity of molecules [4]. Virtual screening (VS) through computation can significantly reduce costs. Structure-based VS and ligand-based VS, the classical virtual screening methods, have achieved great success [5,6]. Structure-based methods use three-dimensional (3D) conformations of proteins and drugs to study bioactivity. Ligand-based methods rest on the assumption that similar molecules will interact with similar proteins [7]. However, the applicability of these methods is limited: ligand-based VS performs poorly when a molecule has few known binding proteins, and structure-based VS cannot be executed when the 3D structure of a protein is unknown. Since accurate reconstruction of protein structures is still under development, the construction of 3D-free DTI prediction methods has attracted increasing attention [8]. Machine learning approaches consider the chemical space, the genomic space and their interactions within a single framework and formulate DTI prediction as a classification problem following either a feature-based or a similarity-based approach. Similarity-based approaches rely on the assumption that drugs with similar structures should have similar effects, while feature-based approaches construct a feature vector combining descriptors of the drug and the protein as model input. Bleakley et al. [9] proposed a new supervised inference method to predict unknown drug-target interactions, using support vector machines as local classifiers. Since then, a variety of machine learning-based algorithms have been proposed that consider both compound and protein information in a unified model [10,11,12,13,14,15,16,17,18,19].
In recent years, deep learning has developed rapidly in drug discovery. In comparison to traditional machine learning, end-to-end models eliminate the need to define and compute descriptors before modeling, providing different strategies and representations for proteins and drugs. Initially, manually specified descriptors were used to represent drugs and proteins, and a fully connected neural network (FCN) was designed to make predictions [20]. Since descriptors are designed from a single perspective and cannot be changed during training, descriptor-based approaches cannot extract task-relevant features. Therefore, many end-to-end models have been proposed. Lee et al. [21] proposed a model called DeepConv-DTI to predict DTI; the model uses convolutional layers to extract local residue features of generalized proteins. Tsubaki et al. [22] used different models to represent drugs and proteins, treating drug structures as graphs: graph neural networks (GNN) were used to learn features from drug graphs, and a convolutional neural network (CNN) was used to train on protein sequences. To consider deeper features between molecules, Li et al. [23] proposed a multi-objective neural network (MONN), which introduced a gated recurrent unit (GRU) module and can accurately determine both the interaction and the affinity between molecules. Zamora-Resendiz et al. [24] defined a new spatial graph convolutional network (GCN) architecture that employs graph reduction to reduce the number of training parameters and facilitate the abstraction of intermediate representations. Ryu et al. [25] combined GCN with an attention mechanism to enable the GCN to identify atoms in different environments, which extracts better structural features related to target molecular properties such as solubility, polarity, synthetic accessibility and photovoltaic efficiency than the vanilla GCN. Ru et al.
[26] combined the ideas of adjacency and learning-to-rank, establishing correlations between proteins and drugs via adjacency and predicting drug-protein binding affinity by feeding the features into a learning-to-rank classifier. Transformer is a model that uses self-attention to improve the speed of model training, and it has achieved great success in natural language processing. Wang et al. [27] first extracted drug features with a GNN, then represented protein features with a Transformer and CNN, using a one-sided attention mechanism to capture long-range interaction information within proteins. Chen et al. [28] obtained interaction features with a Transformer decoder and proposed a more rigorous method for dataset partitioning. Ren et al. [29] presented a deep learning framework based on multimodal representation and meta-path semantic analysis, in which drugs and proteins are represented as multimodal data and the relationships between them are captured by meta-path semantic analysis. However, most of these methods only consider a single non-covalent interaction between drugs and proteins. In fact, there is much more than one kind of interaction between drugs and proteins.
Inspired by the Transformer decoder, which is able to extract long-range interdependent features [28], this paper proposes a dual-pathway model for DTI prediction based on mutual reaction features, called Mutual-DTI. The Transformer decoder was modified to treat drugs and proteins as two distinct sequences, and a module was added to extract mutual features, enabling the model to learn the complex interactions between atoms and amino acids. Figure 1 shows an overview of the entire network. The dual-pathway approach has also been applied in other fields: for example, dual attention matching (DAM) [30] learns global features from local features with self-attention but ignores the mutual-influence information between the local features of two modalities.
In this paper, we captured the spatial and other feature information of drugs with a GNN and represented protein features with a gated convolutional network. The drug and protein features were then input as two sequences into the Transformer decoder, which includes the mutual feature module. Unlike DAM, the mutual feature module simultaneously considers the local features of both drug molecules and proteins, which effectively extracts the interaction features between the two sequences. Finally, the drug-protein feature vector was input into a fully connected layer for prediction. We expected Mutual-DTI to exhibit better performance and generality with the addition of the mutual feature module. To validate this, we evaluated it on two benchmark datasets and conducted ablation experiments on a more strictly partitioned dataset [28]. The results showed that Mutual-DTI exhibits better performance. We further visualized the attention scores learned by Mutual-DTI, and the results showed that the mutual feature module helps to reduce the search space of binding sites.
A GNN uses aggregation operations to extract node features from a graph. We represent drugs using a graph structure in which nodes represent atoms such as carbon and hydrogen, and edges represent chemical bonds such as single and double bonds. We use the RDKit Python library* to convert a simplified molecular-input line-entry system (SMILES) string into a two-dimensional drug molecular graph.
*Website: http://www.rdkit.org/
We define a drug graph by $G=\{V,E\}$, where $V$ is the set of atomic nodes and $E$ is the set of chemical-bond edges. Considering the small number of atom and chemical-bond types, we perform a local breadth-first search from each node in the graph, where the search depth $r$ equals the number of hops from the given node [22,31]. For example, starting from node $v_i$, we traverse the subgraph within range $r$ and record the information of all neighboring nodes and edges of $v_i$ in the subgraph. We define the subgraph for node $v_i$ within depth $r$ as follows:
$$G_{sub}^{r_i}=\left(V_{sub}^{r_i},\,E_{sub}^{r_i}\right) \tag{2.1}$$
$$E_{sub}^{r_i}=\{e_{mn}\mid m,n\in N(i,r)\} \tag{2.2}$$
where $N(i,r)$ is the set of nodes adjacent to $v_i$ in the subgraph $G_{sub}^{r_i}$, including $v_i$ itself, and $e_{mn}$ is the edge connecting nodes $v_m$ and $v_n$.
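As a concrete illustration, the r-hop neighborhood N(i, r) and its induced edge set can be collected with a plain breadth-first search. This is a minimal sketch over a hypothetical adjacency-list representation of the molecular graph, not the authors' implementation:

```python
from collections import deque

def neighborhood(adj, i, r):
    """Collect N(i, r): all nodes within r hops of node i (including i),
    via breadth-first search over an adjacency list."""
    visited = {i}
    queue = deque([(i, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == r:
            continue  # do not expand beyond the search depth r
        for nbr in adj[node]:
            if nbr not in visited:
                visited.add(nbr)
                queue.append((nbr, depth + 1))
    return visited

def subgraph_edges(adj, i, r):
    """E_sub: edges whose endpoints both lie in N(i, r)."""
    nodes = neighborhood(adj, i, r)
    return {(m, n) for m in nodes for n in adj[m] if n in nodes and m < n}
```

For a path graph 0-1-2-3, `neighborhood(adj, 0, 2)` yields `{0, 1, 2}` and `subgraph_edges(adj, 0, 2)` yields the two edges among those nodes.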
According to the subgraph $G_{sub}^{r_i}$, we can extract the corresponding chemical features, such as atom type, atom degree and aromaticity. The details are shown in Table 1.
Table 1. Chemical features of atoms.

| Feature | Representation |
| --- | --- |
| atom type | C, N, O, S, F, P, Cl, Br, B, H (one-hot) |
| degree of atom | 0, 1, 2, 3, 4, 5 (one-hot) |
| number of hydrogens attached | 0, 1, 2, 3, 4 (one-hot) |
| implicit valence electrons | 0, 1, 2, 3, 4, 5 (one-hot) |
| aromaticity | 0 or 1 |
We use randomly initialized embeddings of the extracted chemical features as the initial input to the GNN. We denote the embedding of node $v_i$ at the $n$-th layer as $f_i^{(n)}\in\mathbb{R}^d$. In the GNN, we update $f_i^{(n)}$ according to the following equation:
$$f_i^{(n)}=\sigma\Big(f_i^{(n-1)}+\sum_{j\in N(i,r)}h_{ij}^{(n-1)}\Big) \tag{2.3}$$
where $\sigma$ is the sigmoid function $\sigma(x)=1/(1+e^{-x})$ and $h_{ij}^{(n-1)}$ is the hidden vector between nodes $v_i$ and $v_j$. This hidden vector can be computed by the following neural network:
$$h_{ij}^{(n)}=\varepsilon\big(\omega f_i^{(n)}+b\big) \tag{2.4}$$
where $\varepsilon$ is the nonlinear activation function ReLU, $\varepsilon(x)=\max(0,x)$, $\omega\in\mathbb{R}^{d\times d}$ is a learnable weight matrix and $b\in\mathbb{R}^d$ is a bias vector. After the GNN layers, we obtain the feature vectors $c_1,c_2,c_3,\cdots,c_l$ of a drug sequence, where $l$ is the number of atoms in the sequence.
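One update step of Eqs (2.3) and (2.4) can be sketched in numpy. Note an assumption: here the hidden vector for edge (i, j) is computed from the neighbor's embedding $f_j$, the common GNN convention; the flattened equations in the source are ambiguous on this point, so treat the indexing as illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(0.0, x)

def gnn_update(F, neighbors, W, b):
    """One GNN layer: h_j = relu(W @ f_j + b) for every node,
    then f_i <- sigmoid(f_i + sum_{j in N(i,r)} h_j).
    F: (num_nodes, d) node embeddings; neighbors: dict node -> list of neighbor ids."""
    H = relu(F @ W.T + b)  # hidden vectors, one per node, shape (num_nodes, d)
    F_new = np.empty_like(F)
    for i in range(F.shape[0]):
        agg = H[neighbors[i]].sum(axis=0) if neighbors[i] else 0.0
        F_new[i] = sigmoid(F[i] + agg)
    return F_new
```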
A protein sequence consists of 20 kinds of amino acids. If we treat a protein sequence as a sentence, only 20 kinds of words make up the sentence. To increase the diversity of features, we define the words in a protein sequence as n-gram amino acids, based on the n-gram language model [22]. For a given amino acid sequence, we split it into overlapping n-gram subsequences. For example, with n set to 3, the protein sequence MVVMNSL⋯TSQATP is split into MVV, VVM, VMN, MNS, ⋯, TSQ, SQA, QAT, ATP, so that the vocabulary of words composing the sentence expands to $20^3$.
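The n-gram splitting described above can be sketched in a few lines of Python (a generic helper, not the authors' code):

```python
def split_ngrams(sequence, n=3):
    """Split an amino-acid sequence into overlapping n-gram 'words',
    e.g. 'MVVMNSL' -> ['MVV', 'VVM', 'VMN', 'MNS', 'NSL'] for n=3."""
    return [sequence[i:i + n] for i in range(len(sequence) - n + 1)]
```

With an alphabet of 20 amino acids this yields a vocabulary of at most $20^n$ distinct words, which motivates the choice n = 3 below.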
To keep the vocabulary at a reasonable size, we set n to 3. For a given protein sequence $S=a_1a_2a_3\cdots a_L$, where $L$ is the length of the protein sequence and $a_i$ is the $i$-th amino acid, we split it into:
$$[a_1a_2a_3],[a_2a_3a_4],\cdots,[a_{L-2}a_{L-1}a_L]$$
We use $a_{i:i+2}\in\mathbb{R}^d$ to denote the $d$-dimensional embedding of the word $[a_ia_{i+1}a_{i+2}]$. We initialize the $d$-dimensional embeddings of the protein sequence processed in this way and input them into a gated convolutional network with Conv1D and gated linear units [32]. We compute the hidden layers according to Eq (2.5):
$$L_i(X)=(X\omega_1+s)\otimes\sigma(X\omega_2+t) \tag{2.5}$$
where $L_i$ is the $i$-th layer of the gated convolutional network, $X\in\mathbb{R}^{n\times d_1}$ is the input to that layer, $\omega_1\in\mathbb{R}^{d_1\times d_2}$, $s\in\mathbb{R}^{d_2}$, $\omega_2\in\mathbb{R}^{d_1\times d_2}$ and $t\in\mathbb{R}^{d_2}$ are learnable parameters, $n$ is the length of the sequence, $d_1,d_2$ are the dimensions of the input and hidden features respectively, $\sigma$ is the sigmoid function and $\otimes$ is the element-wise (gating) product. The output of the gated convolutional network is the final representation of the protein sequence.
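A minimal numpy sketch of the gated linear unit of Eq (2.5). Two simplifying assumptions: the 1D convolution is reduced to a dense projection for brevity, and $\otimes$ is taken as element-wise gating, as in standard GLUs:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def glu_layer(X, W1, s, W2, t):
    """Gated linear unit: L(X) = (X W1 + s) * sigmoid(X W2 + t).
    X: (n, d1); W1, W2: (d1, d2); s, t: (d2,).
    The sigmoid branch acts as a learned gate on the linear branch."""
    return (X @ W1 + s) * sigmoid(X @ W2 + t)
```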
We extracted feature vectors of drug and protein sequences using the drug and protein modules and input them into the Transformer decoder. The decoder learns mutual features and outputs drug and protein sequences carrying interaction features. Since the order of the feature vectors has no effect on DTI modeling, we remove the positional embedding from the Transformer. The key component of the decoder is the multi-head self-attention layer, which consists of several scaled dot-product attention layers that extract the interaction information between the encoder and the decoder. The self-attention layer accepts three inputs, namely the key $K$, the value $V$ and the query $Q$, and computes the attention in the following manner:
$$\mathrm{Self\_attention}(Q,K,V)=\mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V \tag{2.6}$$
where $d_k$, the dimension of the keys, acts as a scaling factor. Considering the complex reaction processes involving non-covalent chemical bonds within drugs and proteins, we added a module that extracts interaction features using the multi-head self-attention layers of the Transformer decoder. The decoder takes the drug and protein sequences as inputs, enabling the extraction of drug-dominated and protein-dominated interaction features simultaneously. The module further extracts complex interaction feature vectors between atoms and amino acids within the sequences, as depicted in Figure 1.
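Eq (2.6) in numpy, for a single head without batching (a sketch, not the model code):

```python
import numpy as np

def softmax(x, axis=-1):
    # subtract the row max for numerical stability
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V.
    Q: (n_q, d_k), K: (n_k, d_k), V: (n_k, d_v)."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    return softmax(scores, axis=-1) @ V
```

Each output row is a convex combination of the value rows, weighted by how strongly the corresponding query matches each key.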
After the decoder extracts the interaction features, we obtain the interaction feature matrices $D\in\mathbb{R}^{b\times n_1\times d}$ and $P\in\mathbb{R}^{b\times n_2\times d}$ for the drugs and proteins, where $b$ is the batch size, $n_1$ and $n_2$ are the numbers of words and $d$ is the feature dimension. We average each feature matrix over the sequence dimension:
$$D_a=\mathrm{mean}(D,1) \tag{2.7}$$
$$P_a=\mathrm{mean}(P,1) \tag{2.8}$$
where $\mathrm{mean}(input,dim)$ returns the mean value of each row along the given dimension $dim$. The resulting feature vectors $D_a$ and $P_a$ are concatenated and fed to the classification block.
The classification module consists of a multilayer fully connected neural network with ReLU activations, whose final layer outputs the interaction probability $\hat{y}$. As this is a binary classification task, we use the binary cross-entropy loss to train Mutual-DTI:
$$Loss=-\left[y\log\hat{y}+(1-y)\log(1-\hat{y})\right] \tag{2.9}$$
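The loss of Eq (2.9), averaged over a batch, can be sketched as follows (the `eps` clipping is an added numerical guard against log(0), not part of the paper's formula):

```python
import numpy as np

def bce_loss(y_true, y_hat, eps=1e-12):
    """Binary cross-entropy, averaged over a batch.
    y_true: ground-truth labels in {0, 1}; y_hat: predicted probabilities."""
    y_hat = np.clip(y_hat, eps, 1.0 - eps)  # guard against log(0)
    return float(np.mean(-(y_true * np.log(y_hat) + (1 - y_true) * np.log(1 - y_hat))))
```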
Mutual-DTI was implemented with PyTorch 1.10.0. The original Transformer model had 6 layers and 512 hidden dimensions; we reduced the number of layers from 6 to 3, the hidden dimension from 512 to 10, set the dimensions of the protein representation, atom representation, hidden layers and interaction vector to 10 and the number of attention heads to 2, as this configuration achieves good generalization. During training, we used the Adam optimizer [33] with the learning rate set to 0.005 and the batch size set to 128. All settings and hyperparameters of Mutual-DTI are shown in Table 2.
Table 2. Settings and hyperparameters of Mutual-DTI.

| Name | Value |
| --- | --- |
| Dimension of atom representation | 10 |
| Dimension of protein representation | 10 |
| Number of decoder layers | 3 |
| Number of hidden layers | 10 |
| Number of attention heads | 2 |
| Learning rate | 5e-3 |
| Weight decay | 1e-6 |
| Dropout | 0.1 |
| Batch size | 128 |
The human dataset and the C. elegans dataset were created by Liu et al. [34]. These two datasets comprise compound-protein pairs, including highly credible negative samples as well as positive samples. The human dataset contains 3369 positive interactions between 1052 unique compounds and 852 unique proteins, while the C. elegans dataset contains 4000 positive interactions between 1434 unique compounds and 2504 unique proteins, as shown in Table 3. We randomly divided each dataset into training, validation and test sets in the ratio 8:1:1. In addition, we used AUC, precision and recall as evaluation metrics for Mutual-DTI and compared it with several traditional machine learning methods on both the human and C. elegans datasets: k-NN, RF, L2-logistic (L2) and SVM, whose results are taken from the original paper [34]. The main results are shown in Figure 2. Mutual-DTI outperforms the machine learning methods on both benchmark datasets.
Table 3. Statistics of the datasets.

| Dataset | Drugs | Proteins | Interactions | Positive | Negative |
| --- | --- | --- | --- | --- | --- |
| Human | 1052 | 852 | 6728 | 3369 | 3359 |
| C.elegans | 1434 | 2504 | 7786 | 4000 | 3786 |
| GPCR | 5359 | 356 | 15343 | 7989 | 7354 |
| Davis | 68 | 379 | 25772 | 7320 | 18452 |
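The 8:1:1 random split described above can be sketched as follows; `pairs` stands for the list of compound-protein samples, and the explicit `seed` argument is an illustrative addition for reproducibility:

```python
import random

def split_dataset(pairs, ratios=(0.8, 0.1, 0.1), seed=0):
    """Randomly split compound-protein pairs into train/valid/test sets (8:1:1)."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)  # shuffle a copy, leave the input intact
    n = len(pairs)
    n_train = int(ratios[0] * n)
    n_valid = int(ratios[1] * n)
    return (pairs[:n_train],
            pairs[n_train:n_train + n_valid],
            pairs[n_train + n_valid:])
```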
In the other experiments, we compared the proposed method with recent deep learning methods for DTI prediction: GNN-CPI [22], GNN-PT [27] and TransformerCPI [28]. The main hyperparameters are set as follows:
GNN-CPI: vector dimensionality of vertices, edges and n-grams = 10, numbers of layers in gnn = 3, window size = 11, numbers of layers in cnn = 2, numbers of layers in output = 3.
GNN-PT: numbers of layers in gnn = 3, numbers of layers in output = 1, heads of attention = 2.
TransformerCPI: dimension of atom = 34, dimension of protein = 100, dimension of hidden = 64, number of hidden layers = 3, heads of attention = 8.
All parameter settings follow the original papers. We used the same preprocessing for the initial drug and protein data and preprocessed the dataset in the same way as in the previous experiments, repeating each experiment three times with different random seeds. For each repetition, we randomly split the dataset and saved the model parameters corresponding to the optimal validation-set AUC for each test set. The main results on the human and C. elegans datasets are shown in Tables 4 and 5. On the human dataset, the average AUC, precision and recall of Mutual-DTI are 0.984, 0.962 and 0.943 respectively, outperforming the other methods. On the C. elegans dataset, the average AUC, precision and recall of Mutual-DTI are 0.987, 0.948 and 0.949 respectively, mostly outperforming the other models. The results suggest that Mutual-DTI can effectively learn informative features for predicting interactions from both one-dimensional protein sequences and two-dimensional molecular graphs, demonstrating its generalizability across datasets.
Table 4. Comparison results on the human dataset.

| Methods | AUC | Precision | Recall |
| --- | --- | --- | --- |
| GNN-CPI | 0.917 ± 0.072 | 0.783 ± 0.061 | 0.889 ± 0.096 |
| GNN-PT | 0.978 ± 0.006 | 0.939 ± 0.010 | 0.934 ± 0.006 |
| TransformerCPI | 0.972 ± 0.005 | 0.938 ± 0.018 | 0.932 ± 0.001 |
| Mutual-DTI | 0.984 ± 0.001 | 0.962 ± 0.019 | 0.943 ± 0.016 |
Table 5. Comparison results on the C. elegans dataset.

| Methods | AUC | Precision | Recall |
| --- | --- | --- | --- |
| GNN-CPI | 0.899 ± 0.104 | 0.850 ± 0.132 | 0.778 ± 0.192 |
| GNN-PT | 0.984 ± 0.007 | 0.940 ± 0.024 | 0.933 ± 0.014 |
| TransformerCPI | 0.984 ± 0.004 | 0.943 ± 0.025 | 0.951 ± 0.016 |
| Mutual-DTI | 0.987 ± 0.004 | 0.948 ± 0.018 | 0.949 ± 0.013 |
To evaluate the importance of the mutual feature module, we compare two sub-models: no-mutual-DTI, which removes the mutual feature module, and the full network that retains it. To make the experimental results more rigorous, we evaluated on the more strictly partitioned GPCR dataset [28], shown in Table 3. The key to constructing this dataset is that each drug in the training set appears in only one class of samples (positive or negative DTI pairs), while in the test set it appears only in the opposite class. This forces the model to use protein information to learn interaction patterns and make opposite predictions for the selected drugs, which is more realistic.
Table 6 shows the prediction performance of the two models on the GPCR dataset. Comparing the models with and without the mutual feature module shows that improvements can indeed be achieved using interaction features. This suggests the need to establish correlations between drug and protein information when extrapolating DTI predictions. We also conducted experiments on the Davis dataset, created by Zhao et al. [35], which contains 7320 positive and 18,452 negative interactions. As shown in Table 7, the model with the mutual feature module also performs better on this unbalanced dataset.
Table 6. Ablation results on the GPCR dataset.

| Methods | AUC | Precision | Recall |
| --- | --- | --- | --- |
| no-mutual-DTI | 0.810 ± 0.023 | 0.704 ± 0.014 | 0.768 ± 0.030 |
| Mutual-DTI | 0.820 ± 0.014 | 0.699 ± 0.010 | 0.796 ± 0.046 |
Table 7. Ablation results on the Davis dataset.

| Methods | AUC | Precision | Recall |
| --- | --- | --- | --- |
| no-mutual-DTI | 0.886 ± 0.005 | 0.728 ± 0.023 | 0.654 ± 0.005 |
| Mutual-DTI | 0.900 ± 0.002 | 0.767 ± 0.013 | 0.680 ± 0.027 |
In this section, we employed a three-dimensional surface plot to analyze the impact of two of the most important hyperparameters, the atom and protein embedding dimensions, on DTI prediction performance. We sampled the atom and protein dimensions from 10 to 40 in steps of 10; for instance, the atom/protein dimension pairs were 10/10, 10/20, 10/30, ..., 20/10, 20/20, ..., 40/30, 40/40, for a total of 16 settings, and experiments were conducted multiple times with different random seeds. Other settings were the same as in the previous experiments. As shown in Figure 3, the x-axis represents the atom dimension, the y-axis represents the protein dimension and the z-axis represents the AUC obtained on the test set. The results show that the surface is very smooth across the different dimension settings and the model exhibits good robustness.
To demonstrate that the mutual feature module not only enhances the performance of the model but also provides deeper interpretability, we conducted a case study. First, we applied the Frobenius norm to the protein feature matrix obtained from the Transformer decoder. Next, we used the softmax function to derive attention weights over the protein sequence, which were then mapped onto the 3D structure of the complex to visualize the regions that contribute most to the drug-protein reaction. The attention weights for the crystal structure of GW0385-bound HIV protease D545701 (PDB: 2FDD) are shown in Figure 4. The complex has a total of 12 binding sites. We marked in red the regions that received attention weights above 0.75, and 4 of the 12 binding sites received high attention scores, namely ASP-25, ALA-28, PRO-81 and ALA-82. The results show that Mutual-DTI helps to narrow the search space of binding sites.
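The attention-weight computation described above, a vector norm per residue followed by softmax, can be sketched as follows (our interpretation of the Frobenius-norm step, not the authors' exact code):

```python
import numpy as np

def residue_attention(P):
    """Derive per-residue attention weights from a protein feature matrix
    P of shape (n_residues, d): take the norm of each residue's feature
    vector, then normalize the norms with softmax."""
    norms = np.linalg.norm(P, axis=1)   # one scalar score per residue
    e = np.exp(norms - norms.max())     # stable softmax
    return e / e.sum()
```

Residues whose feature vectors have large magnitude receive high weights, which is what gets painted onto the 3D structure.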
Based on our robustness experiments, it is evident that the model's AUC varies only slightly when atoms and proteins are embedded in different dimensions. Our inference is that the GNN module, which learns drug-molecule features, and the gated convolutional module, which learns protein features, can both effectively extract feature information in Mutual-DTI. This indicates that constructing subgraphs of drug molecules via local breadth-first search and constructing protein words via the n-gram method is reasonable. In the ablation experiments, we found that removing the module that extracts mutual reaction features significantly decreased the model's prediction accuracy. Our speculation is that, without it, the model only learns each sequence's individual features, while DTI is a dynamic process. The mutual learning module treats the drug and protein features as two main bodies that dynamically attend to each other's key parts in the learning layers, directly capturing the interaction features of the two sequences. By learning these interaction features, the model gains a deeper understanding of the DTI process and can more easily capture the crucial parts that contribute to the reaction when handling unknown drugs and proteins, leading to superior prediction performance.
From the perspective of model complexity, Mutual-DTI is more complex than networks that only use a GNN (e.g., GNN-CPI), because we use a self-attention mechanism, which captures long-range dependencies between tokens in the sequences. Compared to TransformerCPI, which is also Transformer-based, Mutual-DTI has lower complexity: although Mutual-DTI uses two parallel multi-head attention layers, the number of attention heads is reduced from 8 to 2 and we use a smaller hidden dimension, which notably reduces the number of parameters. These design choices help Mutual-DTI fit the training data well while avoiding overfitting due to excessive complexity.
In this paper, we presented a Transformer-based network model for predicting DTI and introduced a module for extracting sequence interaction features to model the complex reaction processes between atoms and amino acids. To validate the effectiveness of Mutual-DTI, we compared it with recent baselines on two benchmark datasets; the results show that Mutual-DTI outperforms them. We also evaluated Mutual-DTI on the label-reversal dataset and observed a significant improvement from introducing the mutual feature module. Finally, we mapped the attention weights obtained by the mutual feature module onto the protein sequences, which helps interpret the model and judge the reliability of its predictions.
Although Mutual-DTI performs well in predicting DTI, there is still room for improvement. The experimental results show a significant decrease in performance on the strictly constructed label-reversal dataset compared with the human and C. elegans datasets. This suggests that sequence-based feature extraction is still limited, and adding 3D representations of molecules or proteins may help extract more information.
The work was supported by Open Funding Project of the State Key Laboratory of Biocatalysis and Enzyme Engineering under grant No. SKLBEE2021020 and SKLBEE2020020, the High-level Talents Fund of Hubei University of Technology under grant No. GCRC2020016, Key Laboratory of Brain Machine Collaborative Intelligence of Zhejiang Province under grant No. 2020E10010-02 and Natural Science Foundation of Hubei Province under grant No. 2021CFB282.
All authors declare no conflicts of interest in this paper.
| Feature | Representation |
| --- | --- |
| Atom type | C, N, O, S, F, P, Cl, Br, B, H (one-hot) |
| Degree of atom | 0, 1, 2, 3, 4, 5 (one-hot) |
| Number of attached hydrogens | 0, 1, 2, 3, 4 (one-hot) |
| Implicit valence electrons | 0, 1, 2, 3, 4, 5 (one-hot) |
| Aromaticity | 0 or 1 |
| Name | Value |
| --- | --- |
| Dimension of atom representation | 10 |
| Dimension of protein representation | 10 |
| Number of decoder layers | 3 |
| Number of hidden layers | 10 |
| Number of attention heads | 2 |
| Learning rate | 5e-3 |
| Weight decay | 1e-6 |
| Dropout | 0.1 |
| Batch size | 128 |
| Dataset | Drugs | Proteins | Interactions | Positive | Negative |
| --- | --- | --- | --- | --- | --- |
| Human | 1052 | 852 | 6728 | 3369 | 3359 |
| C. elegans | 1434 | 2504 | 7786 | 4000 | 3786 |
| GPCR | 5359 | 356 | 15343 | 7989 | 7354 |
| Davis | 68 | 379 | 25772 | 7320 | 18452 |
| Methods | AUC | Precision | Recall |
| --- | --- | --- | --- |
| GNN-CPI | 0.917 ± 0.072 | 0.783 ± 0.061 | 0.889 ± 0.096 |
| GNN-PT | 0.978 ± 0.006 | 0.939 ± 0.010 | 0.934 ± 0.006 |
| TransformerCPI | 0.972 ± 0.005 | 0.938 ± 0.018 | 0.932 ± 0.001 |
| Mutual-DTI | 0.984 ± 0.001 | 0.962 ± 0.019 | 0.943 ± 0.016 |
| Methods | AUC | Precision | Recall |
| --- | --- | --- | --- |
| GNN-CPI | 0.899 ± 0.104 | 0.850 ± 0.132 | 0.778 ± 0.192 |
| GNN-PT | 0.984 ± 0.007 | 0.940 ± 0.024 | 0.933 ± 0.014 |
| TransformerCPI | 0.984 ± 0.004 | 0.943 ± 0.025 | 0.951 ± 0.016 |
| Mutual-DTI | 0.987 ± 0.004 | 0.948 ± 0.018 | 0.949 ± 0.013 |
| Methods | AUC | Precision | Recall |
| --- | --- | --- | --- |
| no-mutual-DTI | 0.810 ± 0.023 | 0.704 ± 0.014 | 0.768 ± 0.030 |
| Mutual-DTI | 0.820 ± 0.014 | 0.699 ± 0.010 | 0.796 ± 0.046 |
| Methods | AUC | Precision | Recall |
| --- | --- | --- | --- |
| no-mutual-DTI | 0.886 ± 0.005 | 0.728 ± 0.023 | 0.654 ± 0.005 |
| Mutual-DTI | 0.900 ± 0.002 | 0.767 ± 0.013 | 0.680 ± 0.027 |