Research article Special Issues

Drug-target binding affinity prediction method based on a deep graph neural network


  • Received: 01 August 2022 Revised: 06 September 2022 Accepted: 07 September 2022 Published: 30 September 2022
  • The development of new drugs is a long and costly process. Computer-aided drug design reduces development costs and shortens the new drug development cycle, and drug-target binding affinity (DTA) prediction is a key step in screening out potential drugs. With the development of deep learning, various types of deep learning models have achieved notable performance in a wide range of fields. Most current related studies focus on extracting the sequence features of molecules while ignoring valuable structural information: they use sequence data that represent only the elemental composition of molecules and do not consider the molecular structure maps that contain structural information. In this paper, we use graph neural networks to predict DTA from the corresponding graph data of drugs and proteins, and we achieve competitive performance on two benchmark datasets, Davis and KIBA. In particular, we obtained a mean squared error (MSE) of 0.227 and a concordance index (CI) of 0.895 on Davis, and an MSE of 0.127 and a CI of 0.903 on KIBA.
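
The two metrics reported in the abstract can be illustrated with a short sketch. It assumes the pairwise definition of the concordance index that is common in the DTA literature (pairs with equal true affinity are skipped, tied predictions count as 0.5); the paper's exact definition is not given on this page, and the function names and sample values below are hypothetical.

```python
from itertools import combinations


def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted affinities."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    Assumed convention: pairs with identical true values are not comparable,
    and a tie between the two predictions contributes 0.5.
    """
    score = 0.0
    comparable = 0
    for (t_i, p_i), (t_j, p_j) in combinations(zip(y_true, y_pred), 2):
        if t_i == t_j:
            continue  # no true ordering to agree or disagree with
        comparable += 1
        agreement = (t_i - t_j) * (p_i - p_j)
        if agreement > 0:
            score += 1.0  # predicted order matches the true order
        elif agreement == 0:
            score += 0.5  # predictions are tied
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return score / comparable


# Hypothetical affinities, for illustration only (not data from the paper).
y_true = [5.0, 6.2, 7.1, 8.4]
y_pred = [5.3, 6.0, 7.5, 7.5]
mse = mean_squared_error(y_true, y_pred)
ci = concordance_index(y_true, y_pred)
```

A lower MSE means predictions are closer to the measured affinities, while a CI closer to 1 means the model ranks drug-target pairs in nearly the same order as the measurements.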

    Citation: Dong Ma, Shuang Li, Zhihua Chen. Drug-target binding affinity prediction method based on a deep graph neural network[J]. Mathematical Biosciences and Engineering, 2023, 20(1): 269-282. doi: 10.3934/mbe.2023012



  • © 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)