Research article

RLF-LPI: An ensemble learning framework using sequence information for predicting lncRNA-protein interaction based on AE-ResLSTM and fuzzy decision


  • Received: 27 November 2021 Revised: 28 February 2022 Accepted: 02 March 2022 Published: 11 March 2022
  • Long non-coding RNAs (lncRNAs) play a regulatory role in many biological cells, and identifying lncRNA-protein interactions helps reveal the functional mechanisms of lncRNAs. Identifying these interactions with biological techniques is costly and time-consuming. Here, we propose RLF-LPI, an ensemble learning framework for predicting lncRNA-protein interactions. In RLF-LPI, a residual LSTM autoencoder module with a fused attention mechanism extracts latent feature representations and, using the k-mer method, captures the dependencies between sequences and structures. Finally, the relationship between lncRNA and protein is learned by fuzzy decision. Experimental results show that RLF-LPI achieves an accuracy (ACC) of 0.912 on the ATH948 dataset and 0.921 on the ZEA22133 dataset, outperforming the other methods compared in predicting lncRNA-protein interactions.
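
    The k-mer method mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the value of k, the alphabet, or the normalization used in RLF-LPI, so the function name `kmer_frequencies`, the choice k = 3, the alphabet `ACGU`, and the frequency normalization below are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=3, alphabet="ACGU"):
    """Return normalized k-mer frequencies in a fixed order over the alphabet.

    Hypothetical helper: k, alphabet, and normalization are assumptions,
    not settings taken from the RLF-LPI paper.
    """
    # Treat DNA-style input as RNA so that T and U are counted together.
    sequence = sequence.upper().replace("T", "U")
    # Enumerate all possible k-mers so every sequence maps to the same vector length.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    # Slide a window of width k across the sequence and count each substring.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Only k-mers made of alphabet symbols contribute to the total.
    total = sum(counts[km] for km in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[km] / total for km in kmers]


features = kmer_frequencies("AUGGCUAUGG", k=3)
print(len(features))
```

    A fixed-length vector like this is one common way to turn variable-length sequences into model input; the same idea applies to protein sequences with a different alphabet.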

    Citation: Jinmiao Song, Shengwei Tian, Long Yu, Qimeng Yang, Qiguo Dai, Yuanxu Wang, Weidong Wu, Xiaodong Duan. RLF-LPI: An ensemble learning framework using sequence information for predicting lncRNA-protein interaction based on AE-ResLSTM and fuzzy decision[J]. Mathematical Biosciences and Engineering, 2022, 19(5): 4749-4764. doi: 10.3934/mbe.2022222



  • © 2022 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
