
Citation: Pingping Sun, Yongbing Chen, Bo Liu, Yanxin Gao, Ye Han, Fei He, Jinchao Ji. DeepMRMP: A new predictor for multiple types of RNA modification sites using deep learning[J]. Mathematical Biosciences and Engineering, 2019, 16(6): 6231-6241. doi: 10.3934/mbe.2019310
Post-transcriptional modification of RNA plays a crucial role in a variety of cellular processes, such as RNA splicing, RNA degradation, protein translation, stability and immune tolerance [1]. Among the different types of RNA modifications, m1A modification is related to gene mutation and helps to maintain the stability of mitochondrial tRNA. Pseudouridine modification is critical to the stabilization of tRNA structure and to the spliceosomal RNA responsible for gene regulation [2,3,4,5]. M5C affects RNA structural stability and translation efficiency [6,7]. N6-methyladenosine (m6A) modification is involved in a variety of important biological processes, such as RNA localization and degradation, RNA structural dynamics, and cell differentiation and reprogramming [8,9,10,11]. The chemical structures of the m1A, pseudouridine and m5C modifications are shown in Figure 1 [3,12,13]. Because RNA modifications occur on specific nucleotides with functional-group changes, detecting RNA modification sites via biological experiments requires considerable time, money and effort. Among all RNA modification data, m6A and these three types are the most abundant.
As an alternative, computational tools for RNA modification site prediction have been published since 2016. They all combined handcrafted features from RNA sequence analysis with traditional machine learning methods. Chen et al. developed an m1A prediction tool called RAMPred based on RNA chemical properties (CPs) and support vector machines (SVM) [14]. Combining CPs, nucleotide chemical properties (NC) and SVM, Wei Chen et al. released a pseudouridine prediction tool called iRNA-PseU [15]. He et al. utilized dinucleotide composition (DC), NC, position-specific dinucleotide preferences (PSDP), position-specific nucleotide preferences (PSNP), pseudouridine synthase (PUS) information and SVM to build a pseudouridine prediction tool called PseUI in 2018 [16]. Qiu et al. presented an m5C prediction tool called iRNAm5C-PseDNC by integrating PseDNC, DC and random forest (RF) in 2017 [17]. Li et al. encoded RNA sequences as one-hot vectors and used RF for m5C prediction [18]. These predictors each focused on a single type of RNA modification site. Meanwhile, some researchers attempted to develop prediction tools for multiple types of modification sites. Feng et al. published the iRNA-PseColl tool to predict m6A, m1A and m5C methylation [19]. Chen et al. built a predictor for m6A, m1A and adenosine-to-inosine modifications using CP, NC and SVM in 2018 [20]. Such multi-type RNA modification site predictors can provide more comprehensive knowledge than single-type ones. In addition, the performance of traditional machine learning methods highly depends on the effectiveness of feature engineering. However, it is difficult to tell the most relevant feature combination for a specific RNA modification. Deep learning can skip handcrafted features and conduct end-to-end prediction.
Through multiple neuron layers and activation functions, a deep network acquires the capacity to map raw input to a latent representation, trained on labeled data. Under such a data-driven model, deep features of RNA sequences related to the semantic information of RNA modification sites emerge. Huang et al. proved that deep learning performs better in predicting m6A RNA modification sites [21]. As a preliminary attempt at deep learning for multiple types of RNA modification site prediction, we developed a model for predicting m1A, pseudouridine and m5C RNA modification sites. To the best of our knowledge, this is the first deep learning-based tool for multiple types of RNA modification site prediction. Thanks to the larger-scale data for the m6A type, some researchers have presented deep learning-based predictors to identify this type of RNA modification site and achieved excellent improvements. Yet the available data for the three types of RNA modification are still too small to build a deep network independently. In this work, we took advantage of large-scale m6A data to pretrain a deep learning model, and then employed a transfer learning strategy to fine-tune its network parameters for our targeted types of RNA modification. Finally, we built DeepMRMP, a multi-type RNA modification predictor for three species: H. sapiens, M. musculus and S. cerevisiae.
In this study, all positive samples were extracted from the RMBase v2.0 database [22], which contains ~1,373,000 N6-methyladenosine, ~5400 N1-methyladenosine, ~9600 pseudouridine and ~1000 5-methylcytosine modifications, plus other types, across 13 species [23]. We randomly retrieved m1A, pseudouridine and m5C data from three species, H. sapiens, M. musculus and S. cerevisiae, as positive samples, and 10 times that number of other RNA gene fragments as negative samples. The details of our experimental data are shown in Table 1.
Modification type | H. sapiens | M. musculus | S. cerevisiae | Total |
m1A | 2574 | 1052 | 1220 | 4819 |
pseudouridine | 4128 | 3320 | 2122 | 9570 |
m5C | 680 | 97 | 211 | 988 |
For fair comparison, we cut each RNA sequence into fragments of 41 nucleotides, the most widely adopted length in existing tools. Taking m1A as an example, each RNA fragment in these datasets can be represented as follows:
R = N1 N2 N3 ⋯ N20 X N22 ⋯ N39 N40 N41 | (1) |
in which the center X is the targeted site, i.e., A (adenine) for m1A, U (uracil) for pseudouridine, and C (cytosine) for m5C, respectively. N1 to N20 represent the upstream flanking nucleotides of the target site, while N22 to N41 denote its downstream flanking nucleotides.
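The windowing in Eq. (1) can be sketched as a short Python routine (a minimal illustration; the function name `extract_windows` and the toy sequence are ours, not part of DeepMRMP):

```python
def extract_windows(seq, target_base="A", flank=20):
    """Extract fixed-length fragments centered on every candidate site.

    Returns (position, fragment) pairs for each occurrence of target_base
    that has at least `flank` nucleotides on both sides, giving fragments
    of length 2 * flank + 1 (41 nt for flank = 20, as in Eq. (1)).
    """
    windows = []
    for i, base in enumerate(seq):
        if base == target_base and flank <= i <= len(seq) - flank - 1:
            windows.append((i, seq[i - flank:i + flank + 1]))
    return windows

# Toy RNA sequence with a single adenine (candidate m1A site) at index 30
seq = "GCU" * 10 + "A" + "UCG" * 10
sites = extract_windows(seq, target_base="A", flank=20)
# One 41-nt fragment, with the target A at its center (index 20)
```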
In order to validate the generalization of our model, we divided our dataset into 10 folds by random selection. Each fold included training, validation and testing sets at a ratio of 3:1:1. Furthermore, to avoid over-estimation, each fold was processed by the CD-HIT-EST-2D tool to remove sequences with high similarity [24,25]. Here we adopted the most stringent threshold (0.8) supported by CD-HIT-EST-2D.
One-hot encoding is one of the most common and effective encoding schemes in sequence analysis [26,27,28]; it projects each sequence to a single vector in Euclidean space. In our work, each RNA sequence was encoded into a one-hot vector for subsequent GRU network modeling. In our encoding, each nucleotide in an RNA fragment is mapped to a four-dimensional vector: A = [1,0,0,0], C = [0,1,0,0], G = [0,0,1,0], U = [0,0,0,1].
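This encoding can be sketched in a few lines (our own illustration of the mapping above, using the A/C/G/U order stated in the text):

```python
import numpy as np

# One-hot vectors for the four nucleotides, in the order given in the text
NUC2VEC = {"A": [1, 0, 0, 0], "C": [0, 1, 0, 0],
           "G": [0, 0, 1, 0], "U": [0, 0, 0, 1]}

def one_hot_encode(fragment):
    """Encode an RNA fragment as a (length, 4) one-hot matrix."""
    return np.array([NUC2VEC[n] for n in fragment], dtype=np.float32)

x = one_hot_encode("GAUC")
# x has shape (4, 4); each row is the one-hot vector of one nucleotide
```

A 41-nt fragment thus becomes a 41 x 4 matrix, the input shape consumed by the recurrent layers described next.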
The recurrent neural network (RNN) is a deep architecture capable of memorizing contextual information, which makes it well suited to biological sequence analysis [29,30]. The gated recurrent unit (GRU), a lightweight variant of the RNN, has shown its effectiveness in predicting m6A modification sites [21]. The bidirectional version of the GRU extracts embedding representations from sequences to capture potential motifs around the modification sites [31]. In our study, we stacked two bidirectional GRU (BGRU) layers with a unit size of 64. Following the BGRU layers, we added a dense layer with 64 units to fully connect all latent representations. The activation function of all hidden layers is ReLU, which generates sparse outputs and accelerates convergence [31]. The Adam optimizer with a learning rate of 5e-4 was employed during training [32]. Training stopped once the model remained stable for 20 consecutive epochs. The details of our deep network can be found in Table 2.
Layer | Hyper-parameters | ||
Activation function | units | Dropout | |
GRU | ReLU | 64 | 0.2 |
GRU | ReLU | 64 | 0.2 |
Dense | ReLU | 64 | 0.2 |
Dense | Softmax | 2 | 0 |
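The gating mechanism inside each GRU layer, and the bidirectional pass over a 41 x 4 one-hot input, can be sketched with a plain numpy forward step (an illustrative single-cell implementation with random weights, not the trained 64-unit network itself):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_cell(x, h, params):
    """One GRU time step: update gate z, reset gate r, candidate state."""
    Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh = params
    z = sigmoid(x @ Wz + h @ Uz + bz)               # update gate
    r = sigmoid(x @ Wr + h @ Ur + br)               # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh + bh)   # candidate state
    return (1.0 - z) * h + z * h_tilde              # interpolated new state

def bgru(sequence, params_fwd, params_bwd, units=64):
    """Bidirectional pass: run the cell left-to-right and right-to-left,
    then concatenate the two final hidden states."""
    h_f = np.zeros(units)
    for x in sequence:
        h_f = gru_cell(x, h_f, params_fwd)
    h_b = np.zeros(units)
    for x in sequence[::-1]:
        h_b = gru_cell(x, h_b, params_bwd)
    return np.concatenate([h_f, h_b])

def init_params(in_dim, units, rng):
    # Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh
    shapes = [(in_dim, units), (units, units), (units,)] * 3
    return [rng.standard_normal(s) * 0.1 for s in shapes]

rng = np.random.default_rng(0)
seq = rng.standard_normal((41, 4))   # stand-in for a 41 x 4 one-hot fragment
out = bgru(seq, init_params(4, 64, rng), init_params(4, 64, rng))
# out has shape (128,): 64 forward units concatenated with 64 backward units
```

In the actual model, two such bidirectional layers are stacked (the first returning the full output sequence) before the dense and softmax layers of Table 2.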
Large-scale data are required to learn the latent patterns when modeling a deep network [33]. In situations with relatively small data, transfer learning is a promising strategy to span the data gap. It delivers knowledge from a source domain to a target domain by relaxing the assumption that the training data and the test data must be independent and identically distributed [34,35]. We hypothesized that some potential motifs are shared across different types of RNA modification sites, so we chose the m6A data for pre-training to detect such general sequence motif patterns, and then fine-tuned the deep model on the m1A, pseudouridine and m5C methylation data for the corresponding predictors. When fine-tuning the pretrained model, we set the learning rate to 5e-5 but increased the patience parameter in the early-stopping operation. In doing so, we could make full use of the relatively small-scale m1A, pseudouridine and m5C data to generate their specific deep features on the basis of the general pretrained model.
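The pretrain-then-fine-tune recipe can be illustrated on a toy logistic-regression model (our own schematic, not the DeepMRMP network; the 10x learning-rate reduction mirrors the 5e-4 to 5e-5 setting described above):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train(X, y, w, lr, epochs):
    """Plain gradient descent on logistic loss, starting from weights w."""
    for _ in range(epochs):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w = w - lr * grad
    return w

rng = np.random.default_rng(1)
w_true = np.array([2.0, -1.0, 0.5, 1.5])   # hidden "shared motif" pattern

# Large source dataset (analogous to m6A): pre-train from scratch
Xs = rng.standard_normal((2000, 4))
ys = (Xs @ w_true > 0).astype(float)
w_pre = train(Xs, ys, np.zeros(4), lr=5e-1, epochs=200)

# Small target dataset (analogous to m1A/pseudouridine/m5C):
# fine-tune from the pretrained weights with a 10x smaller learning rate
Xt = rng.standard_normal((100, 4))
yt = (Xt @ w_true > 0).astype(float)
w_fine = train(Xt, yt, w_pre, lr=5e-2, epochs=50)

acc = np.mean((sigmoid(Xt @ w_fine) > 0.5) == yt)
```

The key point carried over from the text: the target-domain training starts from the source-domain weights rather than from scratch, and takes smaller steps so the general pretrained features are adjusted rather than overwritten.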
In recent studies, four evaluation parameters, accuracy (Acc), sensitivity (Sn), specificity (Sp) and the Matthews correlation coefficient (MCC), have frequently been used to measure a predictor's quality. In this study we also used the ROC (receiver operating characteristic) curve, the PR (precision-recall) curve and the F1 score, which are less affected by unbalanced datasets, to evaluate the performance of predictors. The ROC curve reflects the overall relationship between sensitivity and specificity as the decision threshold varies. The PR curve and the F1 score reflect the overall relationship between precision and recall.
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Sn (Recall) = TP / (TP + FN)
Sp = TN / (FP + TN)
MCC = (TP × TN - FP × FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN))
Precision = TP / (TP + FP)
F1 score = 2 × Precision × Recall / (Precision + Recall) | (2) |
where TP, TN, FP and FN represent the number of true positive, true negative, false positive and false negative samples, respectively. The larger the area under the ROC and PR curves, the better the prediction performance.
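These measures can be computed directly from the confusion-matrix counts; a small sketch (the function name `metrics` and the example counts are ours):

```python
import math

def metrics(tp, tn, fp, fn):
    """Compute the evaluation measures of Eq. (2) from confusion counts."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    sn = tp / (tp + fn)                      # sensitivity / recall
    sp = tn / (tn + fp)                      # specificity
    precision = tp / (tp + fp)
    f1 = 2 * precision * sn / (precision + sn)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {"Acc": acc, "Sn": sn, "Sp": sp,
            "Precision": precision, "F1": f1, "MCC": mcc}

m = metrics(tp=80, tn=70, fp=30, fn=20)
# e.g. m["Acc"] == 0.75, m["Sn"] == 0.8, m["Sp"] == 0.7
```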
Moreover, we used independent datasets to measure the predictive performance of the predictor. The procedure is briefly described as follows. First, we trained our model on a previously partitioned pair of training and validation sets. This process was repeated 10 times, with each of the 10 subsets used exactly once as the validation data. Finally, the 10 results were averaged to obtain the final prediction estimate.
To measure the effectiveness of the underlying transfer learning, we compared the performance with and without it. For a fair comparison, all classifiers were run under equal conditions, using the same dataset and feature extraction method. The results are presented in Figure 2. As shown in Figure 2, performance improved significantly when transfer learning was used; therefore, transfer learning was adopted in our predictive model.
WebLogo is a commonly used sequence feature analysis tool [36,37]. Each logo consists of stacks of symbols, one stack for each position in the sequence. The overall height of a stack indicates the sequence conservation at that position, while the height of the symbols within the stack indicates the relative frequency of each amino acid or nucleotide at that position.
The modification sites of m6A, m1A, pseudouridine and m5C are at A, A, U and C, respectively. When drawing the logos, the fixed central position was removed to magnify the surrounding features. After truncation, positions 1–10 on the x-axis correspond to the sequence before the modification site, and positions 11–20 to the sequence after it. As seen in Figure 3, the characteristics of the m6A data used in pre-training are obvious at the ninth to twelfth positions. The features of the m1A and pseudouridine datasets are also concentrated at the ninth to twelfth positions, which is consistent with the assumption behind the transfer learning algorithm. For the case of m5C, whose WebLogo map is distributed uniformly, we offer two possible explanations: (1) the motifs of m5C are relatively dispersed; (2) 988 m5C samples are insufficient to extract the features of m5C. After experimentation, we found that the AUC of the m5C model increased by 0.16 after using transfer learning. This result suggests that the features of m5C are similar to those of the other three modifications, and that they are simply difficult to identify from 988 m5C samples using WebLogo.
To measure the effectiveness of the underlying GRU network, we compared its performance with two other commonly used deep learning architectures: a CNN network and a hybrid network combining CNN and GRU. For a fair comparison, all classifiers were run under equal conditions, using the same dataset and feature extraction method. The performances of the three algorithms are presented in Figure 4.
In order to compare the ROC and PR curves of the three RNA modification prediction models more clearly, we enlarged the plot for the m1A data. As shown in Figure 4, all three models perform well on the m1A dataset (AUC = 0.99 and AUPRC = 0.99). The GRU model achieved better ROC and PR curves than the CNN model and the CNN-GRU hybrid model. On the m5C dataset, the CNN model outperformed the GRU and CNN-GRU hybrid models: the AUC and AUPRC of the CNN, GRU and CNN-GRU networks were 0.79, 0.73, 0.71 and 0.75, 0.72, 0.71, respectively. CNN works better with fewer samples, but when samples are sufficient the GRU performs better. According to our analysis, CNN networks perform better on small samples because of their simpler structure, whereas GRU networks, with their many memory units, need more samples to reach full performance; the CNN-GRU hybrid model requires the most samples of all. With the accumulation of m5C samples, our model will become more and more reliable.
In order to further prove its superiority, the predictive results of the proposed method were also compared with those of the classifiers released in 2018, i.e., iRNA-3typeA, PseUI and RNAm5Cfinder. Table 3 shows the performance of our tool and of the three tools above on the same independent test set.
Type | Tool | Acc | Precision | Recall | Sp | F1 score | MCC |
m1A | iRNA-3typeA [20] | 0.5119 | 0.5060 | 0.9979 | 0.0258 | 0.6715 | 0.1012 |
 | DeepMRMP | 0.9927 | 0.9887 | 0.9969 | 0.9886 | 0.9928 | 0.9856 |
pseudouridine | PseUI [16] | 0.6018 | 0.5989 | 0.6165 | 0.5872 | 0.6076 | 0.2038 |
 | DeepMRMP | 0.6264 | 0.6675 | 0.5036 | 0.7492 | 0.5741 | 0.2608 |
m5C | RNAm5Cfinder [18] | 0.6326 | 0.7954 | 0.3571 | 0.9081 | 0.4929 | 0.3179 |
 | DeepMRMP | 0.6632 | 0.7580 | 0.4795 | 0.8469 | 0.5874 | 0.3510 |
As seen in Table 3, of the two m1A prediction tools, DeepMRMP outperforms the m1A predictor in iRNA-3typeA. Specifically, the Acc, precision, recall, Sp, F1 score and MCC of DeepMRMP are 0.9927, 0.9887, 0.9969, 0.9886, 0.9928 and 0.9856, respectively, all higher than those of the m1A predictor in iRNA-3typeA. Compared with PseUI, our model improved Acc by 0.0246, precision by 0.0686, Sp by 0.1620 and MCC by 0.057 on the independent test sets. Our model also surpasses RNAm5Cfinder on four metrics (Acc, recall, F1 score and MCC). Higher F1 score and MCC indicate a better model overall, and higher recall means our predictions capture more of the positive samples. Our model thus allows researchers to pre-screen the most likely candidates before biological experiments, saving manpower and material resources.
In this study, we proposed a model, DeepMRMP, for accurately and efficiently identifying m1A, pseudouridine and m5C sites in RNA sequences. We compared DeepMRMP with the latest m1A, pseudouridine and m5C site prediction models using independent tests. The results showed that our predictor has stronger robustness and generalization than the other predictors. Further comparative experiments indicated that this outperformance likely benefits from our deep network and transfer learning strategy. We believe DeepMRMP has great potential, and as more data become available, its performance could be further improved. The source code of DeepMRMP is available at https://github.com/Chenyb939/DeepMRMP.
This work was partially supported by the National Natural Science Foundation of China (Grant No. 61802057), the "13th Five-Year" science and technology research project of the Education Department of Jilin Province (Grant No. JJKH20190290KJ), the scientific research foundation of Jilin Agricultural University, and the China Scholarship Council (to Fei He).
The authors declare no conflict of interest.
[1] | S. Dunin-Horkawicz, A. Czerwoniec, M. J. Gajda, et al., MODOMICS: A database of RNA modification pathways, Nucleic Acids Res., 34(2006), D145–D149. |
[2] | J. H. Ge and Y. T. Yu, RNA pseudouridylation: New insights into an old modification, Trends Biochem. Sci., 38(2013), 210–218. |
[3] | M. Charette and M. W. Gray, Pseudouridine in RNA: what, where, how, and why, IUBMB Life, 49(2010), 341–351. |
[4] | D. R. Davis, C. A. Veltri, L. Nielsen, et al., An RNA model system for investigation of pseudouridine stabilization of the codon-anticodon interaction in tRNALys, tRNAHis and tRNATyr, J. Biomol. Struct. Dyn., 15(1998), 1121–1132. |
[5] | A. Basak and C. Query, A pseudouridine residue in the spliceosome core is part of the filamentous growth program in yeast, Cell Reports, 8(2014), 966–973. |
[6] | X. Yang, Y. Yang, B. F. Sun, et al., 5-methylcytosine promotes mRNA export-NSUN2 as the methyltransferase and ALYREF as an m5C reader, Cell Res., 27(2017), 606–625. |
[7] | M. Frye and F. M. Watt, The RNA methyltransferase Misu (NSun2) mediates Myc-induced proliferation and is upregulated in tumors, Curr. Biol., 16(2006), 971–981. |
[8] | X. Wang, Z. Lu, A. Gomez, et al., N6-methyladenosine-dependent regulation of messenger RNA stability, Nature, 505(2014), 117–120. |
[9] | C. Roost, S. R. Lynch, P. J. Batista, et al., Structure and thermodynamics of N6-methyladenosine in RNA: A spring-loaded base modification, J. Am. Chem. Soc., 137(2015), 2107–2115. |
[10] | T. Chen, Y. J. Hao, Y. Zhang, et al., m6A RNA methylation is regulated by micrornas and promotes reprogramming to pluripotency, Cell Stem Cell, 16(2015), 289–301. |
[11] | S. Geula, S. Moshitch-Moshkovitz, D. Dominissini, et al., m6A mRNA methylation facilitates resolution of naive pluripotency toward differentiation, Science, 347(2015), 1002–1006. |
[12] | X. Li, X. Xiong, K. Wang, et al., Transcriptome-wide mapping reveals reversible and dynamic N1-methyladenosine methylome, Nat. Chem. Biol., 12(2016), 311. |
[13] | S. Nachtergaele and C. He, The emerging biology of RNA post-transcriptional modifications, RNA Biol., 14(2016), 156–163. |
[14] | W. Chen, P. M. Feng, H. Tang, et al., RAMPred: Identifying the N-1-methyladenosine sites in eukaryotic transcriptomes, Sci. Rep., 6(2016), 31080. |
[15] | W. Chen, H. Tang, J. Ye, et al., iRNA-PseU: Identifying RNA pseudouridine sites, Mol. Ther.-Nucl. Acids, 5(2016). |
[16] | J. J. He, T. Fang, Z. Z. Zhang, et al., PseUI: Pseudouridine sites identification based on RNA sequence information, BMC Bioinform., 19(2018), 306. |
[17] | W. R. Qiu, S. Y. Jiang, Z. C. Xu, et al., iRNAm5C-PseDNC: Identifying RNA 5-methylcytosine sites by incorporating physical-chemical properties into pseudo dinucleotide composition, Oncotarget, 8(2017), 41178–41188. |
[18] | J. W. Li, Y. Huang, X. Y. Yang, et al., RNAm5Cfinder: A web-server for predicting RNA 5-methylcytosine (m5C) sites based on random forest, Sci. Rep., 8(2018). |
[19] | P. M. Feng, H. Ding, H. Yang, et al., iRNA-PseColl: Identifying the occurrence sites of different RNA modifications by incorporating collective effects of nucleotides into PseKNC, Mol. Ther.-Nucl. Acids, 7(2017), 155–163. |
[20] | W. Chen, P. M. Feng, H. Yang, et al., iRNA-3typeA: Identifying three types of modification at RNA's adenosine sites, Mol. Ther.-Nucl. Acids, 11(2018), 468–474. |
[21] | Y. Huang, N. N. He, Y. Chen, et al., BERMP: A cross-species classifier for predicting m6A sites by integrating a deep learning algorithm and a random forest approach, Int. J. Biol. Sci., 14(2018), 1669–1677. |
[22] | J. J. Xuan, W. J. Sun, P. H. Lin, et al., RMBase v2.0: Deciphering the map of RNA modifications from epitranscriptome sequencing data, Nucleic Acids Res., 46(2018), D327–D334. |
[23] | D. Dominissini, S. Moshitch-Moshkovitz, S. Schwartz, et al., Topology of the human and mouse m6A RNA methylomes revealed by m6A-seq, Nature, 485(2012), U201–U284. |
[24] | L. Fu, B. Niu, Z. Zhu, et al., CD-HIT: Accelerated for clustering the next-generation sequencing data, Bioinformatics, 28(2012), 3150–3152. |
[25] | W. Z. Li and A. Godzik, Cd-hit: A fast program for clustering and comparing large sets of protein or nucleotide sequences, Bioinformatics, 22(2006), 1658–1659. |
[26] | L. Zhu, H. B. Zhang and D. S. Huang, Direct AUC optimization of regulatory motifs, Bioinformatics, 33(2017), i243. |
[27] | H. Zhang, L. Zhu and D. S. Huang, WSMD: Weakly-supervised motif discovery in transcription factor ChIP-seq data, Sci. Rep., 7(2017). |
[28] | G. H. Chuai, H. H. Ma, J. F. Yan, et al., DeepCRISPR: Optimized CRISPR guide RNA design by deep learning, Genome Biol., 19(2018). |
[29] | Q. Zhang, L. Zhu and D. S. Huang, High-order convolutional neural network architecture for predicting DNA-protein binding sites, IEEE/ACM Transact. Comput. Biol. Bioinform., (2018), 1. |
[30] | Q. Zhang, L. Zhu, W. Bao, et al., Weakly-supervised convolutional neural network architecture for predicting protein-DNA binding, IEEE/ACM Transact. Comput. Biol. Bioinform., (2018), 1. |
[31] | A. Krizhevsky, I. Sutskever and G. E. Hinton, ImageNet classification with deep convolutional neural networks, NIPS. Curran Assoc. Inc., (2012). |
[32] | D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980, (2014). |
[33] | C. Tan, F. Sun, K. Tao, et al., A survey on deep transfer learning, (2018). |
[34] | G. Litjens, T. Kooi, B. E. Bejnordi, et al., A survey on deep learning in medical image analysis, Med. Image Anal., 42(2017), 60–88. |
[35] | S. Liang, R. G. Zhang, D. Y. Liang, et al., Multimodal 3D denseNet for IDH genotype prediction in gliomas, Genes, 9(2018). |
[36] | L. Zhu, W. L. Guo, C. Lu, et al., Collaborative completion of transcription factor binding profiles via local sensitive unified embedding, IEEE Transact. NanoBiosci., (2016), 1. |
[37] | J. X. Wang, L. Chen, Y. Wang, et al., A computational systems biology study for understanding salt tolerance mechanism in rice, Plos One, 8(2013), 177–194. |