Research article

Evaluation of open Digital Elevation Models: estimation of topographic indices relevant to erosion risk in the Wadi M’Goun watershed, Morocco

  • Various global Digital Elevation Models (DEMs) are freely available on the Web. The main objective of this work is to evaluate the latest DEMs for estimating morphological and topographic erosion parameters in the Wadi M’Goun watershed. We evaluated four DEMs: SRTM (3-arcsec resolution, 90 m), ASTER GDEM (1-arcsec resolution, 30 m), SRTMGL1 V003 (30 m), and ALOS-PALSAR (12.5 m). Open-source GIS software was used for this purpose. To compare and evaluate each DEM, different processing methods were applied to estimate the Wadi M’Goun watershed characteristics, namely hypsometry, topographic slope, the Slope Length and Steepness factor (LS-factor), and the topographic wetness index. The accuracy of the ALOS-PALSAR and SRTMGL1 V003 (30 m) DEMs met the requirements for the morphometric parameters considered. DEM vertical accuracy was evaluated by applying the root mean square error (RMSE) metric to DEM elevations vs. actual heights of 353 sample points extracted from an accurate survey-based map (toposheet). The RMSE was 1718 mm for ALOS-PALSAR, 1736 mm for SRTM 1-arcsec, 1958 mm for ASTER GDEM 1-arcsec, and 3189 mm for SRTM 3-arcsec. These results indicate that the best accuracy is achieved with the high-resolution ALOS-PALSAR DEM. This study points to potential uncertainties in open-source DEMs, which should be taken into account when estimating topographical and morphometric parameters related to erosion risk in the Wadi M’Goun watershed.

    Citation: Maryam Khal, Abdellah Algouti, Ahmed Algouti, Nadia Akdim, Sergey A. Stankevich, Massimo Menenti. Evaluation of open Digital Elevation Models: estimation of topographic indices relevant to erosion risk in the Wadi M’Goun watershed, Morocco[J]. AIMS Geosciences, 2020, 6(2): 231-257. doi: 10.3934/geosci.2020014



    With the advent of technologies that generate large-scale, high-throughput data, a much clearer understanding of the genomic mechanisms behind gene regulation has been gained. Researchers have found that noncoding RNAs are unexpectedly far more numerous than protein-coding genes, and that these noncoding regions play important roles in determining the complexity observed in the human genome [1,2]. Within these noncoding regions, long noncoding RNAs (lncRNAs), which are functionally defined as noncoding regions of RNA that are at least 200 base pairs in length, have attracted considerable attention. Certain lncRNAs appear to act locally, while others have more distal regulatory effects, even acting across multiple chromosomes [3]. Many studies have identified specific functions of particular lncRNAs, including roles in embryonic mechanisms, cell cycle functions, innate immunity, and disease processes. However, thousands of lncRNAs still have no identified function [1,3,4,5,6]. Some studies have characterized the functions of a relatively small number of lncRNAs [7] and have shown that lncRNA function is highly cell-type-specific: one lncRNA may inhibit a particular gene in one type of cell while promoting the same gene in another. This phenomenon makes it even more difficult to identify lncRNA functions on a large scale. Because of this specificity, researchers propose that future lncRNA studies should be performed on specific cell types to identify particular regulatory mechanisms.

    One of the most prominent and intriguing applications of lncRNA regulatory investigation comes from cancer studies [8,9]. lncRNAs have been shown to be closely connected with numerous diseases, especially cancer. Because of the highly cell-type-specific nature of lncRNA regulatory functions and the irregularity of cancer cell genetic information, studying lncRNA regulation in specific cancer types may provide promising insight into the genomic regulation of common cancer cells. In a few documented cases, specific lncRNAs have been shown to be significantly differentially expressed in specific cancer types, such as prostate cancer and breast cancer [1]. For these reasons, it seems appropriate to further investigate lncRNA-gene interactions in particular cancer cells.

    The wealth of available gene expression datasets provides an opportunity to computationally identify co-expressed gene modules (CEMs), each of which is defined as a highly structured expression pattern on a specific gene set [10,11]. The genes in a CEM tend to be functionally related or co-regulated by the same transcriptional regulatory signals (e.g., transcription factors, lncRNAs and so on) under a specific condition or in a particular disease cell type. Overall, successful derivation of CEMs may grant a higher-level interpretation of large-scale gene expression data, improve functional annotation of condition-specific gene activities, facilitate inference of gene regulatory relationships, and hence provide a better mechanism-level understanding of complex diseases.

    The computational identification of CEMs can be addressed by biclustering [12], a two-dimensional data mining technique that simultaneously identifies co-expressed genes and the subset of conditions under which they are co-expressed. Biclustering has been reported to yield a high proportion of enriched biclusters on real datasets. In this study, we try to identify new lncRNA-gene interactions and transcription factor-lncRNA partnerships from cancer RNA-seq data using a biclustering approach. The biclustering method allows for the identification of particular expression patterns across multiple datasets, indicating networks of lncRNA and gene interactions. The method also provides a framework for future lncRNA interaction studies. We applied it to two sets of TCGA breast cancer RNA-seq data to generate CEMs based on known lncRNA-gene interactions. The predicted CEMs were then linked to lncRNAs by a statistical p-value, from which new lncRNA-gene relationships were generated. Evaluation of the predicted results showed that the pipeline can find some target genes for a given lncRNA, although its performance leaves room for improvement. We further conducted a TF motif analysis on the predicted CEMs, which suggests potential regulatory cooperation between TFs and lncRNAs. The original data, code, results and supplementary data can be downloaded from https://github.com/IvesG/sGavin.git.

    Two sets of TCGA (The Cancer Genome Atlas) breast cancer RNA-seq data, one from normal cells (referred to as normal data) and the other from tumor cells (referred to as tumor data), were downloaded from https://portal.gdc.cancer.gov/. The normal and tumor data consist of 113 and 1091 samples, respectively. Of the 113 normal samples, 112 come from patients who are also represented among the tumor samples. Both datasets contain 60,483 genes, among which there are 19,824 protein-coding genes and 7,399 long intergenic noncoding RNA (lincRNA) genes. The RNA-seq data are all upper-quartile normalized FPKM (UQ-FPKM) values.

    A total of 1,081 experimentally validated lncRNA-associated regulatory entries were downloaded from LncReg [13], describing the comprehensive regulatory relationships among 258 lncRNAs and 571 genes. All these relationships were manually collected from PubMed with focus on the data generated by laboratory methods, and can be categorized into up/down/active/inactive based on regulatory relationships or transcription/post-transcription/translation/post-translation based on regulatory mechanisms.

    As we focus on lncRNA-gene interactions, the relationships downloaded from LncReg were filtered to retain only those describing genes regulated by lncRNAs with specified species information (constrained to Homo sapiens and Mus musculus). This resulted in 925 relationships in total for the downstream analysis, covering interactions between 309 unique human genes and 103 human lncRNAs, as well as between 199 mouse genes and 100 mouse lncRNAs. These 925 relationships comprise 28 post-transcriptional regulations, 41 post-translational regulations, 714 transcriptional regulations, 23 translational regulations, 1 transcriptional & translational regulation and 118 unspecified relationships.

    As the table from LncReg [13] provides only gene symbols, while the RNA-seq dataset uses Ensembl IDs as gene identifiers, we used Ensembl BioMart [14] to match gene symbols with Ensembl IDs for all the genes and lncRNAs. We also used BioMart to obtain orthologous genes between mouse and human; we found orthologous human genes for all 199 mouse genes, 38 of which overlapped with the original human genes. For convenience, we denote human genes, mouse genes that do not overlap with human genes, human lncRNAs, and mouse lncRNAs that do not overlap with human lncRNAs as HG, MG, HL, and ML, respectively.
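    The grouping into HG, MG, HL and ML amounts to a pair of set differences. The following is a minimal sketch of that step, assuming the mouse genes and mouse lncRNAs have already been mapped to human identifiers; the function name `partition_ids` and the toy identifiers are hypothetical and are not taken from the authors' code.

```python
def partition_ids(human_genes, mouse_orthologs, human_lncrnas, mouse_lncrnas):
    """Split identifiers into the HG, MG, HL and ML groups described in the text.

    Assumption: all inputs are already expressed as human identifiers,
    so overlap can be tested by simple set membership.
    """
    hg = set(human_genes)
    mg = set(mouse_orthologs) - hg   # mouse-derived genes not already in HG
    hl = set(human_lncrnas)
    ml = set(mouse_lncrnas) - hl     # mouse-derived lncRNAs not already in HL
    return hg, mg, hl, ml


# Toy identifiers, made up for illustration (not real Ensembl IDs)
hg, mg, hl, ml = partition_ids(
    ["G1", "G2", "G3"], ["G3", "G4"], ["L1", "L2"], ["L2", "L3"]
)
print(sorted(mg), sorted(ml))
```

    Keeping the four groups disjoint in this way means each gene or lncRNA contributes a single row to the expression matrix.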

    We combined the normal and tumor RNA-seq datasets, then extracted expression values for all the HG, MG, HL and ML entries, the remaining protein-coding genes (PC, i.e. protein-coding genes other than HG and MG) and the remaining lincRNAs (linc, i.e. lincRNAs other than HL and ML). Taking the genes as rows and the conditions as columns, we obtained the RNA-seq expression matrix on which biclustering was performed to detect CEMs.

    QUBIC is a biclustering analysis tool designed for co-expression analysis of genes based on their expression patterns under multiple conditions. The software can generally identify all statistically significant groups of genes, or biclusters, with similar expression patterns under at least a specified number of experimental conditions, and it tends to be more sensitive and more specific than other biclustering tools [15]. We used the quantile-based discretization method of QUBIC to generate a qualitative representing matrix for the RNA-seq expression matrix. We then extracted from this representing matrix the rows of the known lncRNA-regulated HG and MG genes as seed 1, and the HG, MG, HL, and ML rows as seed 2. Biclustering analysis was then performed on each of these two seeds to predict co-expressed gene modules (CEMs) in the qualitative representing matrix.
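    To make the idea of a qualitative representing matrix concrete, the sketch below discretizes each gene (row) by its own quantiles into -1 (low), 0 (unchanged) and +1 (high). This is a simplified stand-in written for illustration only: it is not QUBIC's actual discretization rule, and the function names and the tail fraction `q = 0.25` are assumptions.

```python
def discretize_row(values, q=0.25):
    """Map each value to -1 (low), 0 (unchanged) or +1 (high) using row quantiles.

    Simplified illustration; not QUBIC's exact discretization procedure.
    """
    ordered = sorted(values)
    n = len(ordered)
    k = max(1, int(n * q))        # number of values assigned to each tail
    low_cut = ordered[k - 1]      # largest value in the lower tail
    high_cut = ordered[n - k]     # smallest value in the upper tail
    out = []
    for v in values:
        if v <= low_cut and v < high_cut:
            out.append(-1)
        elif v >= high_cut and v > low_cut:
            out.append(1)
        else:
            out.append(0)         # middle of the distribution, or a constant row
    return out


def discretize_matrix(matrix, q=0.25):
    """Apply the row-wise discretization to a genes x conditions matrix."""
    return [discretize_row(row, q) for row in matrix]


toy = [[1, 2, 3, 4, 5, 6, 7, 8], [5, 5, 5, 5, 5, 5, 5, 5]]
print(discretize_matrix(toy))
```

    Working on the discretized matrix rather than raw UQ-FPKM values makes the search for shared patterns less sensitive to differences in expression scale between genes.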

    For each identified CEM, we calculated the P-value of the bicluster being enriched with genes regulated by a given lncRNA using the hypergeometric distribution [16],

    $$P = \sum_{i=r}^{\min(n,K)} \frac{\binom{K}{i}\binom{N-K}{n-i}}{\binom{N}{n}},$$

    where r is the number of genes in a CEM (of size n) that are regulated by a certain lncRNA, N is the total number of known lncRNA-regulated genes in the whole genome, and K is the number of genes regulated by that lncRNA in the whole genome.

    We assumed that, if the known target genes of a given lncRNA are highly covered by a CEM with a significant P-value, the other genes in this CEM are likely to be regulated by the same lncRNA. Thus, we took the smallest P-value over all possible lncRNAs as the P-value of the current bicluster, and predicted relationships between the corresponding lncRNA and the genes in the bicluster.
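    The enrichment test and the choice of the smallest P-value can be sketched as follows. This is an illustrative implementation of a standard upper-tail hypergeometric test using the symbols r, n, K and N defined above; the function names `enrichment_p_value` and `best_lncrna` and the toy data are hypothetical, and the authors' own code may differ.

```python
from math import comb


def enrichment_p_value(r, n, K, N):
    """Upper-tail hypergeometric probability P(X >= r).

    r: genes in the CEM regulated by the lncRNA; n: CEM size;
    K: genes regulated by the lncRNA; N: all known lncRNA-regulated genes.
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(r, upper + 1))
    return tail / comb(N, n)


def best_lncrna(cem_genes, targets_by_lncrna, N):
    """Return (lncRNA, P-value) with the smallest enrichment P-value for a CEM."""
    cem = set(cem_genes)
    scored = []
    for lncrna, targets in targets_by_lncrna.items():
        targets = set(targets)
        p = enrichment_p_value(len(cem & targets), len(cem), len(targets), N)
        scored.append((p, lncrna))
    p, lncrna = min(scored)
    return lncrna, p


# Toy example: a 3-gene CEM tested against two made-up lncRNAs, with N = 10
toy_targets = {"lnc1": {"a", "b", "x", "y"}, "lnc2": {"c", "z"}}
print(best_lncrna({"a", "b", "c"}, toy_targets, N=10))
```

    Here the CEM is treated as a draw of n genes from the N known lncRNA-regulated genes, so the sketch assumes every CEM gene belongs to that background set.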

    To evaluate the performance of the method in predicting new relationships between lncRNAs and genes, we randomly split seed2 into two parts of equal size, named seedpart1 and seedpart2, multiple times. Biclustering analysis was then performed on seedpart2 to predict co-expressed gene modules (CEMs). For seedpart1, we identified the part that is covered by the CEMs generated from seedpart2. The cover ratio is the size of this covered part divided by the size of seedpart1. We also calculated P-values for the cover ratios to assess their statistical significance.
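    A minimal sketch of this validation step is given below, under the assumption that a held-out gene counts as covered when it appears in at least one CEM. The helper names `split_in_half` and `cover_ratio`, the fixed random seed and the toy gene names are illustrative choices, not part of the original pipeline.

```python
import random


def split_in_half(items, seed=0):
    """Randomly split a collection into two parts of (near-)equal size."""
    shuffled = sorted(items)                 # sort first so the split is reproducible
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def cover_ratio(held_out, cems):
    """Fraction of held-out genes that appear in at least one CEM."""
    held_out = set(held_out)
    covered = {g for g in held_out if any(g in cem for cem in cems)}
    return len(covered) / len(held_out)


# Toy example: two of the four held-out genes occur in some CEM
toy_cems = [{"a", "x"}, {"b", "y"}]
print(cover_ratio({"a", "b", "c", "d"}, toy_cems))

part1, part2 = split_in_half(["g%d" % i for i in range(10)], seed=1)
print(len(part1), len(part2))
```

    Repeating the split with different seeds gives one cover ratio per repetition.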

    We chose several significant CEMs with fewer than 100 genes or conditions on which to conduct TF motif analysis. The genes of each CEM were supplied to findMotifs.pl, a script of Homer [17]. The script first retrieves the upstream promoter sequences of a given length automatically, and then performs motif finding on these promoters. For each run of findMotifs.pl, we let the program output at most the 5 top-ranking motifs, i.e. up to 5 motifs are discovered by findMotifs.pl for each CEM. To evaluate the validity of the discovered motifs, findMotifs.pl automatically compares the discovered motif profiles with the motif profiles archived in JASPAR [18] v2018 (http://jaspar.genereg.net/) under its default parameter settings. For each discovered motif similar to at least one motif archived in JASPAR, we present its motif logo as well as information on its most similar motif in JASPAR.

    All the known interactions between lncRNAs and genes are shown in Figure 1A. The related data can be downloaded from https://github.com/IvesG/sGavin.git data/LncReg0419, and more details are given in data/readme.txt. In Figure 1A, dark-blue nodes represent lncRNAs, light-blue nodes represent proteins, pink edges represent interactions documented in Homo, green edges represent interactions documented in Mus, and orange edges represent interactions documented in both Homo and Mus. The edges also carry labels categorized by regulatory mechanism: PTL (post-translational regulation), TC (transcriptional regulation), PTC (post-transcriptional regulation), TL (translational regulation), and NS (not sure). The distribution of these mechanisms is displayed in Figure 1B; nearly three quarters (714/925) of the relationships are identified at the transcriptional level. Other edge labels are categorized by regulatory relationship: down, up, active and inactive. Their distribution is displayed in Figure 1C. Down relationships (575) outnumber up relationships (308), and active/inactive relationships are rare (4.5%).

    Figure 1.  Correlation analysis of the known interactions of genes and lncRNAs. (A) The network of known interactions between genes and lncRNAs. (B) The distribution of the regulatory mechanisms of the network. (C) The distribution of the regulatory relationships of the network. (D & E) The number of lncRNAs that regulate genes, shown as a scatter diagram and a bar chart, respectively.

    Figure 1E shows the distribution of the number of genes regulated by each lncRNA. Most lncRNAs (~78%) regulate fewer than 5 genes. Figure 1D gives the details: each point reflects the number of lncRNAs that regulate a certain number of genes, e.g. the point with coordinates (4, 14) indicates that there are 14 lncRNAs, each of which regulates 4 genes. The lncRNAs that regulate more genes in Figure 1D belong to the denser parts of the network in Figure 1A.

    The quantile-based discretization method and biclustering analysis yielded a number of co-expressed gene modules (CEMs). The procedure used to identify CEMs is outlined in Figure 2C. Figure 2A shows the number of CEMs obtained from seed1 and seed2 with max(min [19])-based (QUBIC1.0, [15,20]) and KL-based (QUBIC2.0 [21]) biclustering analysis, respectively. The distributions of the numbers of genes and conditions for each CEM are shown in Figure 2D and Figure 2E. For instance, the label seed1_1genes indicates that QUBIC1.0 was performed on seed1. To better illustrate the distributions, we restricted the number of genes and the number of conditions of each CEM to below 100 (Figure 2B). Figure 2D and Figure 2E show that the KL-based biclustering method tends to generate CEMs that contain fewer genes but more conditions than the max(min [22])-based biclustering method. The differences between the results obtained from seed1 and seed2 (in the distributions of gene and condition sizes and in the number of predicted CEMs) are subtle.

    Figure 2.  Correlation Analysis of the predicted co-expressed gene modules. (A) The number of CEMs that we have obtained. (B) The number of CEMs with size (both numbers of conditions and genes) constrained below 100. (C) The flow-process diagram describing the way we find CEMs. (D) The distribution of numbers of conditions of CEMs. (E) The distribution of the number of genes of CEMs.

    The proportion of CEMs that have significant P-values (below a pre-selected P-value cutoff), as well as the proportions of CEMs whose number of unique enriched lncRNAs falls into certain categories (i.e., number of lncRNAs = 2, 3 or > 3), were calculated and are shown in Figure 3A and Figure 3B. In the figures, seed1_qubic1 denotes results obtained using quantile discretization and max(min [23])-based biclustering on seed1, and seed1_qubic2 denotes results obtained using quantile discretization and KL-based biclustering on seed1. Figure 3A shows that most CEMs have P-values above 0.001 and that seed1_qubic1 appears to have more significant P-values. When the P-value is constrained below 0.00001, hardly any CEMs remain (less than 7%). In Figure 3B, the majority (more than 70%) of CEMs are enriched with more than 3 lncRNAs, and this holds for most (around 85%) of the CEMs from seed1_qubic1.

    Figure 3.  Correlation analysis of the enriched lncRNAs and P-values in CEMs. (A) Proportions of CEMs that are significantly enriched with lncRNAs and proportions of the number of enriched lncRNAs for seed1. (B) Proportions of CEMs that are significantly enriched with lncRNAs and proportions of the number of enriched lncRNAs for seed2.

    For validation, we randomly split the HG + MG genes into two equal parts 10 times and obtained the 10 corresponding cover ratios to check the accuracy of the previously predicted genes. The results are shown in Table 1. All of the cover ratios are under 25%, and the average ratio is 16.25%. We further calculated the P-values of the cover ratios. The results indicate that, although the cover ratio leaves considerable room for improvement, its statistical significance is acceptable.

    Table 1.  Groups refer to genes that we extracted.
    Group 1 2 3 4 5 6 7 8 9 10
    Ratio 14.00% 19.60% 14.00% 15.70% 14.50% 21.30% 9.80% 24.30% 14.00% 15.30%
    P-value 7.58E-07 5.14E-14 7.58E-07 8.11E-09 2.56E-07 1.24E-16 5.37E-03 1.27E-21 7.58E-07 2.65E-08


    Since lncRNAs play an important role in regulation, they are expected to cooperate with transcription factors [23,24]. We therefore analyzed the DNA binding sites related to the CEMs [19,22,25]. As described in the Methods, we chose five CEMs on which to conduct TF motif analysis. The corresponding gene lists contain 3, 6, 14, 20 and 21 genes, respectively. The predicted motifs and their comparison with JASPAR motifs are listed in Table 2. In Table 2, the second column gives the name of the lncRNA related to each CEM and the P-value of their association; the third column contains the motif consensus reported by Homer; the fourth column provides the TF names of the most similar motifs in JASPAR, with the similarity scores in the fifth column. These TFs may cooperate with the corresponding lncRNAs. In the second column, all the P-values of the lncRNAs are below 0.01, and the smallest P-value is that of lncRNA HOTAIR. The supplementary Table S1, which gives more details, including the logos of the discovered motifs and the functions of the corresponding TFs, can be downloaded via the GitHub link.

    Table 2.  Comparison between discovered motifs and JASPAR motifs.
    No. lncRNA (p-value) Homer motif JASPAR TF Score
    1 FOXCUT (3.6e-3) AACCAVTTHDCG TFCP2 0.64
    TCCTATCACACR MEIS2 0.62
    TTTTHAAAGGGG CHR 0.67
    ARTGGTTGTWGA FOXJ2 0.58
    GCAATCTCGC IRF4 0.66
    2 ANCR (1.1e-3) AGGGTGACAG SPZ1 0.80
    GGTATCTTAC GATA5 0.64
    CTCATAGGAG GCM1 0.65
    TAAGTGAAAG PRDM1 0.86
    CTTTTGGAAC CHR 0.65
    3 250-280 (2.2e-4) WYTRTCTTTGCG RXR 0.61
    TCTTACGG ELK1 0.71
    GGCAAGGA SD 0.76
    GAGGTATGTT TEAD1 0.70
    TGCCGGGAGCGT POL 0.64
    4 HOXD-AS1 (6.1e-3) CTCGAGTAGG PB0114 0.63
    GCCCCCTGCA PB0076 0.74
    ACGYMYATKYCC GFY 0.59
    AGCGGGTT PH 0.68
    AGGCGCCGCGCC SP1 0.69
    5 HOTAIR (5e-6) TGGCGCAGCGCG PB 0.67
    GTACAACTTT PB 0.66
    CMTSTGTCWCYK NeuroG2 0.66
    GTGATCCATT RHOXF1 0.68
    GGTMGRRGTGMW TBX20 0.58


    To further evaluate the biological significance of the identified CEMs, we tested the enrichment of the genes of each CEM in Gene Ontology (GO) terms and KEGG pathways using the clusterProfiler package from the R/Bioconductor project with a q-value cutoff of 0.05. The GO terms and KEGG pathways in which the CEMs are enriched are presented in Table 3 and Table 4, respectively. The supplementary Table S2, which gives more details, including the original and adjusted P-values, the proportion of matched genes and the gene IDs, can be downloaded via the GitHub link.

    Table 3.  Gene ontology information of selected CEMs.
    LncRNA ID Description q-value
    FOXCUT GO:0004714 transmembrane receptor protein tyrosine kinase activity 1.2937E-02
    GO:0033613 activating transcription factor binding 1.2937E-02
    GO:0019199 transmembrane receptor protein kinase activity 1.2937E-02
    GO:0001085 RNA polymerase Ⅱ transcription factor binding 2.0767E-02
    250-280 GO:0003735 structural constituent of ribosome 3.1600E-06
    GO:0003729 mRNA binding 1.1140E-03
    GO:0008483 transaminase activity 1.1140E-03
    GO:0048027 mRNA 5'-UTR binding 1.1140E-03
    GO:0016769 transferase activity, transferring nitrogenous groups 1.1140E-03
    GO:0045182 translation regulator activity 1.5826E-03
    GO:0030170 pyridoxal phosphate binding 1.5826E-03
    GO:0070279 vitamin B6 binding 1.5826E-03
    GO:0019843 rRNA binding 1.5826E-03
    GO:0019842 vitamin binding 3.1903E-03
    HOXD-AS1 GO:0004714 transmembrane receptor protein tyrosine kinase activity 1.2937E-02
    GO:0033613 activating transcription factor binding 1.2937E-02
    GO:0019199 transmembrane receptor protein kinase activity 1.2937E-02
    GO:0001085 RNA polymerase Ⅱ transcription factor binding 2.0767E-02
    HOTAIR GO:0005109 frizzled binding 1.9097E-03
    GO:0001227 transcriptional repressor activity, RNA polymerase Ⅱ transcription regulatory region sequence-specific binding 1.9097E-03
    GO:0001664 G-protein coupled receptor binding 1.9686E-03
    GO:0001078 transcriptional repressor activity, RNA polymerase Ⅱ core promoter proximal region sequence-specific binding 1.1196E-02
    GO:0008201 heparin binding 1.2885E-02
    GO:0005539 glycosaminoglycan binding 1.8790E-02
    GO:1901681 sulfur compound binding 1.9016E-02
    GO:0045236 CXCR chemokine receptor binding 2.1785E-02
    GO:0008301 DNA binding, bending 2.1785E-02
    GO:0001223 transcription coactivator binding 2.1785E-02
    GO:0042813 Wnt-activated receptor activity 2.1785E-02
    GO:0035198 miRNA binding 2.2807E-02
    GO:0017147 Wnt-protein binding 2.5258E-02
    GO:1990841 promoter-specific chromatin binding 2.5258E-02
    GO:0000982 transcription factor activity, RNA polymerase Ⅱ core promoter proximal region sequence-specific binding 2.5258E-02
    GO:0001221 transcription cofactor binding 2.5587E-02

    Table 4.  KEGG pathway information of selected CEMs.
    LncRNA ID Description q-value
    FOXCUT hsa05216 Thyroid cancer 8.8587E-03
    hsa04510 Focal adhesion 8.8587E-03
    hsa05205 Proteoglycans in cancer 8.8587E-03
    hsa05218 Melanoma 1.7683E-02
    hsa05214 Glioma 1.7683E-02
    hsa04151 PI3K-Akt signaling pathway 1.8745E-02
    hsa05215 Prostate cancer 1.8745E-02
    hsa01522 Endocrine resistance 1.8745E-02
    hsa04919 Thyroid hormone signaling pathway 2.2317E-02
    hsa04152 AMPK signaling pathway 2.2317E-02
    hsa04068 FoxO signaling pathway 2.4442E-02
    hsa04550 Signaling pathways regulating pluripotency of stem cells 2.5129E-02
    hsa05224 Breast cancer 2.5129E-02
    hsa04218 Cellular senescence 2.7471E-02
    hsa05225 Hepatocellular carcinoma 2.7471E-02
    hsa04530 Tight junction 2.7471E-02
    250-280 hsa03010 Ribosome 2.9900E-05
    hsa01210 2-Oxocarboxylic acid metabolism 3.7322E-03
    hsa00220 Arginine biosynthesis 3.7322E-03
    hsa00250 Alanine, aspartate and glutamate metabolism 4.7849E-03
    hsa01230 Biosynthesis of amino acids 7.9156E-03
    HOXD-AS1 hsa05216 Thyroid cancer 8.8587E-03
    hsa04510 Focal adhesion 8.8587E-03
    hsa05205 Proteoglycans in cancer 8.8587E-03
    hsa05218 Melanoma 1.7683E-02
    hsa05214 Glioma 1.7683E-02
    hsa04151 PI3K-Akt signaling pathway 1.8745E-02
    hsa05215 Prostate cancer 1.8745E-02
    hsa01522 Endocrine resistance 1.8745E-02
    hsa04919 Thyroid hormone signaling pathway 2.2317E-02
    hsa04152 AMPK signaling pathway 2.2317E-02
    hsa04068 FoxO signaling pathway 2.4442E-02
    hsa04550 Signaling pathways regulating pluripotency of stem cells 2.5129E-02
    hsa05224 Breast cancer 2.5506E-02
    hsa04218 Cellular senescence 2.7471E-02
    hsa05225 Hepatocellular carcinoma 2.7471E-02
    hsa04530 Tight junction 2.7471E-02
    HOTAIR hsa04310 Wnt signaling pathway 5.7295E-03


    In this study, we have developed a method for elucidating lncRNA-gene and transcription factor-lncRNA interactions using a biclustering approach. The method was applied to 2 breast cancer RNA-seq datasets from TCGA. Biclustering allows for the identification of particular expression patterns across multiple datasets, indicating networks of lncRNA and gene interactions. The method also provides a basis for future lncRNA interaction studies. The prediction performance is still far from satisfactory, which is not unexpected since we used only RNA-seq data. The interaction mechanisms between lncRNAs and genes are far more complex, and more data are needed to capture the whole picture. We plan to include other data, such as proteomics and chromatin accessibility information, to improve the prediction. In addition, the evaluation of the relationship between lncRNAs and predicted CEMs could be improved, e.g. by calculating adjusted or overall P-values in place of the original P-values used in this study. In terms of application, we will work on more specific examples of the regulatory functions of particular lncRNAs and propose hypothesized mechanisms for these regulatory functions. Further analysis of the differences in lncRNA-related genes between tumor and normal samples could also provide more information for studying the process and mechanism of cancer occurrence and development, e.g. determining the stage of developed tumors, which will be a focus of our future research.

    This work was supported by the National Natural Science Foundation of China (NSFC) [61772313 and 61432010], the Young Scholars Program of Shandong University [YSPSDU, 2015WLJH19], the Innovation Method Fund of China (2018IM020200), the Shanghai Municipal Science and Technology Major Project (2018SHZDZX01) and ZHANGJIANGLAB. Qin Ma's work was supported by an R01 Award from the National Institute of General Medical Sciences of the National Institutes of Health [GM131399-01]. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by the National Science Foundation [ACI-1548562]. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health and the National Science Foundation.

    All authors declare no conflicts of interest in this paper.



    [1] Forsberg R (1984) A study of terrain reductions, density anomalies and geophysical inversion methods in gravity field modelling. No. OSU/DGSS-355, Ohio State University.
    [2] Müller-wohlfeil DI, lahmer W, krysanova V, et al. (1996) Topography-based hydrological modeling in the Elbe River drainage basin. In: Third International Conference/Workshop on Integrating GIS and Environmental Modeling, National Center for Geographic Information and Analysis, C.A, Santa Fe.
    [3] Mark DM, Smith B (2004) A science of topography: from qualitative ontology to digital representations. In: Bishop MP, Shroder JF (Eds.), Geographic Information Science and Mountain Geomorphology, Springer-Praxis, Chichester, England, 75-97.
    [4] Khal M, Algouti Ab, Algouti A (2018) Modeling of Water Erosion in the M'Goun Watershed Using OpenGIS Software. In: World Academy of Science, Engineering and Technology International Journal of Computer and Systems Engineering, 12: 1102-1106.
    [5] Mcluckie D, NFRAC (2008) Flood risk management in Australia. Aust J Emerg Manag 23: 21-27.
    [6] Ait Mlouk M, Algouti Ab, Algouti Ah, et al. (2018) Assessment of river bank erosion in semi-arid climate regions using remote sensing and GIS data: a case study of Rdat River, Marrakech, Morocco. Estud Geol 74: 81. doi: 10.3989/egeol.43217.493
    [7] Williams J (2009) Weather Forecasting. The AMS Weather Book: The Ultimate Guide to America's Weather. American Meteorological Society, Boston, MA. doi: 10.1007/978-1-935704-55-3
    [8] Da ros D, Borga M (1997) Use of digital elevation model data for the derivation of the geomorphological instantaneous unit hydrograph. Hydrol Process 11: 13-33. doi: 10.1002/(SICI)1099-1085(199701)11:1<13::AID-HYP400>3.0.CO;2-M
    [9] Tesfa TK, Tarboton DG, Watson DW, et al. (2011) Extraction of hydrological proximity measures from DEMs using parallel processing. Environ Model Softw 26: 1696-1709. doi: 10.1016/j.envsoft.2011.07.018
    [10] Jobin T, Prasannakumar V (2015) Comparison of basin morphometry derived from topographic maps, ASTER and SRTM DEMs: an example from Kerala, India. Geocarto Int 30: 346-364. doi: 10.1080/10106049.2014.955063
    [11] Kishan SR, Anil KM, Vinay KS, et al. (2012) Comparative evaluation of horizontal accuracy of elevations of selected ground control points from ASTER and SRTM DEM with respect to CARTOSAT-1 DEM: a case study of Shahjahanpur district, Uttar Pradesh, India. Geocarto Int 28: 439-452.
    [12] Tian Y, Lei S, Bian Z, et al. (2018) Improving the Accuracy of Open Source Digital Elevation Models with Multi-Scale Fusion and a Slope Position-Based Linear Regression Method. Remote Sens 10: 1861. doi: 10.3390/rs10121861
    [13] Cuartero A, Felicsimo AM, Ariza FJ (2004) Accuracy of DEM generation from TERRA-ASTER stereo data. Int Arch Photogramm Remote Sens 35: 559-563.
    [14] Day T, Muller J (1988) Quality assessment of digital elevation models produced by automatic stereo-matchers from SPOT image pairs. Photogramm Rec 12: 797-808. doi: 10.1111/j.1477-9730.1988.tb00630.x
    [15] Fujisada H (1994) Overview of ASTER instrument on EOS-AM1 platform. In: Proceedings of SPIE, 2268: 14-36. doi: 10.1117/12.185838
    [16] Toutin T (2008) ASTER DEMs for geomatic and geoscientific applications. Int J Remote Sens 29: 1855-1875. doi: 10.1080/01431160701408477
    [17] Bolstad PV, Stowe T (1994) An evaluation of DEM accuracy: elevation, slope, and aspect. Photogramm Eng Remote Sens 60: 1327-1332.
    [18] Blöschl G, Sivapalan M (1995) Scale issues in hydrological modelling: A review. Hydrol Process 9: 251-290. doi: 10.1002/hyp.3360090305
    [19] Vijith H, Seling LW, Dodge-Wan D (2015) Comparison and Suitability of SRTM and ASTER Digital Elevation Data for Terrain Analysis and Geomorphometric Parameters: Case Study of Sungai PatahSubwatershed (Baram River, Sarawak, Malaysia). Environ Res Eng Manag 71: 23-35. doi: 10.5755/j01.erem.71.3.12566
    [20] Beven KJ, Moore ID (1993) Terrain analysis and distributed modelling in hydrology. New York: Wiley.
    [21] Wang XH, Yin ZY (1998) A comparison of drainage networks derived from digital elevation models at two scales. J Hydrol 210: 221-241. doi: 10.1016/S0022-1694(98)00189-9
    [22] Wang W, Yang X, Yao T (2012) Evaluation of ASTER GDEM and SRTM and their suitability in hydraulic modelling of a glacial lake outburst flood in southeast Tibet. Hydrol Process 26: 213-225. doi: 10.1002/hyp.8127
    [23] Nikolakopoulos KG, Kamaratakis EK, Chrysoulakis N (2006) SRTM vs ASTER elevation products. Comparison for two regions in Crete, Greece. Int J Remote Sens 27: 4819-4838. doi: 10.1080/01431160600835853
    [24] Pryde JK, Osorio J, Wolfe ML, et al. (2007) USGS. An ASABE Meeting Presentation Paper Number: 072093, Minneapolis Convention Center Minneapolis, Minnesota, June; 072093.
    [25] Jing C, Shortridge A, Lin S, et al. (2014) Comparison and validation of SRTM and ASTER GDEM for a subtropical landscape in Southeastern China. Int J Digit Earth 7: 969-992. doi: 10.1080/17538947.2013.807307
    [26] Dewitt JD, Warner TA, Conley JF (2015) Comparison of DEMS derived from USGS DLG, SRTM, a statewide photogrammetry program, ASTER GDEM and LiDAR: implications for change detection. GIScience Remote Sens 52: 179-197. doi: 10.1080/15481603.2015.1019708
    [27] Moudrý V, Lecours V, Gdulová K, et al. (2018) On the use of global DEMs in ecological modelling and the accuracy of new bare-earth DEMs. Ecol Modell 383: 3-9. doi: 10.1016/j.ecolmodel.2018.05.006
    [28] Zhang K, Gann D, Ross M, et al. (2019) Comparison of TanDEM-X DEM with LiDAR Data for Accuracy Assessment in a Coastal Urban Area. Remote Sens 11: 876. doi: 10.3390/rs11070876
    [29] Kinsey-Henderson AE, Wilkinson SN (2012) Evaluating Shuttle radar and interpolated DEMs for slope gradient and soil erosion estimation in low relief terrain. Environ Modell Softw 40: 128-139. doi: 10.1016/j.envsoft.2012.08.010
    [30] Lin S, Jing C, Coles NA, et al. (2013) Evaluating DEM source and resolution uncertainties in the Soil and Water Assessment Tool. Stoch Environ Res Risk Assess 27: 209-221. doi: 10.1007/s00477-012-0577-x
    [31] Williams JR, Berndt HD (1977) Sediment yield prediction based on watershed hydrology. Transactions of the American Society of Agricultural and Biological Engineers. Trans ASAE 20: 1100-1104. doi: 10.13031/2013.35710
    [32] Rexer M, Hirt C (2014) Comparison of free high-resolution digital elevation data sets (ASTER GDEM2, SRTM v2.1/v4.1) and validation against accurate heights from the Australian National Gravity Database. Aust J Earth Sci 61: 213-226.
    [33] Renard KG, Foster GR, Weesies GA, et al. (1997) Predicting soil erosion by water: a guide to conservation planning with the revised universal soil loss equation (RUSLE). Agriculture Handbook, U.S. Department of Agriculture, No 703, 404.
    [34] Prasuhn V, Liniger H, Gisler S, et al. (2013) A high-resolution soil erosion risk map of Switzerland as strategic policy support system. Land Use Policy 32: 281-291. doi: 10.1016/j.landusepol.2012.11.006
    [35] Mondal A, Khare D, Kundu S, et al. (2016) Uncertainty of soil erosion modelling using open source high resolution and aggregated DEMs. Geosci Front 8: 425-436. doi: 10.1016/j.gsf.2016.03.004
    [36] Mondal A, Khare D, Kundu S (2017) Uncertainty analysis of soil erosion modelling using different resolution of open-source DEMs. Geocarto Int 32: 334-349. doi: 10.1080/10106049.2016.1140822
    [37] Uhlemann S, Thieken AH, Merz B (2014) A quality assessment framework for natural hazard event documentation: application to trans-basin flood reports in Germany. Nat Hazards Earth Syst Sci 14: 189-208. doi: 10.5194/nhess-14-189-2014
    [38] USGS (2006) Earth Resources Observation and Science. Available from: https://www.usgs.gov/centers/eros.
    [39] Wang L, Liu H (2006) An efficient method for identifying and filling surface depressions in digital elevation models for hydrologic analysis and modelling. Int J Geogr Inf Sci 20: 193-213. doi: 10.1080/13658810500433453
    [40] Das A, Agrawala R, Mohan S (2015) Topographic correction of ALOS-PALSAR images using InSAR-derived DEM. Geocarto Int 30: 145-153.
    [41] Jäger R, Kaminskis J, Balodis J (2012) Determination of Quasi-geoid as Height Component of the Geodetic Infrastructure for GNSS-Positioning Services in the Baltic States. Latv J Phys Tech Sci 49: 2.
    [42] Ghilani CD, Wolf PR (2006) Adjustment Computations: Spatial Data Analysis, 4th Edition, John Wiley & Sons, Hoboken.
    [43] Al-Fugara A (2015) Comparison and Validation of the Recent Freely Available DEMs over Parts of the Earth's Lowest Elevation Area: Dead Sea, Jordan. Int J Geosci 6: 1221-1232. doi: 10.4236/ijg.2015.611096
    [44] Shaw EM (1988) Van Nostrand Reinhold International, London, United Kingdom. Hydrology in practice.
    [45] Strahler AN (1964) Quantitative geomorphology of drainage basin and channel network. In Chow VT (ed), Handbook of Applied Hydrology, McGrawHill, NewYork, NY, USA.
    [46] Wanielista MP, Kersten R, Eaglin R (1997) Hydrology: Water Quantity and Quality Control, Wiley, New York.
    [47] Musy A (2001) Ecole Polytechnique Fédérale, Lausanne, Suisse, e-drologie.
    [48] Roche M (1963) Hydrologie de Surface. Gauthier-Villars, Paris, 140: 659.
    [49] Horton RE (1945) Erosional development of streams and their drainage basins: hydro physical approach to quantitative morphology. Geol Soc Am Bull 56: 275-370. doi: 10.1130/0016-7606(1945)56[275:EDOSAT]2.0.CO;2
    [50] Strahler AN (1952) Hypsometric analysis of erosional topography. Bull Geol Soc Am 63: 1117-1142. doi: 10.1130/0016-7606(1952)63[1117:HAAOET]2.0.CO;2
    [51] Schumm SA (1956) Evolution of drainage systems and slopes in badlands at perth amboy, new jersey. Geol Soc Am Bull 67: 597-646. doi: 10.1130/0016-7606(1956)67[597:EODSAS]2.0.CO;2
    [52] Beven KJ, Kirkby MJ (1979) A physically based, variable contributing area model of basin hydrology. Hydrol Sci Bull 24: 43-69. doi: 10.1080/02626667909491834
    [53] Pandey A, Chowdary VM, Mal BC (2007) Identification of critical erosion prone areas in the small agricultural watershed using USLE, GIS and remote sensing. Water Resour Manage 21: 729-746. doi: 10.1007/s11269-006-9061-z
    [54] Freeman TG (1991) Calculating Catchment Area With Divergent Flow Based on a Regular Grid. Comput Geosci 17: 413-422. doi: 10.1016/0098-3004(91)90048-I
    [55] Kamp U, Bolch T, Olsenholler J (2005) Geomorphometry of Cerro Sillajhuay (Andes, Chile/Bolivia): Comparison of digital elevation models (DEMs) from ASTER remote sensing data and contour maps. Geocarto Int 20: 23-33. doi: 10.1080/10106040508542333
    [56] Datta PS, Schack-Kirchner H (2010) Erosion Relevant Topographical Parameters Derived from Different DEMs-A Comparative Study from the Indian Lesser Himalayas. Remote Sens 2: 1941-1961. doi: 10.3390/rs2081941
    [57] Luo W (1998) Hypsometric analysis with a geographic information system. Comput Geosci 24: 815-821. doi: 10.1016/S0098-3004(98)00076-4
    [58] Vaze J, Teng J, Spencer G (2010) Impact of DEM accuracy and resolution on topographic indices. Environ Modell Softw 25: 1086-1098. doi: 10.1016/j.envsoft.2010.03.014
    [59] Holmes KW, Chadwick OA, Kyriankidis PC (2000) Error in USGS 30-meter digital elevation model and its impact on terrain modelling. J Hydrol 233: 154-173. doi: 10.1016/S0022-1694(00)00229-8
    [60] Huggel C, Schneider D, Miranda PJ, et al. (2008) Evaluation of ASTER and SRTM DEM data for lahar modelling: A case study on lahars from Popocatépetl volcano, Mexico. J Volcanol Geotherm Res 170: 99-110. doi: 10.1016/j.jvolgeores.2007.09.005
    [61] De Vente J, Poesen J, Govers G, et al. (2009) The implications of data selection for regional erosion and sediment yield modelling. Earth Surf Process Landf 34: 1994-2007. doi: 10.1002/esp.1884
    [62] Nitheshnirmal S, Thilagaraj P, Abdul Rahaman S, et al. (2019) Erosion risk assessment through morphometric indices for prioritisation of Arjuna watershed using ALOS-PALSAR DEM. Model Earth Syst Environ 5: 907-924. doi: 10.1007/s40808-019-00578-y
    [63] Bhakar R, Srivastav SK, Punia M (2010) Assessment of the relative accuracy of aster and SRTM digital elevation models along irrigation channel banks of Indira Gandhi Canal. J Water Land-use Manage 10: 1-11.
    [64] Hasan A, Pilesjo P, Persson A (2011) The use of LIDAR as a data source for digital elevation models-a study of the relationship between the accuracy of digital elevation models and topographical attributes in northern peatlands. Hydrol Earth Syst Sci Discuss 8: 5497-5522. doi: 10.5194/hessd-8-5497-2011
  • © 2020 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)