Research article Special Issues

Proposing a novel community detection approach to identify cointeracting genomic regions

  • Received: 15 May 2019 Accepted: 08 October 2019 Published: 13 January 2020
  • Modern next-generation sequencing technologies produce large volumes of genome-wide data that give researchers a deeper understanding of the genomics of organisms. Despite this wealth of data, our understanding of transcriptional regulatory networks remains incomplete. Chromosome conformation capture technologies such as Hi-C make it possible to detect genomic elements that interact with each other and regulate genes. Summarizing these interactions as a network allows investigation of important properties of the 3D genome structure, such as gene co-expression networks. In this work, a Pareto-based multi-objective optimization algorithm is proposed to detect co-expressed genomic regions in Hi-C interactions. The proposed method uses fixed-size genomic regions as the vertices of the graph, and the number of reads between two interacting genomic regions gives the weight of the edge connecting them. The performance of the proposed algorithm was compared with that of a multi-objective particle swarm optimization (PSO) algorithm on five networks derived from cis genomic interactions in three Hi-C datasets (GM12878, CD34+ and ESCs). The experimental results show that the proposed algorithm outperforms the multi-objective PSO technique in identifying co-interacting genomic regions.
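
The graph construction and the Pareto idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the bin size, the objective functions, or whether they are minimized or maximized, so the 50 kb bin size, the two-objective score tuples, and the function names `build_bin_graph`, `dominates` and `pareto_front` are all assumptions made for illustration.

```python
from collections import defaultdict


def build_bin_graph(read_pairs, bin_size):
    """Aggregate cis read pairs into weighted edges between fixed-size bins.

    read_pairs: iterable of (pos_a, pos_b) genomic coordinates on one chromosome.
    Returns a dict mapping (bin_u, bin_v) with bin_u < bin_v to a read count.
    """
    weights = defaultdict(int)
    for pos_a, pos_b in read_pairs:
        u, v = sorted((pos_a // bin_size, pos_b // bin_size))
        if u != v:  # reads falling inside a single bin give no edge
            weights[(u, v)] += 1
    return dict(weights)


def dominates(a, b):
    """True if objective vector a Pareto-dominates b (assumes minimization)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(scores):
    """Return the indices of the non-dominated objective vectors."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical read pairs and an assumed 50 kb bin size.
pairs = [(1200, 51000), (3000, 52000), (1500, 120000), (49000, 49500)]
graph = build_bin_graph(pairs, bin_size=50_000)

# Hypothetical two-objective scores for four candidate partitions.
candidates = [(0.2, 0.9), (0.4, 0.4), (0.5, 0.5), (0.9, 0.1)]
front = pareto_front(candidates)
```

Here `graph` holds one weighted edge per pair of interacting bins, and `front` lists the candidate partitions that no other candidate improves on in every objective at once. A multi-objective optimizer would return such a set of trade-off solutions instead of a single best partition.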

    Citation: Mohammadjavad Hosseinpoor, Hamid Parvin, Samad Nejatian, Vahideh Rezaie, Karamollah Bagherifard, Abdollah Dehzangi, Amin Beheshti, Hamid Alinejad-Rokny. Proposing a novel community detection approach to identify cointeracting genomic regions[J]. Mathematical Biosciences and Engineering, 2020, 17(3): 2193-2217. doi: 10.3934/mbe.2020117

  • © 2020 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)


