Research article

Study of the sterile insect release technique for a two-sex mosquito population model

  • In this paper, to study the large-scale time control and limited-time control of a mosquito population in the field, a two-sex mosquito population model with stage structure and impulsive releases of sterile males is proposed. For the large-scale time control, a wild mosquito-free periodic solution is given, and conditions under which it is globally stable are obtained using monotone system theory. Based on the stability analysis, threshold conditions determining whether the wild mosquito population is eliminated are also obtained. We then study three different optimal release strategies for the limited-time control, which take into account both the control level of the wild mosquito population and the economic input. To solve technical problems in optimal impulsive control, a time rescaling technique is applied, and the gradients of the cost function with respect to all control parameters are obtained. In addition, with the aid of numerical simulation, we obtain the optimal release amounts and release timings for each release strategy. Our study indicates that optimal release timing control is superior to optimal release amount control. However, simultaneous optimal selection of release amount and release timing leads to the best control performance.

    Citation: Mingzhan Huang, Shouzong Liu, Xinyu Song. Study of the sterile insect release technique for a two-sex mosquito population model[J]. Mathematical Biosciences and Engineering, 2021, 18(2): 1314-1339. doi: 10.3934/mbe.2021069



    Low-grade glioma (LGG) is a uniformly fatal tumor, and the survival from this tumor is approximately 7 years [1]. Because of the heterogeneity in LGG patients, different LGG subtypes increase the difficulty of optimizing management of adult low-grade gliomas [2,3]. Magnetic Resonance Imaging (MRI) is an imaging technique that can capture tumors of the brain clearly [4]. Clinicians often use MRI images to diagnose the aggressiveness of the tumor. Therefore, the analysis of MRI data and feature extraction are becoming more challenging. To address these issues, many studies have used MRI data to extract prognostic factors for LGG patients. In a study by Pignatti et al. [5], the authors established a score system that can be used to determine the prognostic score. In adult patients with LGG, the age of the patient, astrocytoma histology, the largest diameter of the tumor, the tumor crossing the midline and the presence of a neurologic deficit before surgery are all important prognostic factors for survival. These factors can be used to identify low-risk and high-risk patients. In a study by Chen et al. [6], the authors developed a computer-assisted algorithm for tumor segmentation and characterization using both kinetic information and morphological features of 3-D DCE-MRI. They differentiated benign and malignant lesions by analyzing 3-D morphological features, including shape features and texture features of the segmented tumor. In a study by Agravat et al. [7], the authors implemented the DeepMedic CNN architecture for tumor segmentation and fed the extracted features to a random forest classifier, obtaining an accuracy of 59% for overall survival prediction. In another study by Shboul et al. [8], 40 features were extracted from the predicted brain tumor mask and fed to a random forest regression to predict the overall survival of a glioma patient, with an accuracy of 67% on the training dataset and 57.9% on the testing dataset.
In an attempt at prediction of survival [9], the authors extracted 26 image-derived geometrical features and used an SVM to predict the risk of death and classify glioma patients into three groups, with an accuracy of 56.8%. In another attempt [10], hundreds of intensity and texture features were extracted from MR images of glioblastoma multiforme, and principal component analysis (PCA) was used to reduce dimensionality. Then, these features were fed to an artificial neural network (ANN). An accuracy of 65.1% was obtained for two classes: short-overall survivor and long-overall survivor. In another study [11], Chato et al. used support vector machines (SVMs), k-nearest neighbors (KNNs), linear discriminants, trees, ensembles and logistic regression to classify survivors into two or three classes. The features from segmentations were used to train the linear discriminant for prediction of survival. The texture features resulted in an accuracy of 46%, and the histogram features achieved an accuracy of 68.5% on the test dataset.

    The above methods predicted survival by using only image information or clinical information. However, tumor heterogeneity possibly comes from strong phenotypic differences, and it is difficult to predict prognosis accurately by using only medical imaging analysis (see Figure 1), thus motivating the need for integrating another kind of data. Along with the rapid development of deep-sequencing technology, the output of sequencing has made huge progress not only in quality but also in speed [12]. If radiomic data and genomic data can be integrated, this integration will build a bridge between micro and macro and increase the accuracy of the precision diagnosis and treatment of the brain tumor [13]. Grossmann et al. [14] found that prognostic biomarkers performed better in lung cancer when radiomic, genetic, and clinical information was combined. The C-index was 0.73, compared with only 0.66 without genetic information. Xia et al. [15] created a radiogenomic strategy that can obtain significant associations between imaging features and gene expression patterns in hepatocellular carcinoma. However, similar work is lacking in LGG. Therefore, in this study we integrated two different types of data, i.e., radiomic features of MRI and gene signatures, to develop a new integrated survival prediction measure for LGG.

    Figure 1.  A diagram illustrating why radiomic data and genomic data need to be integrated. Low-risk and high-risk patients are marked in green and blue, respectively. Integration will increase the accuracy of recognizing high-risk patients, whereas using radiomic data alone may lead to misclassification.

    The framework of this study is shown in Figure 2. First, we used gene expression data to construct a gene regulatory network and identify network modules and then used imaging data to extract significant radiomic biomarkers that are associated with the survival of the patient (Parts (a) and (b)), respectively. Then, we calculated the correlation between gene modules and image features to obtain a small number of gene signatures that are connected with these image features (Part (c)). Furthermore, we established a Lasso (least absolute shrinkage and selection operator) model to predict the image features with only gene expression values (Parts (d) and (e)). Based on gene expression data, we used support vector machines (SVMs) to identify the gene signatures (Parts (f) and (g)). We combined the predicted image features and the gene signatures to establish an integrated measure that can predict survival of the LGG patient (Parts (h) and (i)). The results show that the integrated measure performed better on survival prediction than any other single index.

    Figure 2.  The framework of this study. (a) Construction of a gene regulatory network and identification of modules. (b) Extracting image features associated with patient survival. (c) Module analysis to select gene modules that have a connection with significant image features. (d) Establishing a Lasso model to identify gene signatures. (e) Predicting the significant image features using Lasso. (f) Identifying gene signatures using the SVM-based recursive feature elimination method and training the SVM model. (g) Survival prediction by SVM. The result could be treated as a survival prediction index. (h) A new integrated measure (IM) for combining image features and gene features is obtained through particle swarm optimization (PSO). (i) The IM that is obtained is used to predict survival.

    Computer-aided and manually corrected segmentation labels for the preoperative multi-institutional scans of 65 LGG patients and 724 radiomic features along with the corresponding skull-stripped and coregistered multimodal (i.e., T1, T1-Gd, T2, T2-FLAIR) MRI data were collected from the Cancer Imaging Archive (TCIA) [16,17,18]. The corresponding RNA-seq data and Disease Free Survival (DSS) data for these 65 patients were also obtained from The Cancer Genome Atlas (TCGA) database. These data were used in this study as the training dataset.

    The gene expression data and the corresponding DSS data of 455 LGG patients were downloaded from TCGA and used in this study as the validation dataset.

    A gene coexpression network was constructed using gene expression data in the training dataset. We deleted genes that are expressed in fewer than 20% of the patients or have no expression values. Then, we retained the genes in the top 25% by variance. A pairwise correlation matrix was calculated, and then we adjusted the matrix by raising it to the power of five using the R package WGCNA [19,20]. The minimum module size was set to 50, and the minimum height for merging modules was set to 0.25.
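    The filtering and soft-thresholding steps described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the WGCNA implementation: the function names are hypothetical, "expressed" is taken to mean a value greater than zero, and the adjacency is assumed to be the unsigned form, i.e. the absolute correlation raised to the power of five.

```python
import numpy as np

def filter_genes(expr, min_expressed_frac=0.2, top_var_frac=0.25):
    """Return column indices of genes kept after both filters.

    expr is a (n_patients, n_genes) expression matrix.
    """
    expr = np.asarray(expr, dtype=float)
    # Fraction of patients in which each gene is expressed (assumed: value > 0).
    expressed_frac = (expr > 0).mean(axis=0)
    keep = np.flatnonzero(expressed_frac >= min_expressed_frac)
    # Among the remaining genes, keep those with the highest variance.
    variances = expr[:, keep].var(axis=0)
    n_top = max(1, int(np.ceil(top_var_frac * keep.size)))
    top = np.argsort(variances)[::-1][:n_top]
    return np.sort(keep[top])

def soft_threshold_adjacency(expr, power=5):
    """Pairwise gene-gene correlation raised to a power (unsigned form, assumed)."""
    corr = np.corrcoef(expr, rowvar=False)
    return np.abs(corr) ** power
```

    Raising the correlation to a power suppresses weak correlations much more than strong ones, which is what makes the subsequent module detection less sensitive to noise.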

    We identified significant image features that are associated with patient DSS by training a multivariate Cox regression model [21] on the training dataset. Image features were filtered using the criterion that the p value must be less than 0.01. Then, these image features were treated as image biomarkers and survival prediction indexes. For each image feature, we divided patients in the validation dataset into two groups, a high-risk group and a low-risk group, by taking the median value of the feature as the threshold, and plotted the Kaplan-Meier curves. The concordance index (C-index) [22] and the log-rank test were also used to assess the prognostic prediction performance.
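    As an illustration of the C-index mentioned above, the following is a minimal pure-Python sketch of Harrell's concordance index. It assumes that a higher score means a higher risk (shorter survival) and simply skips pairs with tied survival times; the function and argument names are illustrative and are not taken from the study.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times: observed follow-up times.
    events: 1 if the event was observed, 0 if the patient was censored.
    risk_scores: predicted risk; higher is assumed to mean shorter survival.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is only usable if the earlier time is an observed event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

    A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of patients by risk.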

    The basic formula of the multivariate Cox regression model is described as follows:

    $$h(t, X) = h_0(t)\exp(\beta_1 X_1 + \beta_2 X_2 + \dots + \beta_m X_m) \tag{1}$$

    Here $h(t, X)$ is the hazard function and $h_0(t)$ is the baseline hazard function. The factors $X_1, X_2, \dots, X_m$ correspond to the image features, and $\beta_1, \beta_2, \dots, \beta_m$ are the corresponding regression coefficients.
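    A short worked consequence of Eq (1) may help when reading the exp(coef) column of Table 1. The indices $a$ and $b$ for two patients are introduced here only for illustration. Because the baseline hazard cancels, the ratio of the hazards of two patients does not depend on time:

```latex
\frac{h(t, X^{(a)})}{h(t, X^{(b)})}
  = \frac{h_0(t)\exp\!\left(\sum_{j=1}^{m}\beta_j X^{(a)}_j\right)}
         {h_0(t)\exp\!\left(\sum_{j=1}^{m}\beta_j X^{(b)}_j\right)}
  = \exp\!\left(\sum_{j=1}^{m}\beta_j\left(X^{(a)}_j - X^{(b)}_j\right)\right)
```

    In particular, increasing a single feature $X_j$ by one unit while holding the others fixed multiplies the hazard by $\exp(\beta_j)$, so a value above 1 indicates increased risk and a value below 1 indicates decreased risk.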

    We calculated Pearson correlation coefficients and their statistical significance to obtain the correlations between gene modules and selected image features. Because there are many genes in each module, the principal component analysis (PCA) was used to reduce the dimension of gene expression data of 65 patients in the training dataset. Then, image features were filtered. Features that showed significant correlation (p value less than 0.05) with at least one gene module were retained, and others were removed. Then, gene modules associated with the same image feature were integrated. The enrichment analysis was performed to identify the significantly enriched molecular pathways on these modules.
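    The module-to-feature correlation step can be sketched as follows. This is a hedged illustration: it assumes that each module is summarized by its first principal component (the source only states that PCA was used for dimension reduction), and the function names are hypothetical. The sign of a principal component is arbitrary, so only the magnitude of the resulting correlation is meaningful here.

```python
import numpy as np

def module_eigengene(module_expr):
    """First principal component scores of a (n_patients, n_genes) module matrix."""
    x = np.asarray(module_expr, dtype=float)
    x = x - x.mean(axis=0)  # center each gene before PCA
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    return u[:, 0] * s[0]

def pearson_r(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    a = np.asarray(a, dtype=float) - np.mean(a)
    b = np.asarray(b, dtype=float) - np.mean(b)
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))
```

    With n patients, the significance of a correlation r is commonly assessed through the statistic r·sqrt((n − 2)/(1 − r²)), which follows a t distribution with n − 2 degrees of freedom under the null hypothesis of no correlation.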

    We established a radiogenomic map by identifying gene signatures associated with the prognostic imaging features. Lasso (least absolute shrinkage and selection operator) is a regression analysis method that performs both variable selection and regularization [23,24]. This method can enhance the prediction accuracy and interpretability of the statistical model it produces.

    $$Q(\beta) = \|y - X\beta\|_2^2 + \lambda\|\beta\|_1 \tag{2}$$

    In the formula above, $X$ is the matrix of variables and $y$ is the label. $\beta$ is the coefficient vector that we want to optimize, and $Q(\beta)$ is the objective function that we want to minimize. Compared with the method of least squares, the objective function in the Lasso model has the additional regularization term $\lambda\|\beta\|_1$. With this L1 norm regularization term, Lasso can control the number of variables used and improve the generalization ability of the model. For each image feature remaining after the gene module analysis, Lasso was trained to select gene signatures from the related gene modules and to predict the image feature, using the MRI data and gene expression data in the training dataset. We determined the regularization coefficient λ by minimizing the MSE (mean squared error) of the model.
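    To make the variable-selection effect of the L1 term concrete, here is a minimal coordinate-descent sketch that minimizes the objective in Eq (2) exactly as written (squared error without a 1/n factor, no intercept, so the data are assumed to be centered). It is an illustration only, not the solver used in the study, and the function names are hypothetical.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; values inside the band become zero."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize ||y - X b||_2^2 + lam * ||b||_1 by cyclic coordinate descent."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_features = X.shape[1]
    beta = np.zeros(n_features)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(n_features):
            if col_sq[j] == 0.0:
                continue  # a constant-zero column carries no information
            # Residual with feature j's current contribution added back.
            residual = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ residual
            # Setting the derivative 2*col_sq*b - 2*rho + lam*sign(b) to zero
            # gives a soft threshold at lam / 2.
            beta[j] = soft_threshold(rho, lam / 2.0) / col_sq[j]
    return beta
```

    Coefficients whose correlation with the residual stays inside the threshold band are set exactly to zero, which is how Lasso selects a small set of genes; in practice λ would be chosen by cross-validation on the MSE, as described above.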

    In this step, we obtained a survival prediction index using only gene signatures, without the information of image features. SVMs (support vector machines) are supervised learning models that can be used for classification and regression problems [25,26,27]. For a classification problem, the optimal hyperplane is searched to separate data into two classes with the max margin. For new data, the trained hyperplane is used to predict the label or the probability of each class. Sometimes, data may not be separated completely, and a soft margin [25] can be used by adding a penalty parameter C and slack variables ξi to obtain the minimum error. The SVM optimization problem is

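    $$\min_{\omega,\,C}\ \frac{1}{2}\|\omega\|^2 + C\sum_{i=1}^{N}\xi_i \tag{3}$$

    subject to

    $$y_i f(x_i) \ge 1 - \xi_i, \quad \text{and} \quad \xi_i \ge 0 \tag{4}$$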

    The vector ω is the vector orthogonal to the hyperplane. xi, yi are an observation pair of data points, and f(xi) is the label of xi predicted by the SVM. SVM-RFE (support vector machine-recursive feature elimination) [28] is a powerful feature selection algorithm based on SVM that can avoid overfitting when the number of features is high. In each iteration, features are scored and sorted through model training and the least important feature is removed. Remaining features are used for a next training, and the above step is repeated. The score for sorting of the ith feature is defined as

    $$c_i = \omega_i^2 \tag{5}$$

    ωi is the ith dimension of the hyperplane orthogonal vector ω in SVM. Finally, the optimal number of features that have the minimum error is determined.
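    The elimination loop with the squared-weight criterion of Eq (5) can be sketched as below. Note the loud assumption: to keep the example self-contained, the weight vector comes from a ridge-regularized least-squares fit, which is only a stand-in for a linear SVM and is not the method used in the study. The function names are hypothetical, and features are assumed to be on comparable scales.

```python
import numpy as np

def linear_weights(X, y, ridge=1.0):
    """Stand-in for a linear SVM: ridge least-squares weights on -1/+1 labels."""
    targets = np.where(np.asarray(y) > 0, 1.0, -1.0)
    Xc = X - X.mean(axis=0)
    gram = Xc.T @ Xc + ridge * np.eye(X.shape[1])
    return np.linalg.solve(gram, Xc.T @ (targets - targets.mean()))

def rfe_ranking(X, y, fit_weights=linear_weights):
    """Return feature indices ordered from most to least important."""
    X = np.asarray(X, dtype=float)
    remaining = list(range(X.shape[1]))
    eliminated = []
    while remaining:
        w = fit_weights(X[:, remaining], y)
        scores = w ** 2                      # ranking criterion c_i = w_i^2
        worst = int(np.argmin(scores))       # least important remaining feature
        eliminated.append(remaining.pop(worst))
    return eliminated[::-1]
```

    Any function that returns one weight per column can be passed as `fit_weights`, so the weights of a trained linear SVM could be plugged in instead. Choosing how many of the top-ranked features to keep would then be done by cross-validated error, as the text describes.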

    We use SVM-RFE to select gene signatures and train a classification SVM model with expression data of these selected gene signatures and DSS data in the training dataset. The patient labels are set to 0 or 1 based on the prognostic outcome (survival or death). Then, the predicted probability is treated as a survival prediction index. The survival curve and C-index are used to assess the prediction performance.

    Further, we consider a combination of the selected image biomarkers and the index calculated by SVM with gene signatures. To ensure that the new aggregated index is an improvement, we transform the calculation of the optimal combination coefficients of all features into an optimization problem. Specifically, suppose that N image features are considered to be independently associated with DSS and are recorded as f1, f2, ..., fN, and that the gene index value from SVM is recorded as g. The integrated measure we want to determine is recorded as f. The optimization problem to be solved can be described as follows.

    $$\max_{f}\ C_f \tag{6}$$

    subject to

    $$f = \sum_{i=1}^{N}\alpha_i f_i + \beta g \tag{7}$$
    $$\sum_{i=1}^{N}\alpha_i + \beta = 1 \tag{8}$$

    where Cf is the C-index of integrated measure f on the training dataset. Our goal is to search optimal parameters α1, α2, ..., αN and β in Eq (7) to maximize the Cf in (6).

    The Particle Swarm Optimization (PSO) algorithm [29] is used to solve the optimization problem (6) in this study. PSO is an evolutionary computation algorithm inspired by the flocking behavior of birds and can be applied to a wide range of optimization problems. An initial population of random particles is created first. For each particle, the position represents a candidate solution, and the corresponding fitness is the value of the objective function at that position. The aim of PSO is to find the particle with the best fitness by updating the velocity and position of each particle according to the following formulas:

    $$v_i = \omega v_i + c_1 r_1 (pbest_i - x_i) + c_2 r_2 (gbest - x_i) \tag{9}$$
    $$x_i = x_i + v_i \tag{10}$$

    Here $x_i$ and $v_i$ are the position and velocity of the ith particle. $pbest_i$ is the best position of the ith particle in its history, and $gbest$ is the best position found so far by all particles. $r_1$ and $r_2$ are random numbers between 0 and 1. $\omega$ is the inertia weight, and $c_1$, $c_2$ are the acceleration constants.
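    The velocity and position updates of PSO can be sketched in pure Python as follows. This is a generic minimizer written for illustration, not the code used in the study; it uses the standard update in which each particle is attracted toward both its own best position and the global best. The box bounds, the clipping of positions, the zero initial velocities and the function name are assumptions. The defaults for the inertia weight and acceleration constants follow the values 0.8, 0.5 and 0.5 reported in the results.

```python
import random

def pso_minimize(fitness, dim, n_particles=20, n_iter=30,
                 inertia=0.8, c1=0.5, c2=0.5, bounds=(0.0, 1.0), seed=0):
    """Minimize fitness(position) over a box with a basic particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia plus attraction to the personal and global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move, then clip to the box (assumed constraint handling).
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

    To maximize a C-index with this minimizer, one possible approach is to pass a fitness function that rescales the position so that the coefficients sum to 1 and returns the negative C-index of the resulting combination.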

    We take a log-rank test on 724 image features using DSS data of 65 patients in TCIA and filter these features with the criterion that the p value is less than 0.01. Then, 21 features remain. Features with high similarity to each other are removed: we calculate the Pearson correlation coefficient between features and, if the coefficient between two image features is greater than 0.8, remove the one with the larger log-rank p value. After this step, 6 features are removed, and 15 features remain. Based on the above univariable analysis, we first implement the proportional hazard test [21]. Each image feature meets the proportional hazard assumption (detailed information is shown in Additional file 1: Table S1). Then, we train a multivariate Cox regression model on these remaining image features with gene expression data and DSS data in the training dataset. The result is shown in Table 1, and the eight features marked with * are considered to be independently correlated with DSS (p<0.05).

    Table 1.  Image features for survival analysis.

    | Image feature | exp(coef) | exp(coef) lower 95% | exp(coef) upper 95% | Wald test | p value |
    |---|---|---|---|---|---|
    | TEXTURE_GLSZM_ET_T1Gd_SZLGE* | 0 | 0 | 0 | 3.12 | 0.00178 |
    | HISTO_ED_T2_Bin8* | 0.7 | 0.55 | 0.88 | 3.02 | 0.00254 |
    | TEXTURE_GLOBAL_ET_T1Gd_Skewness* | 3.03E+05 | 47.29 | 1.94E+09 | 2.82 | 0.00477 |
    | TEXTURE_GLRLM_NET_FLAIR_LRHGE* | 1 | 0.99 | 1 | 2.66 | 0.0078 |
    | HISTO_NET_T1_Bin4* | 0.89 | 0.8 | 0.98 | 2.42 | 0.01559 |
    | HISTO_ET_T1Gd_Bin10* | 1.19 | 1.03 | 1.38 | 2.35 | 0.01877 |
    | TEXTURE_GLSZM_NET_T1Gd_ZSV* | 0 | 0 | 0 | 2.34 | 0.01906 |
    | TEXTURE_GLRLM_NET_T1Gd_GLV* | 2.00E+42 | 2230.45 | 1.79E+81 | 2.13 | 0.0333 |
    | HISTO_ET_T1_Bin10 | 0.81 | 0.63 | 1.05 | 1.59 | 0.11219 |
    | TEXTURE_GLCM_ET_T2_SumAverage | 0 | 0 | 9.95E+83 | 1.57 | 0.11569 |
    | TEXTURE_GLRLM_NET_T1_LGRE | 0 | 0 | 2.73E+38 | 1.49 | 0.13526 |
    | TEXTURE_GLRLM_ED_T1_RLV | inf | 0 | inf | 1.4 | 0.16185 |
    | HISTO_ED_T2_Bin4 | 0.96 | 0.88 | 1.05 | 0.91 | 0.36241 |
    | TEXTURE_GLCM_ED_FLAIR_Energy | inf | 0 | inf | 0.78 | 0.43387 |
    | TEXTURE_GLSZM_NET_T1_LZLGE | 0.99 | 0.96 | 1.03 | 0.39 | 0.69976 |


    A gene coexpression network is constructed using gene expression data of 65 patients in the training dataset. We delete genes that are expressed in fewer than 20% of the patients or have no expression values (n = 1875). Then, we retain the genes in the top 25% by variance (n = 4663). A pairwise correlation matrix is calculated, and then we adjust the matrix by raising it to the power of five using the R package WGCNA [19,20]. The minimum module size is set to 50 and the minimum height for merging modules is set to 0.25. This yields 12 gene modules. Detailed information on the modules is shown in Additional file 2: Table S2.

    The Pearson correlation coefficients and their statistical significance were calculated between the 12 gene modules and the 8 image features. The result is shown in Figure 3. Four image features that show significant correlation (p<0.05) with at least one gene module were obtained. HISTO_ED_T2_Bin8 is the 8-bin histogram feature of the peritumoral edema in T2-weighted precontrast, TEXTURE_GLSZM_NET_T1Gd_ZSV is the zone size variance of the gray level size zone matrix (GLSZM) of the nonenhancing part of the tumor core in T1-weighted postcontrast, TEXTURE_GLRLM_NET_FLAIR_LRHGE is the long run high gray level emphasis of the gray level run length matrix (GLRLM) of the nonenhancing part of the tumor core in T2 Fluid-Attenuated Inversion Recovery, and TEXTURE_GLRLM_NET_T1Gd_GLV is the gray level variance of the GLRLM of the nonenhancing part of the tumor core in T1-weighted postcontrast. Then, their corresponding gene modules were integrated. The statistical results are shown in Table 2 and the detailed list of genes is shown in Additional file 3: Table S3.

    Figure 3.  Heatmap of the correlation between the image features and the gene modules. Colored cells marked with * indicate a significant Pearson correlation.
    Table 2.  The statistical results of image features and their corresponding gene modules with significant association.

    | Image feature | Associated gene modules | Number of associated genes |
    |---|---|---|
    | HISTO_ED_T2_Bin8 | module2, module4, module5, module7 | 2794 |
    | TEXTURE_GLSZM_NET_T1Gd_ZSV | module6 | 506 |
    | TEXTURE_GLRLM_NET_FLAIR_LRHGE | module6 | 506 |
    | TEXTURE_GLRLM_NET_T1Gd_GLV | module2, module4, module6, module8, module10 | 1421 |


    A further KEGG enrichment analysis was performed on the integrated gene modules using the Metascape website [30]; the results are shown in Figure 4. The complete list of biological annotations is shown in Additional file 4: Table S4. Among these, the neuroactive ligand-receptor interaction pathway, which is reported to be associated with glioma [31,32], is the most enriched pathway in all integrated gene modules, with a minimum p value of $1.259\times10^{-41}$.

    Figure 4.  Results of KEGG enrichment analysis: a. Enrichment of modules associated with HISTO_ED_T2_Bin8. b. Enrichment of modules associated with TEXTURE_GLSZM_NET_T1Gd_ZSV. c. Enrichment of modules associated with TEXTURE_GLRLM_NET_T1Gd_GLV. d. Enrichment of modules associated with TEXTURE_GLRLM_NET_FLAIR_LRHGE.

    Then, the Lasso method described in section 2.5 was used to select gene signatures from the related gene modules and establish a map from genes to image features. We determined the regularization coefficient λ by minimizing the MSE (mean squared error) of the model. The process is shown in Figure 6. The optimal coefficient λ and the corresponding RMSE (root mean squared error) of 65 patients are shown in Table 3. The number of selected gene signatures is also shown. The detailed list of gene signatures is shown in Additional file 5: Table S5.

    Figure 5.  The chart of the neuroactive ligand-receptor interaction pathway. Genes appearing in associated modules are marked in green.
    Figure 6.  The value and 95% confidence interval of MSE for each regularization coefficient λ. The dotted line marks λ with the minimal MSE. All Lasso models were trained on 65 patients in the training dataset. a. λ and corresponding MSE of the Lasso model, mapping from gene signatures to HISTO_ED_T2_Bin8. b. λ and corresponding MSE of Lasso model, mapping from gene signatures to TEXTURE_GLSZM_NET_T1Gd_ZSV. c. λ and corresponding MSE of Lasso model, mapping from gene signatures to TEXTURE_GLRLM_NET_FLAIR_LRHGE. d. λ and corresponding MSE of Lasso model, mapping from gene signatures to TEXTURE_GLRLM_NET_T1Gd_GLV.
    Table 3.  The optimal parameters of Lasso and number of selected gene signatures for four image features.

    | Image feature | Number of genes in associated modules | Optimal λ | RMSE | Number of genes selected by Lasso |
    |---|---|---|---|---|
    | HISTO_ED_T2_Bin8 | 2794 | 1.6627 | 6.0847 | 12 |
    | TEXTURE_GLSZM_NET_T1Gd_ZSV | 506 | 7.81E-06 | 2.0195E-5 | 3 |
    | TEXTURE_GLRLM_NET_FLAIR_LRHGE | 506 | 163.9677 | 528.16 | 6 |
    | TEXTURE_GLRLM_NET_T1Gd_GLV | 1421 | 3.01E-03 | 0.0120 | 18 |


    We predicted the 4 image features using Lasso with the gene expression data of 455 patients in TCGA as the validation dataset. We then took the value of each image feature as a survival prediction index. We calculated the C-index and plotted the Kaplan-Meier curves on the validation dataset. The result is shown in Figure 7. The C-indexes of these four survival prediction indexes are 0.6945, 0.7321, 0.7926, and 0.7985. These results indicate that these four image features perform well in survival prediction.

    Figure 7.  Kaplan-Meier curves of DSS and C-index. a. HISTO_ED_T2_Bin8. b. TEXTURE_GLSZM_NET_T1Gd_ZSV. c. TEXTURE_GLRLM_NET_FLAIR_LRHGE. d. TEXTURE_GLRLM_NET_T1Gd_GLV.

    From the selected 4663 genes with high variance, we fed gene expression data and DSS data of 65 patients in TCIA to SVM-RFE and obtained 43 gene signatures (shown in Additional file 6: Table S6). Then, we trained a classification SVM model with these selected genes. The variables were gene expression data of 65 patients, and the labels were set to 0 or 1 based on the patient's prognostic outcome (survival or death). The penalty parameter C was set to 2, and 5-fold cross-validation was used to evaluate the error in the recursive feature elimination process. We trained the SVM model and took the predicted probability of survival as a survival prediction index. The C-index and survival curve are shown in Figure 8. The C-index is 0.7627.

    Figure 8.  Kaplan-Meier curve of DSS and C-index of the index from SVM.

    We took a linear combination of four significant image features and the index calculated by SVM with gene signatures. A better integrated measure was obtained that represents patient survival situation. Set N=4 in formula (7). Four normalized image feature values were recorded as f1, f2, f3, and f4, and the index value from SVM was recorded as g. The integrated measure is recorded as f. Then, we get

    $$f = \sum_{i=1}^{4}\alpha_i f_i + \beta g \tag{11}$$

    We used the PSO algorithm to calculate the optimal coefficients that maximize the C-index of the 65 patients in the training dataset, with parameters ω, c1 and c2 set to 0.8, 0.5 and 0.5. The initial population size was set to 20, 25, 30, 35 and 40, and the corresponding iteration number was set to 30 to ensure the convergence of PSO. We repeated the numerical experiments 10 times and recorded the average result for each parameter setting. Detailed results of each experiment are shown in Additional file 7: Table S7. For each population size, we then substituted the coefficients into formula (11) and obtained the corresponding form of the integrated measure f. The C-index was calculated using gene expression data on the validation dataset. The validation result is shown in Table 4.

    Table 4.  The mean result of combination coefficients calculated by PSO and C-index with different parameters.

    | Population size | 20 | 25 | 30 | 35 | 40 |
    |---|---|---|---|---|---|
    | α1 | 0.2926 | 0.3187 | 0.2792 | 0.3303 | 0.276 |
    | α2 | 0.0663 | 0.0394 | 0.0739 | 0.0505 | 0.068 |
    | α3 | 0.2171 | 0.2329 | 0.2076 | 0.2214 | 0.2102 |
    | α4 | 0.0091 | 0.019 | 0.0107 | 0.0298 | 0.0159 |
    | β | 0.4149 | 0.39 | 0.4288 | 0.368 | 0.4298 |
    | C-index | 0.8065 | 0.807 | 0.8061 | 0.807 | 0.8057 |


    From Table 4, we observe that β is approximately 0.4 for all parameter settings. Therefore, the proportion of gene signatures in the integration is approximately 40%. α1 is approximately 0.3, α2 is approximately 0.06 and α3 is approximately 0.24. α4 is nearly 0, indicating that the gray level variance of GLRLM of the nonenhancing part of the tumor core in T1-weighted postcontrast can be removed from the integration. We then set the parameters α1, α2, α3, α4 and β to 0.3, 0.06, 0.24, 0 and 0.4. We substituted these coefficients into formula (11) and calculated the integrated measure f on the validation dataset. The Kaplan-Meier curve is shown in Figure 9. The C-indexes of the four independent image features, the gene signatures and the integrated measure are shown in Table 5.

    Figure 9.  Kaplan-Meier curve of DSS and C-index of integrated measure f.
    Table 5.  C-index of image features and gene signatures.

    | Measure | f1 | f2 | f3 | f4 | g | f |
    |---|---|---|---|---|---|---|
    | C-index | 0.6945 | 0.7321 | 0.7926 | 0.7985 | 0.7627 | 0.8071 |


    The C-index of the integrated measure f is 0.8071 and is higher than any other measure based on image signatures or gene signature. This result indicates that the integrated measure can improve the prediction accuracy. The integrated measure is recorded as follows.

    $$f = 0.3 f_1 + 0.06 f_2 + 0.24 f_3 + 0.4 g \tag{12}$$

    Furthermore, we use the time-dependent Receiver Operating Characteristic (ROC) curve [33] to further assess the predictive power and compare different prediction models. Time-dependent ROC analysis showed that the integrated measure improved our ability to predict prognosis [AUC, 0.79; 95% confidence interval (CI), 0.71 to 0.87] (see Figure 10) when compared with other measures based on image signatures or gene signatures.

    Figure 10.  ROC and corresponding AUCs for 5-year survival predicted by f1, f2, f3, f4, g and f on the 455 patients in validation dataset.
    Figure 11.  Image features used for final survival prediction. a The 8-bin histogram feature of the peritumoral edema. b The zone size variance of gray level size zone matrix (GLSZM) of the nonenhancing part of the tumor core. c The long run high gray level emphasis of gray level run length matrix (GLRLM) of the nonenhancing part of the tumor core.

    Patients are divided into a high-risk group and a low-risk group according to their prognosis (the DSS value in this study), using the median DSS of the 65 patients in the training dataset as the threshold. The 455 patients in the validation dataset are then classified using the median value of the integrated measure in the training dataset as the threshold. The resulting accuracy is 72.1%, which is higher than the accuracy reported in the published studies [7,8,9,10,11].
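
    The median-threshold classification described above can be sketched as follows. This is an illustrative reading of the procedure, not the authors' implementation: the helper names are hypothetical, and the direction of the integrated measure (whether a larger f means higher risk) is an assumption exposed as a parameter.

```python
from statistics import median


def high_risk_flags(train_values, values, high_when_above):
    """Flag each value as high risk (True) or low risk (False).

    The threshold is the median of the training values. Values equal to
    the threshold are treated as "not above" it.
    """
    threshold = median(train_values)
    return [(v > threshold) == high_when_above for v in values]


def accuracy(predicted, actual):
    """Fraction of patients whose predicted group matches the true group."""
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# Toy numbers for illustration only; they are not data from the study.
train_dss = [10, 20, 30, 40, 50]
train_f = [0.1, 0.2, 0.3, 0.4, 0.5]
val_dss = [5, 35, 60, 25]
val_f = [0.9, 0.1, 0.35, 0.8]

# True groups: short DSS (at or below the training median) is high risk.
actual = high_risk_flags(train_dss, val_dss, high_when_above=False)
# Predicted groups: ASSUMPTION that a larger integrated measure is high risk.
predicted = high_risk_flags(train_f, val_f, high_when_above=True)

toy_accuracy = accuracy(predicted, actual)
```

    Both thresholds come from the training dataset only, so the validation patients never influence the cut-off used to classify them.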

    The primary goal of phenotyping and classifying a human tumor is to capture tumor heterogeneity and enable personalized precision diagnosis and therapy. With the rapid development of biomedical engineering and computer technology, large volumes of medical data of many types are now available in clinical practice. However, one of the biggest challenges in clinical applications is how to integrate these different types of data to extract accurate information.

    In this study, we integrated MRI data and gene expression data to propose a new feature measure that can be used to identify subsets of LGG patients at low and high risk with respect to DSS. Based on gene expression data, we first used the WGCNA method to construct the network and identify twelve network modules. With MRI data, eight image biomarkers were obtained by using the Cox regression model. Through correlation analysis between gene modules and image features, four radiomic biomarkers were then identified. Because MRI data are not available in our test dataset, the Lasso method was applied to build a map from gene expression data to these image features. We also independently used gene expression data to predict image biomarkers through the SVM method. Finally, an integrated measure (IM) combining image and gene signatures was obtained through the PSO algorithm. We validated IM with gene expression data and DSS data on 455 patients in the validation dataset. The C-index of IM is 0.8071 and its Area Under Curve (AUC) of the ROC curve is 0.79, both higher than those of any single measure. The classification accuracy is 72.1%, which is higher than the accuracy of the published work using only radiomic data [7,8,9,10,11]. The results demonstrate that the proposed IM enhances the prediction accuracy for lower grade gliomas.

    In summary, integrating macro-scale radiomic features with micro-scale gene expression data improves the accuracy of DSS prediction for LGG patients. The proposed method can also be extended to analyze different data sources for other tumors.

    This work was supported by the National Key Research and Development Program of China (No. 2018YFC1314600), the Key Program of the National Natural Science Foundation of China (No. 11831015) and the Chinese National Natural Science Foundation (No. 61672388).

    All authors declare no conflicts of interest in this paper.



    [1] H. J. Barclay, The sterile insect release method on species with two-stage life cycles, Res. Popul. Ecol., 21 (1980), 165–180. doi: 10.1007/BF02513619
    [2] H. J. Barclay, M. Mackauer, The sterile insect release method for pest control: A density dependent model, Environ. Entomol., 9 (1980), 810–817. doi: 10.1093/ee/9.6.810
    [3] H. J. Barclay, Pest population stability under sterile releases, Res. Popul. Ecol., 24 (1982), 405–416. doi: 10.1007/BF02515585
    [4] H. J. Barclay, Modeling incomplete sterility in a sterile release program: Interactions with other factors, Popul. Ecol., 43 (2001), 197–206. doi: 10.1007/s10144-001-8183-7
    [5] H. J. Barclay, Mathematical models for the use of sterile insects, in Sterile Insect Technique, Springer, Heidelberg, (2005), 147–174.
    [6] L. Alphey, M. Benedict, R. Bellini, G. G. Clark, D. A. Dame, M. W. Service, et al., Sterile-insect methods for control of mosquito-borne diseases: an analysis, Vector-Borne Zoonotic Dis., 10 (2010), 295–311. doi: 10.1089/vbz.2009.0014
    [7] W. Klassen, Area-wide integrated pest management and the sterile insect technique, in Sterile Insect Technique (eds. V. A. Dyck, J. Hendrichs and A. S. Robinson), Springer, The Netherlands, (2005), 39–68.
    [8] M. Strugarek, H. Bossin, Y. Dumont, On the use of the sterile insect release technique to reduce or eliminate mosquito populations, Appl. Math. Model., 68 (2019), 443–470. doi: 10.1016/j.apm.2018.11.026
    [9] H. Laven, Eradication of Culex pipiens fatigans through cytoplasmic incompatibility, Nature, 216 (1967), 383–384. doi: 10.1038/216383a0
    [10] X. Zheng, D. Zhang, Y. Li, C. Yang, Y. Wu, X. Liang, et al., Incompatible and sterile insect techniques combined eliminate mosquitoes, Nature, 572 (2019), 56–61. doi: 10.1038/s41586-019-1407-9
    [11] K. R. Fister, M. L. Mccarthy, S. F. Oppenheimer, C. Collins, Optimal control of insects through sterile insect release and habitat modification, Math. Biosci., 244 (2013), 201–212. doi: 10.1016/j.mbs.2013.05.008
    [12] S. M. White, P. Rohani, S. M. Sait, Modelling pulsed releases for sterile insect techniques: fitness costs of sterile and transgenic males and the effects on mosquito dynamics, J. Appl. Ecol., 47 (2010), 1329–1339. doi: 10.1111/j.1365-2664.2010.01880.x
    [13] J. Li, Z. Yuan, Modeling releases of sterile mosquitoes with different strategies, J. Biol. Dynam., 9 (2015), 1–14.
    [14] J. Li, J. Li, New revised simple models for interactive wild and sterile mosquito populations and their dynamics, J. Biol. Dynam., 11 (2017), 316–333. doi: 10.1080/17513758.2016.1216613
    [15] J. Li, L. Cai, Y. Li, Stage-structured wild and sterile mosquito population models and their dynamics, J. Biol. Dynam., 11 (2017), 79–101. doi: 10.1080/17513758.2016.1159740
    [16] L. Cai, S. Ai, J. Li, Dynamics of mosquitoes populations with different strategies for releasing sterile mosquitoes, SIAM J. Appl. Math., 74 (2014), 1786–1809. doi: 10.1137/13094102X
    [17] Y. Dumont, J. M. Tchuenche, Mathematical studies on the sterile insect technique for the Chikungunya disease and Aedes albopictus, J. Math. Biol., 65 (2012), 809–854. doi: 10.1007/s00285-011-0477-6
    [18] J. Huang, S. Ruan, P. Yu, Y. Zhang, Bifurcation analysis of a mosquito population model with a saturated release rate of sterile mosquitoes, SIAM J. Appl. Dyn. Syst., 18 (2019), 939–972. doi: 10.1137/18M1208435
    [19] L. Cai, J. Huang, X. Song, Y. Zhang, Bifurcation analysis of a mosquito population model for proportional releasing sterile mosquitoes, Discrete Contin. Dynam. Syst. Ser. B, 25 (2019), 6279–6295.
    [20] Z. Qiu, X. Wei, C. Shan, H. Zhu, Monotone dynamics and global behaviors of a West Nile virus model with mosquito demographics, J. Math. Biol., 80 (2020), 809–834. doi: 10.1007/s00285-019-01442-4
    [21] M. Huang, X. Song, J. Li, Modelling and analysis of impulsive release of sterile mosquitoes, J. Biol. Dynam., 11 (2017), 147–171. doi: 10.1080/17513758.2016.1254286
    [22] P. A. Bliman, D. Cardona-Salgado, Y. Dumont, O. Vasilieva, Implementation of control strategies for sterile insect techniques, Math. Biosci., 314 (2019), 43–60. doi: 10.1016/j.mbs.2019.06.002
    [23] J. Yu, Modeling mosquito population suppression based on delay differential equations, SIAM J. Appl. Math., 78 (2018), 3168–3187. doi: 10.1137/18M1204917
    [24] J. Yu, J. Li, Global asymptotic stability in an interactive wild and sterile mosquito model, J. Differ. Equations, 269 (2020), 6193–6215. doi: 10.1016/j.jde.2020.04.036
    [25] J. Yu, Existence and stability of a unique and exact two periodic orbits for an interactive wild and sterile mosquito model, J. Differ. Equations, 269 (2020), 10395–10415. doi: 10.1016/j.jde.2020.07.019
    [26] J. Yu, J. Li, Dynamics of interactive wild and sterile mosquitoes with time delay, J. Biol. Dynam., 13 (2019), 606–620. doi: 10.1080/17513758.2019.1682201
    [27] J. Li, S. Ai, Impulsive releases of sterile mosquitoes and interactive dynamics with time delay, J. Biol. Dynam., 14 (2020), 313–331.
    [28] J. Yu, B. Zheng, Modeling Wolbachia infection in mosquito population via discrete dynamical models, J. Differ. Equations Appl., 25 (2019), 1–19. doi: 10.1080/10236198.2018.1551379
    [29] M. Huang, M. Tang, J. Yu, B. Zheng, A stage structured model of delay differential equations for Aedes mosquito population suppression, Discrete Contin. Dynam. Syst., 40 (2020), 3467–3484. doi: 10.3934/dcds.2020042
    [30] S. Xiang, Y. Pei, X. Liang, Analysis and optimization-based on a sex pheromone and pesticide pest model with gestation delay, Int. J. Biomath., 12 (2019), 1950054. doi: 10.1142/S1793524519500542
    [31] Z. Q. Yang, X. Y. Wang, Y. N. Zhang, S. B. Vinson, Recent advances in biological control of important native and invasive forest pests in China, Biol. Control, 68 (2014), 117–128. doi: 10.1016/j.biocontrol.2013.06.010
    [32] P. Neuenschwander, H. R. Herren, Biological control of the cassava mealybug, Phenacoccus manihoti, by the exotic parasitoid Epidinocarsis lopezi in Africa, Philos. Trans. R. Soc. Lond. B, Biol. Sci., 318 (1988), 319–333. doi: 10.1098/rstb.1988.0012
    [33] X. Y. Liang, Y. Z. Pei, M. X. Zhu, Y. F. Lv, Multiple kinds of optimal impulse control strategies on plant-pest-predator model with eco-epidemiology, Appl. Math. Comput., 287 (2016), 1–11.
    [34] Y. Z. Pei, M. M. Chen, X. Y. Liang, C. G. Li, M. X. Zhu, Optimizing pulse timings and amounts of biological interventions for a pest regulation model, Nonlinear Anal. Hybrid Syst., 27 (2018), 353–365. doi: 10.1016/j.nahs.2017.10.003
    [35] B. Dennis, Allee effects: Population growth, critical density, and the chance of extinction, Nat. Resour. Model., 3 (1989), 481–538. doi: 10.1111/j.1939-7445.1989.tb00119.x
    [36] S. J. Schreiber, Allee effects, extinctions, and chaotic transients in simple population models, Theor. Popul. Biol., 64 (2003), 201–209. doi: 10.1016/S0040-5809(03)00072-8
    [37] D. D. Bainov, P. S. Simeonov, Impulsive Differential Equations: Periodic Solutions and Applications, CRC Press, London, 66 (1993).
    [38] D. D. Bainov, P. S. Simeonov, System with Impulse Effect, Theory and Applications, Prentice Hall, Englewood, (1989).
    [39] H. L. Smith, Monotone dynamical systems: an introduction to the theory of competitive and cooperative systems, Bull. Am. Math. Soc., 33 (1996), 203–209. doi: 10.1090/S0273-0979-96-00642-8
    [40] K. L. Teo, Control parametrization enhancing transform to optimal control problems, Nonlinear Anal.: Theory, Methods, Appl., 63 (2005), e2223–e2236. doi: 10.1016/j.na.2005.03.066
    [41] Y. Liu, K. L. Teo, L. S. Jennings, S. Wang, On a class of optimal control problems with state jumps, J. Optim. Theory Appl., 98 (1998), 65–82. doi: 10.1023/A:1022684730236
    [42] F. D. Parker, Management of pest populations by manipulating densities of both hosts and parasites through periodic releases, in Biological Control, Springer, Boston, (1971), 365–376.
  • © 2021 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)