In survival analysis, the cure rate model is widely adopted when a proportion of subjects are long-term survivors. The cure rate model is composed of two parts: the incidence part, which describes the probability of cure (infinite survival), and the latency part, which describes the conditional survival of the uncured subjects (finite survival). In the standard cure rate model, there are no constraints on the relation between the coefficients in the two model parts. In practical applications, however, the two model parts are closely related, so it is desirable to allow for some relation between the two sets of coefficients corresponding to the same covariates. Existing work has considered incorporating a joint distribution or a structural effect, which is too restrictive. In this paper, we consider a more flexible model that allows the two sets of covariate effects to differ in distribution and magnitude. In many practical cases, the results are hard to interpret when the two coefficients of the same covariate have conflicting signs. Therefore, we propose a sign consistency cure rate model with a sign-based penalty to improve interpretability. To accommodate high-dimensional data, we adopt a group lasso penalty for variable selection. Simulations and a real data analysis demonstrate that the proposed method has competitive performance compared with alternative methods.
Citation: Chenlu Zheng, Jianping Zhu. Promote sign consistency in cure rate model with Weibull lifetime[J]. AIMS Mathematics, 2022, 7(2): 3186-3202. doi: 10.3934/math.2022176
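The two-part structure and the two penalties named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and every specific in it is an assumption made for illustration: a logistic link for the incidence part, a Weibull survival function of the form exp(-exp(x'beta) * t^shape) for the latency part, a smoothed sign function v / sqrt(v^2 + tau^2) inside the sign-based penalty, and a group lasso that groups the two coefficients of each covariate. The names `population_survival`, `sign_penalty`, `group_lasso_penalty` and `tau` are hypothetical.

```python
import math
from typing import Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def population_survival(t: float, x: Sequence[float],
                        alpha: Sequence[float], beta: Sequence[float],
                        shape: float) -> float:
    """Mixture cure survival: P(cured) + P(uncured) * S_uncured(t | x).

    Assumed forms (not taken from the paper):
      incidence: P(uncured | x) = sigmoid(x'alpha)
      latency:   S_uncured(t | x) = exp(-exp(x'beta) * t**shape)  (Weibull)
    """
    p_uncured = sigmoid(dot(x, alpha))
    s_uncured = math.exp(-math.exp(dot(x, beta)) * t ** shape)
    return (1.0 - p_uncured) + p_uncured * s_uncured


def smooth_sign(v: float, tau: float = 1e-2) -> float:
    """Differentiable stand-in for sign(v); tau controls the smoothing."""
    return v / math.sqrt(v * v + tau * tau)


def sign_penalty(alpha: Sequence[float], beta: Sequence[float],
                 tau: float = 1e-2) -> float:
    """Near 0 when alpha_j and beta_j share a sign, near 4 when they conflict."""
    return sum((smooth_sign(a, tau) - smooth_sign(b, tau)) ** 2
               for a, b in zip(alpha, beta))


def group_lasso_penalty(alpha: Sequence[float], beta: Sequence[float]) -> float:
    """Group lasso with one group (alpha_j, beta_j) per covariate (assumed grouping)."""
    return sum(math.sqrt(a * a + b * b) for a, b in zip(alpha, beta))
```

Under these assumed parameterizations, a positive coefficient in either part means higher risk (a lower chance of cure, or shorter survival among the uncured), so matching signs give a covariate one consistent interpretation. A penalized objective would add weighted versions of both penalties to the negative log-likelihood, with the weights chosen by a tuning procedure.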