Survival data with high-dimensional covariates are routinely collected in medical studies and other fields. In this work, we propose a seamless-$ L_0 $ (SELO) penalized method for variable selection and estimation in the accelerated failure time (AFT) model in the high-dimensional setting. Under appropriate conditions, we show that SELO selects a model whose dimension is comparable to that of the underlying model, and we prove that the resulting estimator is asymptotically normal. Simulation results indicate that the SELO procedure outperforms existing procedures. An analysis of real data is also presented, in which SELO selects variables more accurately than the competing methods.
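For orientation, the SELO penalty can be written down in a few lines. The sketch below uses the form given by Dicker, Huang and Lin (2013), $ p(\beta;\lambda,\tau) = \frac{\lambda}{\log 2}\log\left(\frac{|\beta|}{|\beta|+\tau}+1\right) $, and compares it with the exact $ L_0 $ penalty. The function names and the parameter values are illustrative assumptions, not taken from this paper, and the abstract does not state which AFT loss function the penalty is added to.

```python
import math


def selo_penalty(beta: float, lam: float, tau: float) -> float:
    """SELO penalty for one coefficient (form of Dicker et al., 2013).

    lam (>= 0) sets the penalty level; tau (> 0) controls how closely
    the smooth penalty tracks the discontinuous L0 penalty.
    """
    if lam < 0 or tau <= 0:
        raise ValueError("require lam >= 0 and tau > 0")
    a = abs(beta)
    return (lam / math.log(2.0)) * math.log(a / (a + tau) + 1.0)


def l0_penalty(beta: float, lam: float) -> float:
    """Exact L0 penalty: lam for any nonzero coefficient, 0 otherwise."""
    return lam if beta != 0 else 0.0


if __name__ == "__main__":
    lam, tau = 1.0, 0.01  # illustrative values only
    for b in (0.0, 0.001, 0.1, 1.0, 10.0):
        print(f"beta={b:>6}: SELO={selo_penalty(b, lam, tau):.4f}  "
              f"L0={l0_penalty(b, lam):.1f}")
```

The penalty is zero at $ \beta = 0 $, is continuous in $ \beta $, and approaches $ \lambda $ as $ |\beta| $ grows, so for small $ \tau $ it behaves like a smoothed version of the $ L_0 $ penalty.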
Citation: Yin Xu, Ning Wang. Variable selection and estimation for accelerated failure time model via seamless-$ L_0 $ penalty[J]. AIMS Mathematics, 2023, 8(1): 1195-1207. doi: 10.3934/math.2023060