Research article

Simultaneous variable selection and estimation for longitudinal ordinal data with a diverging number of covariates

  • Received: 01 November 2021 Revised: 18 January 2022 Accepted: 26 January 2022 Published: 09 February 2022
  • MSC : 62E20, 62F12, 62J12

  • In this paper, we study simultaneous variable selection and estimation for longitudinal ordinal data with high-dimensional covariates. Using the penalized generalized estimating equation (GEE) method, we establish asymptotic properties for such data when the dimension of the covariates $ p_n $ tends to infinity as the number of clusters $ n $ approaches infinity. More precisely, under appropriate regularity conditions, all the covariates with zero coefficients can be identified simultaneously with probability tending to 1, and the estimator of the non-zero coefficients has the asymptotic oracle property. Finally, we perform Monte Carlo studies to illustrate the theoretical analysis. The main result of this paper extends the work of Wang et al. [1] to the case of a multinomial response variable.
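
    As a rough illustration of what a penalized estimating equation looks like, the sketch below assumes the SCAD penalty of Fan and Li [15] and the elementwise form $ U(\beta) = S(\beta) - q_\lambda(|\beta|)\,\mathrm{sign}(\beta) $ used for penalized GEE in Wang et al. [1]. The abstract does not state which penalty or scaling the authors use, so the function names, the default $ a = 3.7 $, and the toy score vector are assumptions made for illustration only.

```python
import numpy as np

def scad_derivative(theta, lam, a=3.7):
    """Derivative q_lam(theta) of the SCAD penalty, evaluated at |theta|.

    Assumes lam > 0 and a > 2 (a = 3.7 is the value suggested by Fan and Li).
    """
    theta = np.abs(np.asarray(theta, dtype=float))
    # Beyond lam the derivative decays linearly and is zero once theta >= a * lam.
    tail = np.maximum(a * lam - theta, 0.0) / ((a - 1.0) * lam)
    return lam * np.where(theta <= lam, 1.0, tail)

def penalized_score(score, beta, lam, a=3.7):
    """Elementwise penalized estimating function S(beta) - q_lam(|beta|) * sign(beta).

    `score` stands in for the unpenalized GEE value S(beta); computing it for
    ordinal responses is outside the scope of this sketch.
    """
    beta = np.asarray(beta, dtype=float)
    score = np.asarray(score, dtype=float)
    return score - scad_derivative(beta, lam, a) * np.sign(beta)
```

    Small coefficients receive the full penalty rate $ \lambda $, while coefficients larger than $ a\lambda $ are not penalized at all. This tapering is what makes oracle-type results possible for this family of penalties.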

    Citation: Xianbin Chen, Juliang Yin. Simultaneous variable selection and estimation for longitudinal ordinal data with a diverging number of covariates[J]. AIMS Mathematics, 2022, 7(4): 7199-7211. doi: 10.3934/math.2022402

    [1] L. Wang, J. H. Zhou, A. N. Qu, Penalized generalized estimating equations for high-dimensional longitudinal data analysis, Biometrics, 68 (2012), 353–360. http://dx.doi.org/10.1111/j.1541-0420.2011.01678.x
    [2] L. Wang, GEE analysis of clustered binary data with diverging number of covariates, Ann. Stat., 39 (2011), 389–417. https://doi.org/10.1214/10-AOS846
    [3] H. Akaike, A new look at the statistical model identification, IEEE T. Automat. Contr., 19 (1974), 716–723. http://dx.doi.org/10.1109/tac.1974.1100705
    [4] G. Schwarz, Estimating the dimension of a model, Ann. Stat., 6 (1978), 461–464. http://dx.doi.org/10.1214/aos/1176344136
    [5] W. Pan, Akaike's information criterion in generalized estimating equations, Biometrics, 57 (2001), 120–125. https://doi.org/10.1111/j.0006-341X.2001.00120.x
    [6] W. J. Fu, Penalized estimating equations, Biometrics, 59 (2003), 126–132. http://dx.doi.org/10.1111/1541-0420.00015
    [7] E. Cantoni, J. M. Flemming, E. Ronchetti, Variable selection for marginal longitudinal generalized linear models, Biometrics, 61 (2005), 507–514. http://dx.doi.org/10.1111/j.1541-0420.2005.00331.x
    [8] L. Wang, A. N. Qu, Consistent model selection and data-driven smooth tests for longitudinal data in the estimating equations approach, J. Roy. Statist. Soc., 71 (2009), 177–190. https://doi.org/10.1111/j.1467-9868.2008.00679.x
    [9] H. Yang, P. Lin, G. H. Zou, H. Liang, Variable selection and model averaging for longitudinal data incorporating GEE approach, Stat. Sinica, 27 (2017), 389–413. http://dx.doi.org/10.5705/ss.2013.277
    [10] Z. M. Chen, Z. F. Wang, Y. Ivan Chang, Sequential adaptive variables and subject selection for GEE methods, Biometrics, 76 (2020), 496–507. http://dx.doi.org/10.1111/biom.13160
    [11] J. M. Williamson, H. M. Lin, H. X. Barnhart, A classification statistic for GEE categorical response models, Journal of Data Science, 1 (2003), 149–165. http://dx.doi.org/10.6339/JDS.2003.01(2).106
    [12] S. R. Lipsitz, K. Kim, L. P. Zhao, Analysis of repeated categorical data using generalized estimating equations, Stat. Med., 13 (1994), 1149–1163. https://doi.org/10.1002/sim.4780131106
    [13] K. C. Lin, Y. J. Chen, Assessing GEE models with longitudinal ordinal data by global odds ratio, Int. Statistical Inst.: Proc. 58th World Statistical Congress, (2011), 5763–5768.
    [14] K. Y. Liang, S. L. Zeger, Longitudinal data analysis using generalized linear models, Biometrika, 73 (1986), 13–22. https://doi.org/10.1093/biomet/73.1.13
    [15] J. Q. Fan, R. Z. Li, Variable selection via nonconcave penalized likelihood and its oracle properties, J. Am. Stat. Assoc., 96 (2001), 1348–1360. https://doi.org/10.1198/016214501753382273
    [16] L. Fahrmeir, G. Tutz, Multivariate statistical modelling based on generalized linear models, New York: Springer, 1994. https://doi.org/10.1007/978-1-4899-0010-4
    [17] A. Touloumis, A. Agresti, M. Kateri, GEE for multinomial responses using a local odds ratios parameterization, Biometrics, 69 (2013), 633–640. http://dx.doi.org/10.1111/biom.12054
    [18] S. G. Wang, J. H. Shi, S. J. Yin, M. X. Wu, Introduction to linear models, 3rd ed., Beijing: Science Press, 2004.
    [19] A. Touloumis, Simulating correlated binary and multinomial responses under marginal model specification: the SimCorMultRes package, The R Journal, 8 (2016), 79–91. http://dx.doi.org/10.32614/RJ-2016-034
    [20] X. B. Chen, J. L. Yin, Asymptotic properties of GEE estimator for clustered ordinal data with high-dimensional covariates, Commun. Stat.-Theor. M., (2021). http://dx.doi.org/10.1080/03610926.2021.1934029
    [21] A. W. van der Vaart, J. A. Wellner, Weak convergence and empirical processes: with applications to statistics, New York: Springer, 1996.
  • © 2022 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)