This paper investigates generalized pilot estimators of the covariance matrix in the presence of missing data. When the random samples have only bounded fourth moments, two kinds of generalized pilot estimators are provided: the generalized Huber estimator and the generalized truncated mean estimator. In addition, we construct a thresholding generalized pilot estimator for a kind of sparse covariance matrices and establish its convergence rates in probability under the spectral and Frobenius norms, respectively. Moreover, convergence rates in expectation are given under an additional condition. Finally, simulation studies are conducted to demonstrate the superiority of our method.
Citation: Huimin Li, Jinru Wang. Pilot estimators for a kind of sparse covariance matrices with incomplete heavy-tailed data[J]. AIMS Mathematics, 2023, 8(9): 21439-21462. doi: 10.3934/math.20231092
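The ingredients named in the abstract (a truncated-mean pilot estimate built from incomplete data, followed by thresholding) can be illustrated with a minimal sketch. This is not the paper's construction: it assumes missing entries are encoded as NaN, that each covariance entry is estimated from the rows where both coordinates are observed, that truncation sets values larger than `tau` in absolute value to zero, and that a single hard threshold `lam` is applied to the off-diagonal entries. The function names and the parameters `tau` and `lam` are hypothetical.

```python
import numpy as np


def truncated_mean(values, tau):
    """Average after setting entries with |value| > tau to zero (assumed truncation rule)."""
    values = np.asarray(values, dtype=float)
    return float(np.mean(np.where(np.abs(values) <= tau, values, 0.0)))


def pilot_covariance(X, tau):
    """Entry-wise truncated-mean covariance estimate; NaN marks a missing observation."""
    X = np.asarray(X, dtype=float)
    p = X.shape[1]
    observed = ~np.isnan(X)
    sigma = np.zeros((p, p))
    for j in range(p):
        for k in range(j, p):
            # Use only the rows where both coordinates are observed.
            both = observed[:, j] & observed[:, k]
            if not both.any():
                continue  # no jointly observed pair: leave the entry at 0
            xj, xk = X[both, j], X[both, k]
            second_moment = truncated_mean(xj * xk, tau)
            product_of_means = truncated_mean(xj, tau) * truncated_mean(xk, tau)
            sigma[j, k] = sigma[k, j] = second_moment - product_of_means
    return sigma


def hard_threshold(sigma, lam):
    """Zero out off-diagonal entries with magnitude below lam; keep the diagonal."""
    sigma = np.asarray(sigma, dtype=float)
    out = np.where(np.abs(sigma) >= lam, sigma, 0.0)
    np.fill_diagonal(out, np.diag(sigma))
    return out
```

A call such as `hard_threshold(pilot_covariance(X, tau), lam)` chains the two steps. In practice `tau` and `lam` would be tuned to the sample size, dimension, and missingness pattern; the sketch leaves that choice to the caller.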