We introduce a novel concentration inequality for sums of independent sub-Gaussian variables with random dependent weights, motivated by statistical settings for high-dimensional data. The random dependent weights are functions of regularized estimators. We apply the proposed inequality to obtain a high-probability bound for the stochastic Lipschitz constant of the negative binomial loss function arising in Lasso-penalized negative binomial regression, and we use this bound to study oracle inequalities for the Lasso estimator. Additionally, we derive a similar concentration inequality for a randomly weighted sum of independent centred exponential family variables.
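For orientation, the classical fixed-weight case that the abstract generalizes can be written out as below. This is a sketch of the standard textbook bound, not the inequality proved in the paper: the symbols $X_i$, $w_i$, $\sigma_i$, $s$, and $t$ are notation assumed here for illustration, and the weights are taken to be deterministic, which is exactly the assumption the paper relaxes.

```latex
% Assumed setting (not from the paper): X_1, ..., X_n independent, zero-mean,
% sub-Gaussian with variance proxies \sigma_i^2, and fixed (non-random) weights w_i.
\[
  \mathbb{E}\, e^{s X_i} \le \exp\!\left(\frac{s^2 \sigma_i^2}{2}\right)
  \quad \text{for all } s \in \mathbb{R},\ i = 1, \dots, n .
\]
% By independence the moment generating functions multiply, so the weighted
% sum is sub-Gaussian with variance proxy \sum_i w_i^2 \sigma_i^2, which gives
\[
  \mathbb{P}\!\left( \left| \sum_{i=1}^{n} w_i X_i \right| \ge t \right)
  \le 2 \exp\!\left( - \frac{t^2}{2 \sum_{i=1}^{n} w_i^2 \sigma_i^2} \right),
  \qquad t \ge 0 .
\]
```

When the weights are random and depend on the $X_i$ (for example through a regularized estimator fitted to the same data), the factorization of the moment generating function used above is no longer available, which is why a separate argument is needed in that setting.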
Citation: Huiming Zhang, Hengzhen Huang. Concentration for multiplier empirical processes with dependent weights[J]. AIMS Mathematics, 2023, 8(12): 28738-28752. doi: 10.3934/math.20231471