An efficient and flexible multiplicity adjustment for chi-square endpoints

Amy Wagler; Melinda McCann; Amy Wagler; Melinda McCann

doi:10.3934/mbe.2021253

Mathematical Biosciences and Engineering

2021, Volume 18, Issue 5: 4971-4986. doi: 10.3934/mbe.2021253

Previous Article Next Article

Research article Special Issues

An efficient and flexible multiplicity adjustment for chi-square endpoints

Amy Wagler ^{1
,
,},
Melinda McCann ²

1.
Department of Mathematical Sciences, The University of Texas at El Paso, El Paso, TX 79968, USA
2.
Department of Statistics, Oklahoma State University, Stillwater, OK 74701, USA

Received: 24 February 2021 Accepted: 19 May 2021 Published: 07 June 2021

This manuscript proposes a fast and efficient multiplicity adjustment that strictly controls the type I error for a family of high-dimensional chi-square distributed endpoints. The method is flexible and may be efficiently applied to chi-square distributed endpoints with any positive definite correlation structure. Controlling the family-wise error rate ensures that the results have a high standard of credulity due to the strict limitation of type I errors. Numerical results confirm that this procedure is effective at controlling familywise error, is far more powerful than utilizing a Bonferroni adjustment, is more computationally feasible in high-dimensional settings than existing methods, and, except for highly correlated data, performs similarly to less accessible simulation-based methods. Additionally, since this method controls the family-wise error rate, it provides protection against reproducibility issues. An application illustrates the use of the proposed multiplicity adjustment to a large scale testing example.
- multiple comparisons,
- simultaneous inference,
- Type I error control
Citation: Amy Wagler, Melinda McCann. An efficient and flexible multiplicity adjustment for chi-square endpoints[J]. Mathematical Biosciences and Engineering, 2021, 18(5): 4971-4986. doi: 10.3934/mbe.2021253

Related Papers:

Abstract

This manuscript proposes a fast and efficient multiplicity adjustment that strictly controls the type I error for a family of high-dimensional chi-square distributed endpoints. The method is flexible and may be efficiently applied to chi-square distributed endpoints with any positive definite correlation structure. Controlling the family-wise error rate ensures that the results have a high standard of credulity due to the strict limitation of type I errors. Numerical results confirm that this procedure is effective at controlling familywise error, is far more powerful than utilizing a Bonferroni adjustment, is more computationally feasible in high-dimensional settings than existing methods, and, except for highly correlated data, performs similarly to less accessible simulation-based methods. Additionally, since this method controls the family-wise error rate, it provides protection against reproducibility issues. An application illustrates the use of the proposed multiplicity adjustment to a large scale testing example.

References

[1]	R Core Team, R: A language and environment for statistical computing, R Found Stat. Comput., Vienna, Austria, 2019. http://www.R-project.org/.
[2]	K. S. Pollard, S. Dudoit, M. J. {van der Laan}, Multiple testing procedures: R multtest package and applications to Genomics, Bioinformatics and Computational Biology Solutions Using R and Bioconductor, Springer, 2005.
[3]	T. Hothorn, F. Bretz, P. Westfall, Simultaneous inference in general parametric models, Biomet. J., 50 (2008), 346-363. doi: 10.1002/bimj.200810425
[4]	C. C. Bartenschlager, J. O. Brunner, A new user specific multiple testing method for business applications: The SiMaFlex procedure, J. Stat. Plan Infer., 2021, Online.
[5]	Y. Benjamini, Y. Hochberg, Controlling the false discovery rate: A practical and powerful approach to multiple testing, J. Royal Stat. Soc. B., 57 (1995), 289-300.
[6]	Y. Benjamini, D. Yekutieli, The control of the false discovery rate in multiple testing under dependency, Ann. Stat., 29 (2001), 1165-1188.
[7]	D. Hunter, An upper bound for the probability of a union, J. Appl. Probab., 13 (1976), 597-603. doi: 10.2307/3212481
[8]	M. McCann, D. Edwards, A path length inequality for the multivariate-t distribution, with applications to multiple comparisons, J. Am. Stat. Assoc., 91 (1996), 211-216.
[9]	D. Q. Naiman, Simultaneous confidence bounds in multiple regression using predictor variable constraints, J. Am. Stat. Assoc., 82 (1987), 214-219. doi: 10.1080/01621459.1987.10478422
[10]	J. Sun, C. R. Loader, Simultaneous confidence bands for linear regression and smoothing, Ann. Stat., (1994), 1328-1345.
[11]	K. J. Worsley, An improved Bonferroni inequality and applications, Biometrika, 69 (1982), 297-302. doi: 10.1093/biomet/69.2.297
[12]	M. Heo, A. C. Leon, Comparison of statistical methods for analysis of clustered binary observations, Stat. Med., 24 (2005), 911-923. doi: 10.1002/sim.1958
[13]	R. B. Arani, J. J. Chen, A power study of a sequential method of p-value adjustment for correlated continuous endpoints, J. Biopharm. Stat., 8 (1998), 585-598. doi: 10.1080/10543409808835262
[14]	S. James, The approximate multinormal probabilities applied to correlated multiple endpoints in clinical trials, Stat. Med., 10 (1991), 1123-1135. doi: 10.1002/sim.4780100712
[15]	S. J. Pocock, N. L. Geller, A. A. Tsiatis, The analysis of multiple endpoints in clinical trials, Biometrics, (1987), 487-498.
[16]	P. C. O'Brien, Procedures for comparing samples with multiple endpoints, Biometrics, (1984), 1079-1087.
[17]	R. E. Tarone, A modified Bonferroni method for discrete data, Biometrics, (1990), 515-522.
[18]	P. H. Westfall, R. D. Tobias, Multiple testing of general contrasts, J. Am. Stat. Assoc., 92 (2007), 299-306.
[19]	P. H. Westfall, J. F. Troendle, Multiple testing with minimal assumptions, Biomet. J., 50 (2008), 745-755. doi: 10.1002/bimj.200710456
[20]	W. Pan, Asymptotic tests of association with multiple SNPs in linkage disequilibrium, Genet. Epidemiol., 33 (2009), 497-507. doi: 10.1002/gepi.20402
[21]	J. Stange, N. Loginova, T. Dickhaus, Computing and approximating multivariate chi-square probabilities, J. Stat. Comput. Sim., 86 (2016), 1233-1247. doi: 10.1080/00949655.2015.1058798
[22]	S. Dudoit, M. J. van der Laan, Multiple tests of association with biological annotation metadata, in Multiple Testing Procedures with Applications to Genomics, Springer Series in Statistics, Springer, (2008), 413-476.
[23]	K. Wright, W. J. Kennedy, Self-validated Computations for the Probabilities of the Central Bivariate Chi-square Distribution and a Bivariate F Distribution, J. Stat. Comput. Sim., 72 (2002), 63-75. doi: 10.1080/00949650211422
[24]	P. R. Krishnaiah (Ed.), Handbook of statistics, Motilal Banarsidass Publisher, (1980).
[25]	J. B. Kruskal, On the shortest spanning subtree of a graph and the traveling salesman problem, Proc. Am. Math. Soc., 7 (1956), 48-50. doi: 10.1090/S0002-9939-1956-0078686-7
[26]	A. K. Bera, Y. Bilias, Rao's score, Neyman's C() and Silvey's LM tests: An essay on historical developments and some new results, J. Stat. Plan Infer., 97 (2001), 9-44. doi: 10.1016/S0378-3758(00)00343-8
[27]	F. Guinot, M. Szafranski, C. Ambroise, F. Samson, Learning the optimal scale for GWAS through hierarchical SNP aggregation, BMC Bioinform., 19 (2018), 1-14. doi: 10.1186/s12859-017-2006-0
[28]	G. Lovison, On Rao score and Pearson $\chi^{2}$ statistics in generalized linear models, Stat. Papers, 46 (2005), 555-574. doi: 10.1007/BF02763005
[29]	D. Pregibon, Score tests in GLIM with applications, In GLIM82: Proceedings of the International Conference on Generalized Linear Models, R Gilchrist (ed.), Lec. Notes Stat., 14, Springer, New York, (1982), 87-97.
[30]	G. K. Smyth, Pearson's goodness of fit statistic as a score test statistic, Science and Statistics: A Festschrift for Terry Speed, D. R. Goldstein (ed.), IMS Lec Notes., 40, Institute of Mathematical Statistics, Beachwood, Ohio, (2003), 115-126. http://www.statsci.org/smyth/pubs/goodness.pdf.

Reader Comments

Your name:*

Email:*
© 2021 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)