Citation: Debashis Ghosh. Wavelet-based Benjamini-Hochberg procedures for multiple testing under dependence[J]. Mathematical Biosciences and Engineering, 2020, 17(1): 56-72. doi: 10.3934/mbe.2020003
[1] | Viliam Ďuriš, Vladimir I. Semenov, Sergey G. Chumarov . Wavelets and digital filters designed and synthesized in the time and frequency domains. Mathematical Biosciences and Engineering, 2022, 19(3): 3056-3068. doi: 10.3934/mbe.2022141 |
[2] | Qian Zhang, Haigang Li, Ming Li, Lei Ding . Feature extraction of face image based on LBP and 2-D Gabor wavelet transform. Mathematical Biosciences and Engineering, 2020, 17(2): 1578-1592. doi: 10.3934/mbe.2020082 |
[3] | Rahat Zarin, Usa Wannasingha Humphries, Amir Khan, Aeshah A. Raezah . Computational modeling of fractional COVID-19 model by Haar wavelet collocation Methods with real data. Mathematical Biosciences and Engineering, 2023, 20(6): 11281-11312. doi: 10.3934/mbe.2023500 |
[4] | Amy Wagler, Melinda McCann . An efficient and flexible multiplicity adjustment for chi-square endpoints. Mathematical Biosciences and Engineering, 2021, 18(5): 4971-4986. doi: 10.3934/mbe.2021253 |
[5] | Xinlin Liu, Viktor Krylov, Su Jun, Natalya Volkova, Anatoliy Sachenko, Galina Shcherbakova, Jacek Woloszyn . Segmentation and identification of spectral and statistical textures for computer medical diagnostics in dermatology. Mathematical Biosciences and Engineering, 2022, 19(7): 6923-6939. doi: 10.3934/mbe.2022326 |
[6] | Jinyi Tai, Chang Liu, Xing Wu, Jianwei Yang . Bearing fault diagnosis based on wavelet sparse convolutional network and acoustic emission compression signals. Mathematical Biosciences and Engineering, 2022, 19(8): 8057-8080. doi: 10.3934/mbe.2022377 |
[7] | Song Yang, Huibin Wang, Hongmin Gao, Lili Zhang . Few-shot remote sensing scene classification based on multi subband deep feature fusion. Mathematical Biosciences and Engineering, 2023, 20(7): 12889-12907. doi: 10.3934/mbe.2023575 |
[8] | Selene Tomassini, Annachiara Strazza, Agnese Sbrollini, Ilaria Marcantoni, Micaela Morettini, Sandro Fioretti, Laura Burattini . Wavelet filtering of fetal phonocardiography: A comparative analysis. Mathematical Biosciences and Engineering, 2019, 16(5): 6034-6046. doi: 10.3934/mbe.2019302 |
[9] | Michele La Rocca, Cira Perna . Designing neural networks for modeling biological data: A statistical perspective. Mathematical Biosciences and Engineering, 2014, 11(2): 331-342. doi: 10.3934/mbe.2014.11.331 |
[10] | María del Carmen Pardo, Beatriz Cobo . Comparison of methods to testing for differential treatment effect under non-proportional hazards data. Mathematical Biosciences and Engineering, 2023, 20(10): 17646-17660. doi: 10.3934/mbe.2023784 |
Consideration of high-dimensional data has come to the forefront of current statistical practice because of scientific advancements in genomics, astronomy and imaging. This has led to a reemergence of the study of multiple comparisons procedures. Many authors have recently advocated for control of the false discovery rate (FDR) [1] relative to the traditional familywise type I error (FWER). The Benjamini-Hochberg (B-H) procedure [1] has received much recent attention in that it controls FDR, which is a more liberal error criterion relative to FWER. This can lead to greater power relative to a multiple testing adjustment using the Bonferroni correction.
One issue of considerable interest has been adapting the Benjamini-Hochberg procedure to accommodate dependence. The original paper of Benjamini and Hochberg [1] assumed that the p-values being tested were statistically independent. In follow-up work, Benjamini and Yekutieli [2] showed that the B-H procedure is valid in finite samples under a condition known as positive regression dependence. More generally, recent work on FDR-controlling procedures under dependence has fallen into three tracks. The first has been to develop procedures that are valid under the positive regression dependence condition and related conditions [3,4,5]. The second has been to develop FDR-controlling procedures that are robust to dependence that vanishes asymptotically [6,7]. The third, which is more recent, has been to model the dependence and incorporate it directly into the FDR-controlling procedure. This has been done by several authors [8,9,10,11,12,13]; we will review their proposals in Section 2.2.
In this article, we will develop B-H-type procedures for arbitrary dependence. Our starting point is quite different from the methods listed in the previous paragraph. We will make use of the fact that the B-H procedure sorts p-values; this will lead to a useful characterization of the B-H procedure using spacings [14,15]. The other innovation employed here is the use of wavelets [16,17]. While FDR-controlling procedures have been developed in conjunction with wavelet estimators [18,19], these methods bear no resemblance to the procedures proposed here. The structure of this paper is as follows. In Section 2, we give some background on multiple testing, describe the procedure of Benjamini and Hochberg [1] and review previous proposals for accommodating dependence. We also provide a brief overview of wavelets. In Section 3, we recast the B-H procedure using spacing results and motivate the wavelet-based methodology alluded to in the title of the paper. In addition, we demonstrate FDR control asymptotically and discuss adaptive estimation. Some simulation studies illustrating the proposed methodology, along with applications to several microarray datasets, are given in Section 4. We conclude with some discussion in Section 5.
We deal with the problem of testing $n$ null hypotheses $H_0^1,\ldots,H_0^n$. Note that the superscript indicates which hypothesis is being tested, while the subscript refers to the null hypothesis. For these null hypotheses, we have corresponding p-values $p_1,\ldots,p_n$. Suppose that out of the $n$ hypotheses, $n_0$ are true. We can then use the following 2×2 table to define error metrics for multiple comparisons:
The rows of the table refer to the truth (i.e., whether the null or alternative is correct), while the columns are based on the outcome of the testing procedure. Based on Table 1, the FWER is defined as $P(V\ge 1)$. Benjamini and Hochberg [1] define the FDR to be
$$\mathrm{FDR}\equiv P(Q>0)\,E\left[\frac{V}{Q}\,\Big|\,Q>0\right].$$
Truth | Fail to reject null | Reject null | Total |
True Null | U | V | n0 |
True Alternative | T | S | n1 |
Total | W | Q | n |
In addition, they propose the following algorithm for multiple testing with p1,…,pn:
Box 1. Benjamini and Hochberg (1995) [1] procedure
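(a) Let $p_{(1)}\le p_{(2)}\le\cdots\le p_{(n)}$ denote the ordered, observed p-values.
(b) Define $\alpha$ to be the target FDR. Find $\hat k=\max\{1\le i\le n: p_{(i)}\le \alpha i/n\}$.
(c) If $\hat k$ exists, then reject the null hypotheses corresponding to $p_{(1)}\le\cdots\le p_{(\hat k)}$. Otherwise, reject nothing.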
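The steps in Box 1 can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `benjamini_hochberg` and the example p-values are hypothetical and are not taken from the paper or its code.

```python
def benjamini_hochberg(pvalues, alpha):
    """Return (k_hat, rejected) for the step-up rule of Box 1.

    k_hat is the largest rank i with p_(i) <= alpha * i / n (0 if none);
    rejected[j] is True when hypothesis j is among the k_hat smallest p-values.
    """
    n = len(pvalues)
    # Indices of the hypotheses, ordered by increasing p-value (step a).
    order = sorted(range(n), key=lambda i: pvalues[i])
    k_hat = 0
    for rank, idx in enumerate(order, start=1):
        # Step (b): keep the largest rank that satisfies the inequality.
        if pvalues[idx] <= alpha * rank / n:
            k_hat = rank
    # Step (c): reject the hypotheses with the k_hat smallest p-values.
    rejected = [False] * n
    for idx in order[:k_hat]:
        rejected[idx] = True
    return k_hat, rejected


# Illustrative p-values (made up for this example).
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
k_hat, rejected = benjamini_hochberg(pvals, alpha=0.05)
```

Note that the rule is a step-up rule: the loop does not stop at the first failure, because a later ordered p-value may still fall below its (larger) threshold.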
Benjamini and Hochberg [1] demonstrated the remarkable result that the procedure in Box 1 controls the FDR at level $\alpha$ when the p-values are independent and uniformly distributed. In fact, the procedure controls the FDR at level $n_0\alpha/n$. This has sparked interest in developing adaptive estimators for $n_0$ or, equivalently, $\pi_0\equiv n_0/n$ [7,20,21].
Results on error control of the FDR procedure under dependence fall into two classes: those that achieve exact FDR control in finite samples and those that achieve asymptotic FDR control. For the first class of results, Benjamini and Yekutieli [2] showed that the Benjamini-Hochberg procedure has exact error control under positive regression dependence. More recently, Sarkar [3,4] has developed useful inequalities in order to prove the exact FDR control of generalizations of the Benjamini-Hochberg procedure under positive regression dependence. Blanchard and Roquain [5] and Finner et al. [22] provide proofs of the validity of the B-H procedure and related procedures under various dependence conditions related to those considered by [2].
The second class of results involving FDR procedures is asymptotic in nature. Several authors have identified the equivalence of the B-H procedure with a thresholding rule based on the empirical distribution of the p-values [6,7,23]. Based on this equivalence, one can then prove results about asymptotic control of the B-H procedure provided that the empirical cumulative distributions of the true null hypotheses and false null hypotheses converge to the correct quantities. In recent work, Wu [24] developed a weak convergence result about the B-H procedure that holds under more general dependence conditions.
More recent work has focused on modelling the correlation between test statistics or p-values in the multiple comparisons problem. Efron [10,11] developed a methodology in which the correlation between the test statistics is modelled parametrically using a latent factor representation. This has been extended by Fan et al. [25] and Desai and Storey [26]. This approach has the effect of accommodating deviations in the variance of the test statistics for the null hypothesis by adjustment for correlation. Perone Pacifico et al. [8] use Gaussian random field theory results [27] in order to construct envelopes for the false discovery proportion. In this approach, the covariance between test statistics is modelled parametrically through use of a covariance function. Leek and Storey [34] model the data using a so-called 'dependence kernel' in order to account for correlation. Schwartzman and Lin [13] formulate a model for the mean and variance for the number of rejections. The latter quantity had a term that incorporated correlation between the p-values. Based on the model, they were able to show that the usual FDR-controlling procedure fails to control the FDR asymptotically. Sun and Cai [9] modelled test statistics as arising from a Hidden Markov model. They derived an optimal classification rule using decision-theoretic ideas that was based on what they term as the local index of significance and has a Bayesian interpretation as a posterior probability of differential expression.
The proposed methodology in Section 3.2 will require the use of wavelet estimation procedures. In this section, we provide a brief review of this methodology. The goal is to model a function $y=g(x)+\epsilon$, where $y$ represents an outcome variable, $x$ a fixed predictor variable, $g$ an unknown function and $\epsilon$ an error term. We observe the data $(x_i,y_i)$, $i=1,\ldots,K$. For the purposes of exposition, we will initially assume that $K=2^J$ for some integer $J$.
Wavelets are functions that allow for the decomposition of g by projection onto localized basis functions. We can define an appropriate function sequence as follows. We first define the function ϕ by
$$\phi(x)=\begin{cases}1, & 0\le x\le 1,\\ 0, & \text{else}.\end{cases}$$
The function ϕ(x) is referred to as the Haar father wavelet. Given ϕ, we define
$$V_0=\Big\{f: f(x)=\sum_{k\in Z}c_k\phi(x-k),\ \sum_{k\in Z}c_k^2<\infty\Big\}$$
with $Z$ denoting the set of nonnegative integers. Now recursively define $V_i$ for $i\ge 1$ by $V_i=\{f: f(x)=g(2x),\ g\in V_{i-1}\}$. This leads to a sequence of subspaces $\{V_n, n\ge 0\}$ with the following properties. First, the subspaces are nested, i.e. $V_k\subset V_{k+1}$ for all $k\ge 0$. Second, if we define $V=\bigcup_j V_j$, where the union is taken over all $j\ge 0$, then $V$ is dense in $L^2(R)$. This representation is termed a multiresolution analysis [16]. By orthogonalizing the sequence $\{V_n, n\ge 0\}$, we arrive at the following representation for any $f\in L^2$:
$$f(x)=\sum_{k=0}^{\infty}\alpha_{0k}\phi_{0k}(x)+\sum_{j=0}^{\infty}\sum_{k=0}^{\infty}\beta_{jk}\phi_{jk}(x),$$
where $\alpha_{0k}=\int f(x)\phi_{0k}(x)\,dx$ and $\beta_{jk}=\int f(x)\phi_{jk}(x)\,dx$, the integrals being taken over the real line. The first set of coefficients are referred to as the scaling coefficients, while the latter are termed the detail coefficients. Intuitively, the former capture the trend that is in the data, while the latter model the residuals or noise after the trend is accounted for. Another property is that $\{\phi_{0k},\phi_{jk}\}$, $j,k\ge 0$, define an orthonormal basis.
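To make the split into scaling and detail coefficients concrete, the following sketch computes a discrete Haar transform by the pyramid scheme for a vector whose length is a power of two. It is a plain-Python illustration with hypothetical function names and made-up data; it uses the Haar filter only because that filter has a closed form, and it is not the transform used later in the paper.

```python
import math


def haar_step(x):
    """One pyramid step: pairwise scaled sums (trend) and differences (detail)."""
    half = len(x) // 2
    smooth = [(x[2 * k] + x[2 * k + 1]) / math.sqrt(2) for k in range(half)]
    detail = [(x[2 * k] - x[2 * k + 1]) / math.sqrt(2) for k in range(half)]
    return smooth, detail


def haar_dwt(x):
    """Full Haar decomposition of a sequence whose length is a power of two.

    Returns the final scaling coefficient(s) and a list of detail
    coefficient vectors ordered from the finest to the coarsest level.
    """
    details = []
    smooth = list(x)
    while len(smooth) > 1:
        smooth, detail = haar_step(smooth)
        details.append(detail)
    return smooth, details


# Illustrative data of length 2^3 (made up for this example).
y = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
scaling, details = haar_dwt(y)

# Orthonormality of the basis implies that the sum of squares is preserved.
energy_in = sum(v * v for v in y)
energy_out = sum(v * v for v in scaling) + sum(v * v for d in details for v in d)
```

The single remaining scaling coefficient is proportional to the overall mean of the data (the trend), while each detail vector records local differences at one resolution level.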
While we have described the multiresolution analysis approach assuming the father wavelet is the Haar wavelet, the Haar father wavelet is clearly not smooth. Daubechies [17] argued against the use of such wavelets and proposed an alternative father wavelet that is smooth, compactly supported and almost symmetric. It is referred to in the literature as the symmlet. In contrast to the Haar wavelet, the symmlet does not have a closed analytical form. We will use the symmlet throughout the paper. Computationally, estimation in the wavelet framework proceeds by use of the wavelet transform. Wavelets have several desirable properties. First, they are computationally quite fast. Second, they provide good localization in both the frequency and spatial domains. Finally, by projection onto the orthonormal basis functions, wavelets are effective in decorrelating data. This is the property that is of primary interest for our problem.
An example of wavelet estimators is given in Figure 2, which represents data from financial returns from IBM that can be found in the R package waveslim. The first line shows the observed data; the second through fifth lines are the wavelet coefficients for levels one through four.
As a practical matter, in most situations, the number of tests will not be a dyadic power. In this instance, we will instead use the maximal overlap discrete wavelet transform with a periodic boundary condition [34], which allows for any value of n. The disadvantage is that this is not an orthogonal transformation of the data. In discussing theoretical results in Section 3.3, we will again return to assuming that n is a dyadic power.
We now review some results from [14] on spacings and show how they can be used to recast the B-H procedure. We will begin by assuming that $U_1,\ldots,U_n$ denote a random sample from the Uniform(0, 1) distribution. Then the spacings are defined as $V_i=U_{(i)}-U_{(i-1)}$, for $i=1,\ldots,n+1$, where $U_{(i)}$ denotes the $i$th order statistic of $U$, $U_{(0)}=0$ and $U_{(n+1)}=1$. Note that the joint density of the spacings is given by
$$f(v_1,\ldots,v_n)=\begin{cases}n!, & \text{if } v_i\ge 0 \text{ for all } i \text{ and } \sum_{i=1}^{n+1}v_i=1,\\ 0, & \text{otherwise}.\end{cases} \tag{3.1}$$
Because of the constraint that the $v_i$'s sum to one, the spacings are not statistically independent. However, they do have a common marginal Beta(1, n) distribution, with mean $(n+1)^{-1}$.
Using these facts, we can reexpress the Benjamini-Hochberg procedure in the following way. Letting
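$$\tilde p_i=p_{(i)}-p_{(i-1)},\quad i=1,\ldots,n+1,$$
with $p_{(0)}=0$ and $p_{(n+1)}=1$, the B-H procedure in Box 1 rejects $p_{(1)},\ldots,p_{(\hat k)}$, where
$$\hat k=\max\Big\{i: i^{-1}\sum_{j=1}^{i}\tilde p_j\le \alpha(n+1)n^{-1}E(\tilde p_1)\Big\}, \tag{3.2}$$
with $\hat k=0$ if the set in (3.2) is empty.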
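The equivalence between (3.2) and Box 1 follows because the first $i$ spacings sum to $p_{(i)}$ and, for a uniform sample, $E(\tilde p_1)=(n+1)^{-1}$, so the right-hand side of (3.2) equals $\alpha/n$. The sketch below checks this numerically on random inputs; the function names and the way the test p-values are generated are illustrative assumptions.

```python
import random


def k_hat_box1(p_sorted, alpha):
    """Largest i with p_(i) <= alpha * i / n, or 0 if there is none (Box 1)."""
    n = len(p_sorted)
    ranks = [i for i in range(1, n + 1) if p_sorted[i - 1] <= alpha * i / n]
    return max(ranks) if ranks else 0


def k_hat_spacings(p_sorted, alpha):
    """Same quantity computed from running means of the spacings, as in (3.2)."""
    n = len(p_sorted)
    # Spacings p~_i = p_(i) - p_(i-1) with p_(0) = 0; the last spacing
    # 1 - p_(n) is not needed to compute k_hat.
    spacings = [p_sorted[0]] + [p_sorted[i] - p_sorted[i - 1] for i in range(1, n)]
    expected_spacing = 1.0 / (n + 1)
    cutoff = alpha * (n + 1) / n * expected_spacing  # algebraically alpha / n
    k, running = 0, 0.0
    for i, s in enumerate(spacings, start=1):
        running += s
        if running / i <= cutoff:
            k = i
    return k


# Compare the two formulations on random p-value vectors. Raising uniforms
# to the fourth power is just a device to produce some small p-values.
rng = random.Random(1)
all_match = True
for _ in range(200):
    n = rng.randint(5, 50)
    p_sorted = sorted(rng.random() ** 4 for _ in range(n))
    if k_hat_box1(p_sorted, 0.05) != k_hat_spacings(p_sorted, 0.05):
        all_match = False
```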
The B-H procedure assesses clustering of p-values using first-order differences in the sorted p-values. In particular, gaps between the sorted p-values that are smaller than their expected value, $E(\tilde p_1)$, constitute evidence against the null hypothesis.
There are several insights that are gained by the spacings approach. First, while the p-values for $H_0^1,\ldots,H_0^n$ have a certain joint distribution, it is in fact the joint distribution of the order statistics or, equivalently, of the spacings that matters in determining the operating characteristics of the B-H procedure. The second insight is that the order statistics, by definition, provide an ordering of the p-values. However, they come with the constraint that they are in increasing order. By contrast, there is no such constraint for the spacings, but they still have an ordering. The ordering is isomorphic to a one-dimensional time series. It is precisely this property we will exploit when developing our proposed methodology for multiple testing in the presence of dependence.
The proposed methodology is conceptually simple in nature and can be described as follows.
Box 2. Proposed procedure
(a) Let $p_{(1)}\le p_{(2)}\le\cdots\le p_{(n)}$ denote the ordered, observed p-values.
(b) Define $\alpha$ to be the target FDR. Calculate $\tilde p_1,\ldots,\tilde p_{n+1}$.
(c) Transform $\tilde p_1,\ldots,\tilde p_{n+1}$ into observations $\tilde p^*_1,\ldots,\tilde p^*_{n+1}$ using the wavelet transform.
(d) Find $k^*=\max\{i: i^{-1}\sum_{j=1}^{i}\tilde p^*_j\le\alpha/n\}$.
(e) If $k^*$ exists, then reject the null hypotheses corresponding to $p_{(1)}\le\cdots\le p_{(k^*)}$. Otherwise, reject nothing.
The key step of the procedure in Box 2 is the transformation in step (c). It takes the spacings and transforms them into new data points that we term pseudo-spacings. To compute pseudo-spacings, we will utilize the wavelet methods discussed in Section 3.1. The code, available at https://github.com/ghoshd/WaveBH, applies a non-decimated discrete wavelet transform to the spacings and retains only the level-1 coefficients. The pyramid algorithm [16] was used in the computation of the discrete wavelet transform. In our simulation studies in Section 4.1, we found these procedures to have satisfactory properties compared to existing methods. Optimization of these choices from a practical and theoretical perspective is an open question and one that we leave to future research. The strategy we are taking here can be thought of as transforming the problem into a domain that is better behaved statistically, solving the problem there and back-transforming. What we mean by "better statistical behavior" is that by application of the wavelet transform to the spacings, we will be dealing with "nearly statistically independent" observations due to their being projections onto orthonormal basis functions. Applying the cumulative sum operator to the pseudo-spacings will lead to a B-H procedure on transformed order statistics of "nearly independent" p-values. For these p-values, we can simply apply the usual B-H procedure. We will attempt to formalize this notion more mathematically in the next section.
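The control flow of Box 2 can be sketched as follows, with the transform in step (c) left as a pluggable function. This is an illustration under stated assumptions and does not reproduce the code in the repository above: `transformed_bh` and `periodic_pair_average` are hypothetical names, and `periodic_pair_average` is a simple two-point periodic moving average (the form taken by level-1 Haar smooth coefficients of a non-decimated transform), used here only as a stand-in for the symmlet-based transform described in the text.

```python
def spacings(p_sorted):
    """Spacings p~_1, ..., p~_{n+1} of sorted p-values, with p_(0)=0, p_(n+1)=1."""
    padded = [0.0] + list(p_sorted) + [1.0]
    return [padded[i] - padded[i - 1] for i in range(1, len(padded))]


def periodic_pair_average(x):
    """Hypothetical stand-in for step (c): average each value with its
    predecessor, wrapping around at the start (periodic boundary)."""
    return [(x[i] + x[i - 1]) / 2 for i in range(len(x))]


def transformed_bh(pvalues, alpha, transform):
    """Steps (a)-(e) of Box 2 for a user-supplied transform of the spacings.

    Returns k_star and the indices of the rejected hypotheses.
    """
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])   # step (a)
    p_sorted = [pvalues[i] for i in order]
    pseudo = transform(spacings(p_sorted))               # steps (b) and (c)
    k_star, running = 0, 0.0
    for i in range(1, n + 1):                            # step (d)
        running += pseudo[i - 1]
        if running / i <= alpha / n:
            k_star = i
    return k_star, [order[j] for j in range(k_star)]     # step (e)


# With the identity transform the rule reduces to the B-H procedure of Box 1.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
k_identity, rejected_ids = transformed_bh(pvals, 0.05, transform=lambda s: s)
```

Passing `periodic_pair_average` as `transform` shows one practical subtlety: a periodic boundary mixes the last spacing, $1-p_{(n)}$, which is typically large, into the first pseudo-spacing, so the treatment of the boundary is a design choice that affects the rejections.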
Remark 1. If we compare step (d) of the algorithm in Box 2 with (3.2), we see that the difference is that we have replaced the spacings for the observed p-values with orthogonal transformations thereof. Note that the orthogonal transformation should have no effect on the mean structure of the procedure, so we would expect $E(\tilde p_1)$ and $E(\tilde p^*_1)$ to be the same. However, the orthogonal transforms have the effect of transforming the spacings into a space where the correlation is lower than for the observed spacings.
Remark 2. An obvious question might be why we are not attempting to apply the wavelet transform to the order statistics of the p-values themselves. Note that if the original p-values are iid Uniform(0, 1) random variables, then the marginal distribution for the $i$th order statistic will be Beta$(i,n-i+1)$, $i=1,\ldots,n$. In particular, the $i$th order statistic has a marginal variance that depends on $i$. We will use a working regression model to study the behavior of the proposed procedure from a theoretical point of view, and one of the key assumptions will be constant variance of the error term. Such an assumption would be violated on the original order statistic scale.
Remark 3. In order to develop some intuition behind this procedure, we recall the work of [7]. In their Lemma 3, they show that in the situation of the null p-values being independent, the process
$$\frac{\sum_{i\in N}I(p_i\le t)}{t}\equiv\frac{\sum_{i=1}^{n}I(p_{(i)}\le t,\ i\in N)}{t},$$
where $N$ represents the indices for the true null hypotheses, is a martingale with respect to the filtration $F_t=\sigma(I\{p_i\le s\},\ t\le s\le 1,\ i=1,\ldots,n)$. The application of the wavelet transform in our step (c) can be viewed as an attempt to compute orthogonal martingale increments for this process. Again, the key property of wavelets that we exploit is the fact that they represent orthogonal projections.
Remark 4. As pointed out by a reviewer, the choice of orthogonal basis is relatively arbitrary. We have also implemented the methodology using a Fourier basis, but we found the methodology to have substantially lower power in the simulation studies described in Section 4. How to select an optimal basis onto which to project the spacings remains an open topic for future investigation.
As alluded to earlier, we will now assume that n, the number of tests, is a dyadic power in this section. To provide theoretical justification for the proposed procedure, we first consider the case of independent p-values. As noted previously, the spacings will be distributed as dependent Beta(1, n) random variables. Then it can be shown through some calculation that the covariance between any two spacings $\tilde p_i$ and $\tilde p_j$, $i\ne j$, is given by
$$\mathrm{Cov}(\tilde p_i,\tilde p_j)=-\frac{1}{(n+1)^2(n+2)}$$
and that the variance of $\tilde p_i$ is $n(n+1)^{-2}(n+2)^{-1}$. Thus, the spacings corresponding to a random sample of p-values can be conceptualized as a time series with exchangeable correlation structure between any two time points. While this is certainly a nonstandard type of time series, we will consider other dependence structures later in the section.
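The mean, variance and covariance of the spacings stated above can be checked by a small Monte Carlo experiment. The sketch below is illustrative only: the sample size, number of replications, seed and function names are arbitrary choices, and only the first two spacings are tracked.

```python
import random


def simulate_spacings(n, reps, seed=0):
    """Draw `reps` uniform samples of size n; return the first two spacings of each."""
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(reps):
        u = sorted(rng.random() for _ in range(n))
        first.append(u[0])           # p~_1 = U_(1) - 0
        second.append(u[1] - u[0])   # p~_2 = U_(2) - U_(1)
    return first, second


def mean(x):
    return sum(x) / len(x)


def cov(x, y):
    """Sample covariance; cov(x, x) is the sample variance."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


n = 5
first, second = simulate_spacings(n, reps=100_000)

theory_mean = 1 / (n + 1)
theory_var = n / ((n + 1) ** 2 * (n + 2))
theory_cov = -1 / ((n + 1) ** 2 * (n + 2))

empirical = (mean(first), cov(first, first), cov(first, second))
```

The empirical covariance is negative, as expected: because the spacings sum to one, a large spacing must be offset by smaller ones elsewhere.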
In the case of independent p-values, we have that the B-H procedure controls the FDR at level α. It can be reexpressed in terms of the following lemma based on our spacings representation [15]:
Lemma 1. Suppose that the p-values $p_1,\ldots,p_n$ correspond to a random sample from the Uniform(0,1) distribution and that we reject hypotheses $p_{(1)},\ldots,p_{(\hat k)}$, where
$$\hat k=\max\Big\{1\le i\le n: i^{-1}\sum_{j=1}^{i}\tilde p_j\le\alpha/n\Big\}.$$
Note that if the set is empty, we fail to reject any hypotheses. Then the FDR of the procedure is controlled at level $\alpha$.
If we apply the wavelet transformation $W$ to the spacings, then this will not change the joint distribution of the spacings because the wavelet estimator is an orthogonal transformation. Thus, in theory we could replace $\tilde p_i$ by $\tilde p^*_i$, $i=1,\ldots,n$, in the statement of Lemma 1.
We now invoke a working regression model that captures the essential features of this problem. The model is given by the following:
$$\tilde p_j=f(j/n)+\epsilon_j, \tag{3.3}$$
where $f$ is an unknown function, and $\epsilon_j$ is an error term normally distributed with zero mean and constant variance. Matching this up with the case of the p-values coming from a random sample, we have that $f(x)=(n+1)^{-1}$ and that the variance of $\epsilon_j$ is $n\{(n+1)^2(n+2)\}^{-1}$. While a Gaussian error structure cannot be strictly true for the spacings, it nevertheless is useful in finding insights about the proposed multiple testing procedure.
We first consider the case of (3.3) with $\epsilon_j$, $j=1,\ldots,n$, coming from a stationary Gaussian process with covariance $r_l=\mathrm{Cov}(\epsilon_j,\epsilon_{j+l})$. We next prove the following result.
Theorem 1. Assume that
$$\sum_{l=-\infty}^{\infty}|r_l|<\infty.$$
Then the proposed procedure in Box 2 asymptotically controls the FDR at level $\alpha$.
Proof. Let $[a]$ denote the greatest integer less than or equal to $a$. We first define the process
$$\tilde P_n(t)=\frac{1}{n}\sum_{j=1}^{[nt]}\tilde p_j=n^{-1}\sum_{j=1}^{[nt]}f(j/n)+n^{-1}\sum_{j=1}^{[nt]}\epsilon_j=F_n(t)+\epsilon_n(t).$$
Then by Lemma 5.1 of Taqqu [33], $n^{1/2}\{\tilde P_n(t)-F_n(t)\}$ converges in distribution to $\tau B(t)$, where $B(t)$ is a Brownian motion and $\tau^2=\sum_{l=-\infty}^{\infty}r_l$. Now define $\epsilon=\tau n^{-1/2}$. Then $\tilde P_n(t)$ can be approximated by
$$\tilde P(t)=F(t)+\epsilon B(t), \tag{3.4}$$
where $F(t)=\int_0^t f(s)\,ds$. Now define $\psi_{jk}(t)\equiv 2^{j/2}\psi(2^jt-k)$ as a wavelet basis on the real line using a wavelet $\psi$ with compact support. Define $\{\tilde\psi_l, l\in L\}$ to be the corresponding wavelet basis on $[0,1]$; one such construction can be found in Cohen et al. (1992) [34]. The set $L$ is defined by pairs $(j,k)$, $j\ge j_0$, $k=1,\ldots,2^j$ for the wavelet functions and $(j_0-1,k)$ for the scaling functions. Next, we form inner products
$$y_l=\int\tilde\psi_l\,d\tilde P,$$
$$\mu_l=\int\tilde\psi_l f,$$
and $z_l=\int\tilde\psi_l\,dB$. As $n\to\infty$, we assume that $\mu_l$ approaches zero. This leads to the following equivalent model relative to (3.3):
$$y_l=\epsilon z_l,\quad l\in L.$$
While the distribution of $z_l$ at a given level is not spherically normal, we can argue as in Johnstone and Silverman ([30], p. 337) that this induces only a slight approximation error so that we can use the following model as being equivalent to (3.3):
$$y_l=\epsilon z_l,\quad l\in L,\quad z_l\overset{\text{i.i.d.}}{\sim}N(0,1). \tag{3.5}$$
For a given level $j_0$, the process $\sum_{l\in N}y_l$ is a martingale, so we can apply arguments as in Lemmas 3 and 4 of Storey et al. [7] to prove that the procedure based on rejecting hypotheses $p_{(1)},\ldots,p_{(\tilde k)}$, where
$$\tilde k=\max\Big\{1\le i\le n: i^{-1}\sum_{j=1}^{i}y_j\le\alpha/n\Big\},$$
will control the false discovery rate at level $\alpha$. Because of the equivalence of (3.5) with (3.3), we have the desired result, which corresponds to step (d) of the proposed algorithm. Because the inverse wavelet transform is an orthogonal transformation, the FDR remains controlled after back-transformation. This completes the proof.
The type of dependence being considered in Theorem 1 can be thought of as short-range dependence. However, because the dependence structure is for the spacings of the p-values, this induces a different dependence structure for the p-values themselves that might not be short-range dependence. As an example, the theorem would cover the situation of the spacings being from an autoregressive process of order 2. A simulated example of the spacings and ordered p-values from such a process is shown in Figure 2. On the p-value scale, there is induced long-range dependence as a result of averaging over the local correlations from the left-hand graph in Figure 2.
We now discuss how the theoretical results could be extended to the case of a more long-range dependence structure in the spacings. Assume that the spacings satisfy fractional Brownian motion:
$$\mathrm{Cov}(\tilde p_j,\tilde p_k)=\frac{V_H}{2}\left(|j|^{2H}+|k|^{2H}-|j-k|^{2H}\right), \tag{3.6}$$
where $V_H=-\Gamma(2-2H)\cos(\pi H)[\pi H(2H-1)]^{-1}$, and $H$ is the so-called Hurst parameter. Note that there is a one-to-one correspondence between $H$ and the decay parameter $\alpha$ if we assume that the autocorrelation function of the spacings is of the form $Ak^{-\alpha}$ for $0<\alpha<1$. It is given by $H=1-\alpha/2$. We can then use the arguments in the proof of Theorem 1 to approximate $\tilde P_n(t)$ by
$$\tilde P(t)=F(t)+\epsilon^{\alpha}B_H(t),$$
where $B_H(t)$ is a fractional Brownian motion process, i.e. a zero-mean Gaussian process with covariance function given by the continuous analog of (3.6). Then proceeding as in the proof of Theorem 1, we have that an equivalent model is given by
$$y_l=\epsilon^{\alpha}\lambda_j z_l,\quad l\in L, \tag{3.7}$$
where $\lambda_j^2=2^{-j(1-\alpha)}\tau^2$, $\tau=2A\{(1-\alpha)(1-2\alpha)\}^{-1}$ and
$$z_l=\lambda_j^{-1}\int\tilde\psi_l\,dB_H.$$
We point out that from Johnstone and Silverman [30], the zl are not independent but have bounded dependence in that there exists a constant ρ0 such that 0<ρ0≤Var(zl|zm,m≠l)≤1. One can then develop a B-H style rule in the transformed space and back-transform as in our proposed procedure.
For many situations in multiple testing, we expect there to be a large number of true null hypotheses. As alluded to earlier, one finding of Benjamini and Hochberg [1] was that their procedure controlled the FDR at level n0α/n in the setting where the test statistics for testing n hypotheses are independent. This finding suggests the possibility of estimating n0 from the data to improve power in the B-H procedure. A large amount of literature has been devoted to developing estimators of n0 or equivalently, π0≡n0/n. A nice overview of estimators of this quantity can be found in [20].
Given the nature of our testing procedure in Box 2, it seems natural to consider the following adaptive version of the procedure under dependence:
Box 3. Proposed adaptive procedure
(a) Let $p_{(1)}\le p_{(2)}\le\cdots\le p_{(n)}$ denote the ordered, observed p-values.
(b) Define $\alpha$ to be the target FDR. Calculate $\tilde p_1,\ldots,\tilde p_{n+1}$.
(c) Transform $\tilde p_1,\ldots,\tilde p_{n+1}$ into observations $\tilde p^*_1,\ldots,\tilde p^*_{n+1}$ using the wavelet transform.
(d) Estimate $n_0$ using the algorithm of [21].
(e) Find $\tilde k^*=\max\{i: i^{-1}\sum_{j=1}^{i}\tilde p^*_j\le\alpha/\hat n_0\}$.
(f) If $\tilde k^*$ exists, then reject the null hypotheses corresponding to $p_{(1)}\le\cdots\le p_{(\tilde k^*)}$. Otherwise, reject nothing.
We make several observations here. First, while we are using the procedure of Benjamini et al. [21] to estimate $n_0$ in step (d), any procedure that estimates $\pi_0$ could be used in step (d) simply by converting to an estimate of $n_0$ through $\hat n_0=n\hat\pi_0$. Second, the only difference between the procedure in Box 3 and that in Box 2 is the use of an estimator of $n_0$ for determining the cutoff in step (e). The procedure in Box 2 implicitly assumes that $n_0=n$. Examining the proof of Theorem 1, it can be shown that, in the transformed space, the procedure actually controls the FDR at level $(n_0/n)\alpha$. Thus, slight modifications of the proof of Theorem 1 lead to the following result:
Theorem 2. Assume that
$$\sum_{l=-\infty}^{\infty}|r_l|<\infty.$$
Then the proposed procedure in Box 3 asymptotically controls the FDR at level $n_0\alpha/n$.
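As an illustration of how an estimate of $\pi_0$ enters the cutoff of the adaptive procedure through $\hat n_0=n\hat\pi_0$, the sketch below uses a Storey-type estimator, $\hat\pi_0(\lambda)=\#\{p_i>\lambda\}/\{n(1-\lambda)\}$. This estimator is swapped in for simplicity and is not the algorithm of [21]; the transform step is taken to be the identity, and the function names, the choice $\lambda=0.5$ and the p-values are assumptions made for the example.

```python
def storey_pi0(pvalues, lam=0.5):
    """Storey-type estimate of pi_0: share of p-values above lam, rescaled by 1 - lam."""
    n = len(pvalues)
    return min(1.0, sum(p > lam for p in pvalues) / (n * (1 - lam)))


def adaptive_k(pvalues, alpha, lam=0.5):
    """Cutoff index with alpha / n0_hat in place of alpha / n.

    The transform step is the identity here, so the running mean of the
    first i spacings equals p_(i) / i.
    """
    n = len(pvalues)
    n0_hat = max(1.0, n * storey_pi0(pvalues, lam))  # n0_hat = n * pi0_hat
    k = 0
    for i, p in enumerate(sorted(pvalues), start=1):
        if p / i <= alpha / n0_hat:
            k = i
    return k, n0_hat


# Illustrative p-values (made up for this example).
pvals = [0.001, 0.008, 0.012, 0.041, 0.042, 0.060, 0.074, 0.205, 0.612, 0.816]
k_adaptive, n0_hat = adaptive_k(pvals, alpha=0.05)

# Non-adaptive counterpart: the same rule with n0 fixed at n.
n = len(pvals)
k_plain = max(
    [i for i, p in enumerate(sorted(pvals), start=1) if p / i <= 0.05 / n],
    default=0,
)
```

In this example the estimated number of true nulls is well below $n$, so the adaptive cutoff is less stringent and more hypotheses are rejected than with the non-adaptive rule.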
In this section, we use simulation to compare the methodologies proposed in this paper with the Benjamini-Hochberg procedure and the Benjamini-Yekutieli correction [2], the latter of which is known to be valid under arbitrary dependence. We wished to compare procedures that only involve the univariate p-values. Two competitors that we do not show here are the Bonferroni procedure and the correlation-adjusted empirical null hypothesis methodology of Efron (2010) [11]. There were two drawbacks to the latter. First, when $\pi_0$ is small, the methodology fails to give a numerically stable estimator for $\pi_0$. Second, in situations where $\pi_0$ could be estimated, the performance of the method was similar to that of the Bonferroni correction, and both were worse than the methods presented in the tables (data not shown). We considered several correlation scenarios. The one we report on in this paper is the "clumpy dependence" scenario that was also considered in [7]. This corresponds to the situation in genomics where genes act in networks. The null statistics are distributed as standard normal random variables, while the true alternatives have normal distributions with mean two and variance one. For these simulations, 3000 test statistics were correlated in groups of ten with correlation of $\pm\rho$. Formally, for $1\le k\le l\le 10$,
$$\Sigma_{kl}=\begin{cases}1, & k=l,\\ \rho, & k<l\le 5,\\ -\rho, & k\le 5,\ l>5,\end{cases}$$
and the covariance of the test statistics is the Kronecker product of Σ with a 300×300 identity matrix. The proportion of true null hypotheses was set to be π0=0.2,0.5 and 0.9. For each simulation setting, 1000 simulated datasets were generated.
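A data-generating step of this kind could be sketched as follows. Several details are assumptions made for illustration and are not stated in the text: the correlation within the second half of each block (indices above five) is set to $\rho$, the null hypotheses are assigned at random with probability $\pi_0$, and the p-values are one-sided. All function names are hypothetical. Under the first assumption each block can be drawn as independent noise plus a shared factor whose sign flips between the two halves of the block.

```python
import math
import random


def clumpy_block_cov(rho, size=10, half=5):
    """Block covariance: 1 on the diagonal, rho within a half, -rho across halves.

    ASSUMPTION: correlation rho within the second half of the block.
    """
    sigma = [[0.0] * size for _ in range(size)]
    for k in range(size):
        for l in range(size):
            if k == l:
                sigma[k][l] = 1.0
            elif (k < half) == (l < half):
                sigma[k][l] = rho
            else:
                sigma[k][l] = -rho
    return sigma


def simulate_statistics(n_blocks, rho, pi0, mu_alt=2.0, seed=0):
    """Draw test statistics in independent blocks of ten (block-diagonal covariance)."""
    rng = random.Random(seed)
    size, half = 10, 5
    stats, is_null = [], []
    for _ in range(n_blocks):
        shared = rng.gauss(0, 1)  # factor shared by the whole block
        for k in range(size):
            sign = 1.0 if k < half else -1.0
            # Unit variance; covariance +/- rho with the other block members.
            noise = math.sqrt(1 - rho) * rng.gauss(0, 1) + math.sqrt(rho) * sign * shared
            null = rng.random() < pi0  # ASSUMPTION: random assignment of nulls
            stats.append(noise + (0.0 if null else mu_alt))
            is_null.append(null)
    return stats, is_null


sigma = clumpy_block_cov(0.4)
stats, is_null = simulate_statistics(n_blocks=300, rho=0.4, pi0=0.5)
# ASSUMPTION: one-sided p-values, P(Z >= z) for a standard normal Z.
pvals = [0.5 * math.erfc(z / math.sqrt(2)) for z in stats]
```

Sampling block by block avoids forming the full 3000×3000 covariance matrix, which is block diagonal with 300 copies of $\Sigma$.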
We first show the estimated FDRs at two thresholds, 0.05 and 0.01, of the various procedures in Table 2.
π0 | ρ0 | α | BH | BY | Wavelet | Wavelet-Adapt |
0.2 | 0.2 | 0.01 | 0.02(0.01) | 0.002(0.0003) | 0.009(0.001) | 0.0097(0.001) |
0.2 | 0.4 | 0.01 | 0.029(0.01) | 0.002(0.0003) | 0.009(0.002) | 0.0098(0.001) |
0.2 | 0.6 | 0.01 | 0.04(0.01) | 0.002(0.0003) | 0.009(0.001) | 0.0099(0.001) |
0.2 | 0.8 | 0.01 | 0.031(0.01) | 0.002(0.0003) | 0.009(0.002) | 0.0098(0.002) |
0.2 | 0.2 | 0.01 | 0.013(0.01) | 0.003(0.019) | 0.009(0.001) | 0.0091(0.001) |
0.2 | 0.4 | 0.01 | 0.019(0.01) | 0.003(0.0004) | 0.019(0.001) | 0.0098(0.002) |
0.2 | 0.6 | 0.01 | 0.028(0.01) | 0.002(0.003) | 0.019(0.004) | 0.0099(0.002) |
0.2 | 0.8 | 0.01 | 0.035(0.01) | 0.003(0.001) | 0.019(0.002) | 0.0097(0.002) |
0.5 | 0.2 | 0.05 | 0.06(0.01) | 0.001(0.002) | 0.024(0.002) | 0.039(0.002) |
0.5 | 0.4 | 0.05 | 0.059(0.01) | 0.001(0.002) | 0.024(0.002) | 0.039(0.002) |
0.5 | 0.6 | 0.05 | 0.06(0.01) | 0.001(0.002) | 0.024(0.002) | 0.039(0.002) |
0.5 | 0.8 | 0.05 | 0.054(0.01) | 0.001(0.002) | 0.024(0.003) | 0.039(0.004) |
0.5 | 0.2 | 0.01 | 0.02(0.01) | 0.002(0.003) | 0.005(0.002) | 0.006(0.002) |
0.5 | 0.4 | 0.01 | 0.02(0.01) | 0.002(0.001) | 0.005(0.0002) | 0.006(0.001) |
0.5 | 0.6 | 0.01 | 0.02(0.01) | 0.002(0.001) | 0.005(0.002) | 0.007(0.001) |
0.5 | 0.8 | 0.01 | 0.03(0.01) | 0.002(0.002) | 0.005(0.002) | 0.006(0.003) |
0.9 | 0.2 | 0.05 | 0.06(0.01) | 0.01(0.003) | 0.037(0.005) | 0.039(0.005) |
0.9 | 0.4 | 0.05 | 0.055(0.01) | 0.01(0.003) | 0.037(0.006) | 0.039(0.006) |
0.9 | 0.6 | 0.05 | 0.062(0.01) | 0.012(0.004) | 0.037(0.007) | 0.040(0.007) |
0.9 | 0.8 | 0.05 | 0.057(0.01) | 0.012(0.006) | 0.038(0.009) | 0.041(0.009) |
0.9 | 0.2 | 0.01 | 0.016(0.01) | 0.003(0.002) | 0.003(0.002) | 0.004(0.002) |
0.9 | 0.4 | 0.01 | 0.021(0.01) | 0.003(0.003) | 0.004(0.002) | 0.004(0.002) |
0.9 | 0.6 | 0.01 | 0.019(0.01) | 0.003(0.003) | 0.005(0.003) | 0.005(0.003) |
0.9 | 0.8 | 0.01 | 0.027(0.01) | 0.003(0.004) | 0.004(0.0003) | 0.004(0.003) |
Several interesting findings arise from this table. For BH, we see that FDR control is not achieved in this clumpy dependence scenario, a finding also noted in Storey (2002). As is well known, the Benjamini-Yekutieli procedure is quite conservative in its control of the FDR. In this regard, the proposed methods (the 'Wavelet' and 'Wavelet-Adapt' columns) tend to be much closer to the nominal FDR levels. They tend to do better for smaller values of $\pi_0$, although there is a suggestion of anticonservatism of the adaptive method for $\pi_0=0.5$. The results are largely insensitive to the correlation level, which is an interesting finding. This is in keeping with the idea that the wavelet transform is effective at removing correlation between the p-values. Next, we study the corresponding powers, shown in Table 3.
π0 | ρ0 | α | BH | BY | Wavelet | Wavelet-Adapt |
0.2 | 0.2 | 0.05 | 0.99(0.02) | 0.63(0.005) | 0.98(0.01) | 0.99(0.02) |
0.2 | 0.4 | 0.05 | 0.99(0.02) | 0.63(0.005) | 0.98(0.01) | 0.99(0.03) |
0.2 | 0.6 | 0.05 | 0.99(0.01) | 0.63(0.005) | 0.98(0.01) | 0.99(0.03) |
0.2 | 0.8 | 0.05 | 0.99(0.01) | 0.63(0.005) | 0.98(0.01) | 0.99(0.02) |
0.2 | 0.2 | 0.01 | 0.92(0.02) | 0.31(0.005) | 0.63(0.05) | 0.88(0.02) |
0.2 | 0.4 | 0.01 | 0.93(0.01) | 0.31(0.004) | 0.63(0.05) | 0.89(0.02) |
0.2 | 0.6 | 0.01 | 0.94(0.01) | 0.31(0.005) | 0.63(0.05) | 0.88(0.02) |
0.2 | 0.8 | 0.01 | 0.92(0.01) | 0.31(0.004) | 0.63(0.05) | 0.89(0.02) |
0.5 | 0.2 | 0.05 | 0.91(0.01) | 0.40(0.004) | 0.74(0.005) | 0.86(0.006) |
0.5 | 0.4 | 0.05 | 0.92(0.01) | 0.40(0.003) | 0.75(0.005) | 0.85(0.004) |
0.5 | 0.6 | 0.05 | 0.91(0.02) | 0.40(0.004) | 0.74(0.005) | 0.86(0.005) |
0.5 | 0.8 | 0.05 | 0.92(0.02) | 0.39(0.003) | 0.75(0.005) | 0.85(0.004) |
0.5 | 0.2 | 0.01 | 0.09(0.002) | 0.30(0.003) | 0.60(0.005) | 0.70(0.006) |
0.5 | 0.4 | 0.01 | 0.09(0.002) | 0.30(0.003) | 0.60(0.006) | 0.69(0.007) |
0.5 | 0.6 | 0.01 | 0.09(0.002) | 0.30(0.003) | 0.60(0.006) | 0.70(0.007) |
0.5 | 0.8 | 0.01 | 0.09(0.002) | 0.30(0.002) | 0.59(0.005) | 0.69(0.006) |
0.9 | 0.2 | 0.05 | 0.11(0.001) | 0.03(0.001) | 0.05(0.002) | 0.05(0.002) |
0.9 | 0.4 | 0.05 | 0.12(0.001) | 0.03(0.001) | 0.05(0.002) | 0.05(0.002) |
0.9 | 0.6 | 0.05 | 0.12(0.001) | 0.03(0.001) | 0.05(0.002) | 0.05(0.002) |
0.9 | 0.8 | 0.05 | 0.12(0.001) | 0.03(0.001) | 0.05(0.001) | 0.05(0.002) |
0.9 | 0.2 | 0.01 | 0.016(0.001) | 0.015(0.001) | 0.012(0.002) | 0.012(0.002) |
0.9 | 0.4 | 0.01 | 0.018(0.001) | 0.014(0.001) | 0.012(0.002) | 0.012(0.002) |
0.9 | 0.6 | 0.01 | 0.018(0.001) | 0.015(0.001) | 0.012(0.002) | 0.012(0.002) |
0.9 | 0.8 | 0.01 | 0.019(0.001) | 0.015(0.001) | 0.012(0.002) | 0.012(0.002) |
In terms of power, the proposed methods appear to have slightly larger power than B-Y. While B-H has high power, keep in mind that it violated FDR control in Table 2, so the comparison with our proposed methodology is not a fair one. In this simulation setup, the main determinants of the power gain appear to be α and π0. For smaller values of π0, the proposed methods have bigger power gains relative to existing methods. This advantage diminishes as π0→1. Again, the methodology is quite resistant to the correlation structure.
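To make the quantities reported in Tables 2 and 3 concrete, the following is a minimal sketch of how a false discovery proportion and a power value could be computed for one simulated dataset, using the V, S, Q and n1 notation of the two-by-two table. The function name `fdp_and_power` is hypothetical, and the definition of power as S/n1 is an assumption; the simulation code used in the paper may define or average these quantities differently.

```python
def fdp_and_power(is_null, rejected):
    """Return (false discovery proportion, power) for one simulated dataset.

    is_null[i]  -- True if hypothesis i is a true null
    rejected[i] -- True if the procedure rejected hypothesis i
    Assumption: power is taken to be S / n1, the fraction of true
    alternatives that are rejected.
    """
    # V: true nulls that were rejected (false discoveries)
    V = sum(1 for h0, r in zip(is_null, rejected) if h0 and r)
    # S: true alternatives that were rejected (true discoveries)
    S = sum(1 for h0, r in zip(is_null, rejected) if not h0 and r)
    Q = V + S                                  # total number of rejections
    n1 = sum(1 for h0 in is_null if not h0)    # number of true alternatives
    fdp = V / Q if Q > 0 else 0.0              # convention: FDP = 0 when Q = 0
    power = S / n1 if n1 > 0 else 0.0
    return fdp, power


if __name__ == "__main__":
    truth = [True, True, False, False, False]
    calls = [True, False, True, True, False]
    print(fdp_and_power(truth, calls))
```

Averaging the first component over simulation replicates gives an estimate of the FDR, and averaging the second gives an estimate of the power.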
We applied the proposed methodology to several gene expression datasets. The datasets came from the LBE library in R. The first was the breast cancer gene expression dataset from Hedenfalk et al. [31]. While there were three types of breast cancer samples that were profiled, we focus here on samples with the BRCA1 mutation (seven samples) and those with the BRCA2 mutation (eight samples). After microarray preprocessing, there were 3170 genes under consideration. At a false discovery rate of 0.05, the Bonferroni procedure selected two genes as significant. The Benjamini-Yekutieli (2001) procedure selected three genes as significant [2]. The Benjamini-Hochberg procedure selected 162 genes as significant. The proposed methods in Boxes 2 and 3 selected 92 and 93 genes, respectively, as statistically significant at a false discovery rate level of 0.05.
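For readers who want to reproduce the baseline comparisons, the sketch below implements the three standard procedures named above (Bonferroni, Benjamini-Hochberg and Benjamini-Yekutieli) on a plain list of p-values. It does not implement the wavelet procedures of Boxes 2 and 3. The function names and the toy p-values are illustrative assumptions and are not taken from the paper's code.

```python
def bonferroni(pvals, alpha):
    """Reject p-values at or below alpha / n."""
    n = len(pvals)
    return [p <= alpha / n for p in pvals]


def step_up(pvals, alpha, penalty=1.0):
    """Generic linear step-up rule.

    Finds the largest rank k with p_(k) <= k * alpha / (n * penalty)
    and rejects the k smallest p-values.
    """
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * alpha / (n * penalty):
            k = rank
    rejected = [False] * n
    for i in order[:k]:
        rejected[i] = True
    return rejected


def benjamini_hochberg(pvals, alpha):
    """BH (1995): step-up rule with no penalty."""
    return step_up(pvals, alpha)


def benjamini_yekutieli(pvals, alpha):
    """BY (2001): step-up rule with harmonic-sum penalty 1 + 1/2 + ... + 1/n."""
    n = len(pvals)
    c_n = sum(1.0 / j for j in range(1, n + 1))
    return step_up(pvals, alpha, penalty=c_n)


if __name__ == "__main__":
    # Toy p-values for illustration only; not from either dataset.
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    for name, proc in [("Bonferroni", bonferroni),
                       ("BY", benjamini_yekutieli),
                       ("BH", benjamini_hochberg)]:
        print(name, sum(proc(p, 0.05)))
```

Because the BY thresholds are the BH thresholds divided by the harmonic sum, BY can never reject more hypotheses than BH at the same level, which matches the ordering of the gene counts reported above.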
The next dataset that was studied was the acute leukemia (AML/ALL) dataset from the Golub et al. [32] gene expression study. In this study, there were n=7129 p-values considered, calculated from unpooled two-sample t-tests comparing the samples in the AML group (11 samples) to those in the ALL group (27 samples). Using an FDR of 0.05, the Bonferroni adjustment selected 243 genes, the Benjamini-Yekutieli (2001) procedure [2] selected 527, the Benjamini-Hochberg (1995) procedure [1] selected 1519, and the procedures in Boxes 2 and 3 selected 1224 and 1331 genes, respectively.
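As an illustration of the unpooled (Welch) two-sample t-test mentioned above, the sketch below computes the test statistic and the Welch-Satterthwaite degrees of freedom for a single gene using only the Python standard library. The function name `welch_t` and the toy expression values are assumptions made for illustration. Converting the statistic to a p-value requires a t-distribution CDF, which the standard library does not provide; `scipy.stats.ttest_ind(x, y, equal_var=False)` is one way to obtain it.

```python
import math
from statistics import mean, variance


def welch_t(x, y):
    """Unpooled two-sample t statistic and Welch-Satterthwaite df.

    x, y -- expression values for one gene in the two groups
            (each group needs at least two observations with
            nonzero sample variance in at least one group).
    """
    nx, ny = len(x), len(y)
    vx = variance(x) / nx   # squared standard error of mean(x)
    vy = variance(y) / ny   # squared standard error of mean(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df


if __name__ == "__main__":
    # Toy values for illustration only; not from the Golub et al. data.
    group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
    group_b = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
    print(welch_t(group_a, group_b))
```

Unlike the pooled test, the degrees of freedom here depend on the two sample variances, so they generally differ from gene to gene.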
In this paper, we have developed a B-H-type procedure that accommodates dependence through the use of the wavelet transform. This is done via a spacings decomposition of the B-H procedure [15] to which the wavelet transform is applied. The proposed methodology attempts to keep the structure of the B-H procedure while at the same time removing dependence using the wavelet transform. The intuition behind the procedure is that it attempts to compute orthogonal increments in the wavelet space so that the procedure in the transformed space works with 'effectively independent' data. We have shown the superiority of the proposed methodology in the 'clumpy dependence' model that was studied in [7]. This situation has proven to be challenging to most multiple testing procedures.
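The two ingredients named above, spacings of the ordered p-values and a level-1 wavelet transform, can be sketched as follows. This is not the procedure of Boxes 2 and 3: it only shows how spacings might be formed and passed through one level of an orthonormal transform. The choice of the Haar filter, the convention p_(0) = 0, and the function names are assumptions made for illustration; the wavelet family used in the paper is not stated in this section.

```python
import math


def spacings(pvals):
    """Spacings d_i = p_(i) - p_(i-1) of the ordered p-values, with p_(0) = 0."""
    out = []
    prev = 0.0
    for p in sorted(pvals):
        out.append(p - prev)
        prev = p
    return out


def haar_level1(x):
    """One level of the orthonormal Haar transform of an even-length sequence.

    Returns (approximation, detail) coefficients, each of length len(x) // 2.
    """
    if len(x) % 2 != 0:
        raise ValueError("haar_level1 expects an even-length input")
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


if __name__ == "__main__":
    # Toy p-values for illustration only.
    d = spacings([0.4, 0.1, 0.2, 0.8])
    a, w = haar_level1(d)
    print(d, a, w)
```

Because the transform is orthonormal, the sum of squares of the coefficients equals that of the spacings, so no information is lost at this step. The spacings also sum to the largest p-value, which is why they are confined to the [0, 1] range.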
There are many potential extensions of this work. First, the wavelet procedure was not optimized in any sense. We simply used the information from the level-1 wavelet transform. It might be possible to combine information from multiple levels in order to improve the power of the procedure. Second, the wavelet basis function might not make sense because of the restriction of spacings to the [0, 1] range. While we did some comparisons using Fourier basis functions, another open problem is to see whether the use of a different orthonormal basis might lead to different performance of the procedure. The procedure is quite scalable to handling large numbers of p-values. Finally, the intuition behind the procedure is that we transform the problem to one where there is 'independent information.' This elementary principle should be transferable to other problems in statistics. We have made available the code for implementing the functions and their application to the real datasets in Section 4.2 on GitHub (https://github.com/ghoshd/WaveBH).
The author declares there is no conflict of interest.
[1] | Y. Benjamini and Y. Hochberg, Controlling the false discovery rate: a practical and powerful approach to multiple testing, J. R. Stat. Soc. B, 57 (1995), 289-300. |
[2] | Y. Benjamini and D. Yekutieli, The control of the false discovery rate in multiple testing under dependency, Ann. Stat., 29 (2001), 1165-1188. |
[3] | S. K. Sarkar, Some results on false discovery rates in stepwise multiple testing procedures, Ann. Stat., 30 (2002), 239-257. |
[4] | S. K. Sarkar, False discovery and false nondiscovery rates in single-step multiple testing procedures, Ann. Stat., 34 (2006), 394-415. |
[5] | G. Blanchard and E. Roquain, Two simple sufficient conditions for FDR control, Electron. J. Statist., 2 (2008), 963-992. |
[6] | C. R. Genovese and L. Wasserman, A stochastic process approach to false discovery control, Ann. Stat., (2004), 1035-1061. |
[7] | J. D. Storey, J. E. Taylor and D. Siegmund, Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach, J. R. Stat. Soc. B, 66 (2004), 187-205. |
[8] | M. P. Pacifico, C. Genovese, I. Verdinelli, et al., False discovery control for random fields, J. Am. Stat. Assoc., 99 (2004), 1002-1014. |
[9] | W. Sun and T. Cai, Large-scale multiple testing under dependency, J. R. Stat. Soc. B, 71 (2009), 393-424. |
[10] | B. Efron, Correlation and large-scale simultaneous significance testing, J. Am. Stat. Assoc., 102 (2007), 93-103. |
[11] | B. Efron, Correlated z-values and the accuracy of large-scale statistical estimates, J. Am. Stat. Assoc., 105 (2010), 1042-1055. |
[12] | J. T. Leek and J. D. Storey, Capturing Heterogeneity in Gene Expression Studies by Surrogate Variable Analysis, PLoS Genet., 3 (2007), e161. |
[13] | A. Schwartzman and X. Lin, The effect of correlation on false discovery rate estimation, Biometrika, 98 (2011), 199-214. |
[14] | R. Pyke, Spacings (with discussion), J. R. Stat. Soc. B, 27 (1965), 395-436. |
[15] | D. Ghosh, Incorporating the empirical null hypothesis into the Benjamini-Hochberg procedure, Stat. Appl. Genet. Mol. Biol., 11 (2012). |
[16] | S. G. Mallat, A theory for multiresolution signal decomposition: the wavelet representation, IEEE Trans. Pattern Anal. Machine Intell., 11 (1989), 674-693. |
[17] | I. Daubechies, Ten Lectures on Wavelets, Philadelphia: SIAM, 1992. |
[18] | F. Abramovich and Y. Benjamini, Adaptive thresholding of wavelet coefficients, Comput. Stat. Data Anal., 22 (1996), 351-361. |
[19] | X. Shen, H. C. Huang and N. Cressie, Nonparametric hypothesis testing for a spatial signal, J. Am. Stat. Assoc., 97 (2002), 1122-1140. |
[20] | M. Langaas, B. H. Lindqvist and E. Ferkingstad, Estimating the portion of true null hypotheses, with application to DNA microarray data, J. R. Stat. Soc. B, 67 (2005), 555-572. |
[21] | Y. Benjamini, A. M. Krieger and D. Yekutieli, Adaptive linear step-up procedures that control the false discovery rate, Biometrika, 93 (2006), 491-507. |
[22] | H. Finner, T. Dickhaus and M. Roters, On the false discovery rate and an asymptotically optimal rejection curve, Ann. Stat., 37 (2008), 596-618. |
[23] | J. A. Ferreira and A. H. Zwinderman, On the Benjamini-Hochberg method, Ann. Stat., 34 (2006), 1827-1849. |
[24] | W. B. Wu, On false discovery control under dependence, Ann. Stat., 36 (2008), 364-380. |
[25] | J. Fan, X. Han and W. Gu, Estimating false discovery proportion under arbitrary covariance dependence (with discussion), J. Am. Stat. Assoc., 107 (2012), 1019-1048. |
[26] | K. H. Desai and J. D. Storey, Cross-Dimensional Inference of Dependent High-Dimensional Data, J. Am. Stat. Assoc., 107 (2012), 135-151. |
[27] | R. J. Adler and J. E. Taylor, Random Fields and Geometry, New York: Springer, 2007. |
[28] | J. T. Leek and J. D. Storey, Capturing Heterogeneity in Gene Expression Studies by Surrogate Variable Analysis, PLoS Genet., 3 (2007), e161. |
[29] | D. B. Percival and A. T. Walden, Wavelet methods for time series analysis, Cambridge: Cambridge University Press, 2000. |
[30] | I. M. Johnstone and B. W. Silverman, Wavelet threshold estimators for data with correlated noise, J. R. Stat. Soc. B, 59 (1997), 319-351. |
[31] | I. Hedenfalk, D. Duggan, Y. Chen, et al., Gene-expression profiles in hereditary breast cancer, New England J. Med., 344 (2001), 539-548. |
[32] | T. R. Golub, D. K. Slonim, P. Tamayo, et al., Molecular classification of cancer: class discovery and class prediction by gene expression monitoring, Science, 286 (1999), 531-537. |
[33] | M. S. Taqqu, Weak convergence to fractional Brownian motion and to the Rosenblatt process, Z. Wahrscheinlichkeitstheorie verw. Geb., 31 (1975), 287-302. |
[34] | A. Cohen, I. Daubechies and J. C. Feauveau, Biorthogonal bases of compactly supported wavelets, Commun. Pur. Appl. Math., 45 (1992), 485-560. |