This article explores the mathematical and statistical properties of, and the connections between, two well-known classes of estimators of the unknown parameter matrices in a multivariate general linear model (MGLM) for regression: the ordinary least-squares estimators (OLSEs) and the best linear unbiased estimators (BLUEs), which are defined under two different optimality criteria. Tian and Zhang [
Citation: Bo Jiang, Yongge Tian. Equivalent analysis of different estimations under a multivariate general linear model[J]. AIMS Mathematics, 2024, 9(9): 23544-23563. doi: 10.3934/math.20241144
Throughout this paper, the symbol Rm×n stands for the collection of all m×n matrices over the field of real numbers; AT, r(A), and R(A) stand for the transpose, the rank, and the range (column space) of a matrix A∈Rm×n, respectively; Im denotes the identity matrix of order m. Two symmetric matrices A and B of the same size are said to satisfy the inequality A≽B in the Löwner partial ordering if A−B is nonnegative definite. The Kronecker product of any two matrices A and B is defined to be A⊗B=(aijB). The vectorization operation of a matrix A=[a1,…,an] is defined to be vec(A)=→A=[aT1,…,aTn]T. A well-known property of the vec operator applied to a triple matrix product is →AZB=(BT⊗A)→Z. The Moore–Penrose generalized inverse of A∈Rm×n, denoted by A+, is defined to be the unique solution G to the four matrix equations AGA=A, GAG=G, (AG)T=AG, and (GA)T=GA. In what follows, we denote by PA=AA+, A⊥=EA=Im−AA+, and FA=In−A+A the orthogonal projectors induced by A. Further information about the orthogonal projectors PA, EA, and FA and their applications in linear statistical models can be found, e.g., in [17,20,26,27].
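As a quick numerical sanity check, both the vec identity →AZB=(BT⊗A)→Z and the four Penrose equations defining A+ can be verified with NumPy. The sketch below uses small, arbitrary dimensions chosen purely for illustration (none of the sizes come from the paper):

```python
# Numeric check of two facts from the notation paragraph:
# (1) vec(A Z B) = (B^T kron A) vec(Z), with column-stacking vec;
# (2) the Moore-Penrose inverse satisfies the four Penrose equations.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
Z = rng.standard_normal((4, 5))
B = rng.standard_normal((5, 2))

def vec(M):
    # The paper stacks columns, so vec(M) flattens in Fortran order.
    return M.flatten(order="F").reshape(-1, 1)

lhs = vec(A @ Z @ B)
rhs = np.kron(B.T, A) @ vec(Z)
vec_ok = np.allclose(lhs, rhs)

G = np.linalg.pinv(A)  # Moore-Penrose inverse A^+
penrose_ok = (np.allclose(A @ G @ A, A) and np.allclose(G @ A @ G, G)
              and np.allclose((A @ G).T, A @ G) and np.allclose((G @ A).T, G @ A))
```

Both checks pass to floating-point tolerance, which is a convenient way to confirm that a hand-coded vec convention matches the column-stacking one used in the paper.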
In this paper, we consider the following partitioned multivariate general linear model:
M: {Y=XΘ+Ψ=X1Θ1+⋯+XkΘk+Ψ,E(→Ψ)=0, Cov(→Ψ)=Cov{→Ψ,→Ψ}=σ2ΣΣ2⊗ΣΣ1, | (1.1) |
where Y∈Rn×m is a matrix of observable dependent variables arising from an experimental design with n observations; X=[X1,…,Xk]∈Rn×p is a known model matrix of arbitrary rank (0≤r(X)≤min{n,p}) with blocks Xi∈Rn×pi; Θi∈Rpi×m are matrices of fixed but unknown parameters, and Θ=[ΘT1,…,ΘTk]T∈Rp×m with p=p1+⋯+pk, i=1,…,k; E(→Ψ) and Cov(→Ψ) denote the expectation and the covariance matrix of the random error vector →Ψ; σ2 is an arbitrary positive scaling factor; ΣΣ1∈Rn×n and ΣΣ2∈Rm×m are known positive semi-definite matrices of arbitrary ranks (0<r(ΣΣ1)≤n and 0<r(ΣΣ2)≤m); and the Kronecker product ΣΣ2⊗ΣΣ1 means that →Ψ has a separable covariance structure. In this study, we do not assume that the random variables are normally distributed; normal or alternative distributions are usually assumed and discussed only when more precise statistical properties are required ([4,22,23]).
We now give some introductory remarks on the background of MGLMs. Equation (1.1) is a direct extension of the widely used univariate general linear model: instead of regressing one response variable on a given set of regressors, it regresses several response variables on the regressors simultaneously. This model is representative of various multivariate regression frameworks and has long been a core object of study in multivariate analysis and its applications. In fact, MGLMs can accommodate more complex situations involving several independent and dependent variables, and they arise in many fields of the statistical sciences, such as analysis of variance (ANOVA), analysis of covariance (ANCOVA), multivariate analysis of variance (MANOVA), analysis of repeated measurements, and factor analysis models, as well as in many applied areas that describe predictive relationships between multiple responses and a set of regressors. The increased use of repeated measures in panel studies has created a need for further research on modeling this type of data; see the textbooks and handbooks [2,6,10,14,21,29,32] on the general theory of MGLMs and their applications. Because of the matrix structure of the model equations in MGLMs, a common technique in their statistical analysis is to use Kronecker products and vectorization operations of matrices. Through these operations, we can alternatively represent (1.1) as the following univariate general linear model:
ˆM: {→Y=(Im⊗X)→Θ+→Ψ=(Im⊗X1)→Θ1+⋯+(Im⊗Xk)→Θk+→Ψ,E(→Ψ)=0, Cov(→Ψ)=σ2ΣΣ2⊗ΣΣ1. | (1.2) |
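The separable covariance structure shared by (1.1) and (1.2) can be realized concretely: if Ψ=L1GL2T with G an iid standard normal matrix and ΣΣi=LiLiT, then →Ψ has covariance ΣΣ2⊗ΣΣ1 (taking σ2=1), since →Ψ=(L2⊗L1)→G and (L2⊗L1)(L2⊗L1)T=ΣΣ2⊗ΣΣ1 by the mixed-product rule. A minimal NumPy sketch with made-up dimensions:

```python
# Sketch of drawing an error matrix Psi whose vectorization has the
# separable covariance Sigma2 kron Sigma1 assumed in model (1.1):
# take Psi = L1 @ G @ L2.T with G iid standard normal, Sigma_i = L_i L_i^T.
import numpy as np

rng = np.random.default_rng(1)
n, m = 5, 3
L1 = rng.standard_normal((n, n))            # any square root of Sigma1
L2 = rng.standard_normal((m, m))            # any square root of Sigma2
Sigma1, Sigma2 = L1 @ L1.T, L2 @ L2.T

G = rng.standard_normal((n, m))
Psi = L1 @ G @ L2.T                         # one draw of the error matrix

# vec(L1 G L2^T) = (L2 kron L1) vec(G), so the covariance of vec(Psi) is
# (L2 kron L1)(L2 kron L1)^T = Sigma2 kron Sigma1 (here sigma^2 = 1).
M = np.kron(L2, L1)
cov_ok = np.allclose(M @ M.T, np.kron(Sigma2, Sigma1))
```

This "matrix normal" construction is a standard way to simulate data from (1.1) without ever forming the nm×nm Kronecker covariance explicitly.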
As is well known, a number of optimality criteria can be adopted to establish estimation theory for unknown parameters in linear regression models. Among them, two classical and widely used classes of estimators are the ordinary least-squares estimators (for short, OLSEs) and the best linear unbiased estimators (for short, BLUEs), which are fundamental in both the theory and applications of linear statistical models for regression. Below, we briefly review the existing definitions of estimability and of the OLSEs and BLUEs of the unknown parameters under (1.1) ([1,26]), and then present some known exact and analytical formulas for calculating the OLSEs and BLUEs, together with their properties.
Definition 1.1. Let M and ˆM be as given in (1.1) and (1.2), respectively.
(ⅰ) The matrix KΘ of parametric functions, where K∈Rk×p, is said to be estimable under M if there exists an L∈Rk×n such that E(LY−KΘ)=0.
(ⅱ) The vector T→Θ of parametric functions, where T∈Rt×mp, is said to be estimable under ˆM if there exists an L∈Rt×mn such that E(L→Y−T→Θ)=0.
The definitions of the OLSE and the BLUE of KΘ under (1.1) are given below.
Definition 1.2. Let M be as given in (1.1), and let K∈Rk×p be given.
(ⅰ) The OLSE of the parameter matrix Θ in M, denoted by OLSE(Θ), is defined to be
OLSE(Θ)=argminΘtr((Y−XΘ)T(Y−XΘ)). | (1.3) |
Correspondingly, the OLSE of KΘ under (1.1) is defined to be OLSE(KΘ)=KOLSE(Θ).
(ⅱ) If there exists an L∈Rk×n such that
Cov(→LY−KΘ)=min s.t. E(LY−KΘ)=0 | (1.4) |
holds in the Löwner partial ordering, the corresponding linear matrix statistic LY is defined to be the BLUE of KΘ under M, and is denoted by LY=BLUE(KΘ).
Since OLSEs and BLUEs are defined by different optimality criteria, they may have different expressions and different mathematical and statistical properties. It is known that the OLSEs and BLUEs of unknown parameter matrices in a given MGLM can be represented by analytical formulas composed of the given matrices in the model and their generalized inverses, so that the algebraic and statistical properties of OLSEs and BLUEs can be derived directly from these formulas. Because OLSEs are easy to compute and have many simple and attractive properties in a linear regression framework, statisticians have long been interested in the connections between OLSEs and other estimators. In particular, it has been observed that the OLSEs and BLUEs of the same unknown parameters in a linear statistical model are closely linked, and that they coincide under certain reasonable conditions. Based on the known theory of OLSEs and BLUEs under linear statistical models, many inference problems have been proposed and studied from both theoretical and applied points of view, often by means of precise and effective matrix-analysis tools. The problem of investigating the relationships between OLSEs and BLUEs in linear regression theory was initiated in the late 1940s, and many classic and novel contributions on this topic have been documented in the statistical literature since then; see, e.g., the recent survey paper [18] and the references therein.
As a new attempt to approach this kind of problems under more general model assumptions, we consider the connections between OLSEs and BLUEs under (1.1) in this current work. The study includes solving the following two problems on the relationships between the estimators of the whole and partial mean parameter matrices in (1.1):
(Ⅰ) Establish the necessary and sufficient conditions for the following equality
OLSE(KΘ)=BLUE(KΘ) | (1.5) |
to hold, where KΘ is an estimable matrix of parametric functions under M for K∈Rk×p.
(Ⅱ) Prove the following equivalent equalities of OLSEs and BLUEs
OLSE(XΘ)=BLUE(XΘ)⇔OLSE(XiΘi)=BLUE(XiΘi) | (1.6) |
under the assumption that XiΘi is estimable under M, i=1,2,…,k.
The estimator equalities in (1.5) and (1.6) admit many different interpretations from the mathematical and statistical points of view, and they are not uncommon in the statistical inference of a given MGLM. In fact, there are many publications on establishing equalities between OLSEs and BLUEs under various linear statistical model assumptions. These estimator equalities can readily be converted into matrix equalities that involve the given matrices in the models and their generalized inverses. Although many effective mathematical tools are available for characterizing such equalities of estimators and of their covariance matrices under MGLMs, we prefer to use the matrix rank method (for short, MRM) to characterize the equalities in (Ⅰ) and (Ⅱ). The MRM is now widely recognized as a useful tool for establishing and characterizing simple and complicated algebraic equalities for matrices and their operations; see, e.g., [35,36] and the references therein for applications of the MRM to linear statistical models. For related results in machine learning, see [7,8,15,30,39].
The rest of this paper is organized as follows: In Section 2, we introduce some matrix analysis tools that can be used to characterize matrix equalities that involve generalized inverses, and give a group of results related to the estimability of matrices of parametric functions under (1.1). We then show how to directly establish analytical expressions of the OLSEs and BLUEs under the assumptions in (1.1). In Section 3, we present several groups of classic and new equivalent statements for the OLSEs to be the BLUEs under (1.1) by characterizing various matrix equalities. Some conclusions and remarks are given in Section 4.
Recall that matrix algebra is one of the most important areas of mathematics, and many matrix-analysis tools play key roles in data science and statistical theory, including multivariate analysis and inference in regression frameworks; see, e.g., [5,26], among others. In particular, the theory of generalized inverses of matrices, as a ubiquitous tool for dealing with singular matrices, has been widely utilized to approach various complicated theoretical and computational problems arising in statistical analysis and inference; see, e.g., the reference books [3,9,11,27,28,31] on applications of matrix theory in statistics. Many problems in the statistical analysis of regression models can be equivalently transformed into matrix-analysis problems, so that algebraic tools from matrix theory can be used to obtain exact and satisfactory results from both mathematical and statistical points of view. In the following, we introduce some preliminary material from linear algebra and matrix theory that will be used throughout this paper. Block matrices and the rank of a matrix are two basic concepts that appear at the entry level of linear algebra and are easily understood by a beginner; at the same time, they are indispensable tools for dealing with both basic and advanced problems in theoretical and computational mathematics, because they allow us to construct and analyze simple and complicated matrix expressions and matrix equalities in a clear and concise way.
For the purpose of establishing and characterizing various possible equalities for estimations in the context of linear regression models, and simplifying various matrix equalities involving Moore–Penrose generalized inverses of matrices, we need to use some well-known facts on ranks and generalized inverses of matrices in the following two lemmas, and then proceed to give the proofs of the main results in this article.
Lemma 2.1 ([19]). Let A∈Rm×n, B∈Rm×k, C∈Rl×n, and D∈Rl×k. Then,
r[A,B]=r(A)+r(EAB)=r(B)+r(EBA), | (2.1) |
$r\begin{bmatrix} A \\ C \end{bmatrix}=r(A)+r(CF_A)=r(C)+r(AF_C).$ | (2.2)
If R(B)⊆R(A) and R(CT)⊆R(AT), then
$r\begin{bmatrix} A & B \\ C & D \end{bmatrix}=r(A)+r(D-CA^{+}B).$ | (2.3)
In addition, the following results hold.
(ⅰ) r[A,B]=r(A)⇔R(B)⊆R(A)⇔AA+B=B⇔EAB=0.
(ⅱ) $r\begin{bmatrix} A \\ C \end{bmatrix}=r(A)$ ⇔ R(CT)⊆R(AT) ⇔ CA+A=C ⇔ CFA=0.
(ⅲ) r[A,B]=r(A)+r(B)⇔R(A)∩R(B)={0}⇔R((EAB)T)=R(BT)⇔R((EBA)T)=R(AT).
(ⅳ) $r\begin{bmatrix} A \\ C \end{bmatrix}=r(A)+r(C)$ ⇔ R(AT)∩R(CT)={0} ⇔ R(CFA)=R(C) ⇔ R(AFC)=R(A).
(ⅴ) r(A+B)=r(A)+r(B)⇔R(A)∩R(B)={0} and R(AT)∩R(BT)={0} for A,B∈Rm×n.
Lemma 2.2 ([24]). Given two matrices A and B of appropriate sizes, the matrix equation AX=B is solvable for X if and only if r[A,B]=r(A), or equivalently, AA+B=B. In this case, the general solution of the equation is X=A+B+FAU, where U is an arbitrary matrix.
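Lemma 2.2 is easy to check numerically: for a consistent equation AX=B with a singular A, the consistency test AA+B=B passes, and every matrix of the form A+B+FAU solves the equation. A sketch with hypothetical sizes (all matrices here are made up for illustration):

```python
# Numeric illustration of Lemma 2.2: AX = B is solvable iff A A^+ B = B,
# and then every X = A^+ B + F_A U, with U arbitrary, is a solution.
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 4))  # rank 3: singular
X0 = rng.standard_normal((4, 2))
B = A @ X0                                  # consistent by construction

Ap = np.linalg.pinv(A)
solvable = np.allclose(A @ Ap @ B, B)       # the consistency test of the lemma

FA = np.eye(4) - Ap @ A                     # F_A = I - A^+ A
U = rng.standard_normal((4, 2))             # an arbitrary matrix
X = Ap @ B + FA @ U                         # one member of the solution set
solves = np.allclose(A @ X, B)
```

Because A(I−A+A)=A−AA+A=0, the term FAU is annihilated by A, which is exactly why the whole family of matrices X above solves the equation.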
We also use the following basic facts about the Kronecker products of matrices.
Lemma 2.3. Let A∈Rm×n and B∈Rp×q, C∈Rn×s, and D∈Rq×t. Then, (A⊗B)(C⊗D)=AC⊗BD. In particular, A⊗B=0 if and only if A=0 or B=0.
For convenience of representation, we denote
Vi=[0,…,Xi,…,0], Wi=[X1,…,Xi−1,0,Xi+1,…,Xk], i=1,2,…,k. | (2.4) |
In this case, the model matrix X in (1.1) and Im⊗X in (1.2) can be decomposed as
X=Vi+Wi=V1+⋯+Vk, i=1,2,…,k, | (2.5) |
Im⊗X=Im⊗Vi+Im⊗Wi=Im⊗V1+⋯+Im⊗Vk, i=1,2,…,k. | (2.6) |
Correspondingly, the partial mean parameter matrices XiΘi and the partial mean parameter vectors (Im⊗Xi)→Θi on the right-hand sides of (1.1) and (1.2) can be rewritten as
XiΘi=ViΘ, (Im⊗Xi)→Θi=(Im⊗Vi)→Θ, i=1,2,…,k. | (2.7) |
In the following, we derive some necessary and sufficient conditions for KΘ to be estimable under (1.1).
Theorem 2.4. Let M and ˆM be as given in (1.1) and (1.2), respectively, and let K∈Rk×p be given. Then, the following statements are equivalent:
(ⅰ) KΘ is estimable under M.
(ⅱ) (Im⊗K)→Θ is estimable under ˆM.
(ⅲ) R(Im⊗KT)⊆R(Im⊗XT).
(ⅳ) R(KT)⊆R(XT).
(ⅴ) KX+X=K.
Proof. By Definition 1.1 (ⅰ), we see that
E(LY−KΘ)=0⇔(LX−K)Θ=0 for all Θ⇔LX=K. | (2.8) |
Further, by Lemma 2.2, the matrix equation on the right-hand side of (2.8) is solvable for L if and only if (ⅳ) holds. The equivalence of (ⅳ) and (ⅴ) follows from Lemma 2.1 (ⅱ). Also, by Definition 1.1 (ⅱ),
E(L→Y−(Im⊗K)→Θ)=0⇔(L(Im⊗X)−(Im⊗K))→Θ=0 for all →Θ⇔L(Im⊗X)=(Im⊗K), | (2.9) |
and by Lemma 2.2, the equation on the right-hand side of (2.9) is solvable for L if and only if (ⅲ) holds. Consequently, (ⅲ) holds if and only if $r\begin{bmatrix} I_m\otimes X \\ I_m\otimes K \end{bmatrix}=r(I_m\otimes X)$. Expanding both sides of the equality, we obtain
$$r\begin{bmatrix} X & 0 & \cdots & 0 \\ 0 & X & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & X \\ K & 0 & \cdots & 0 \\ 0 & K & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & K \end{bmatrix} = r\begin{bmatrix} X & 0 & \cdots & 0 \\ 0 & X & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & X \end{bmatrix} \Leftrightarrow r\begin{bmatrix} X \\ K \end{bmatrix}=r(X).$$
This fact leads to the equivalence of (ⅲ) and (ⅳ).
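The estimability criterion KX+X=K in Theorem 2.4 (ⅴ) can be tested numerically even when X is rank deficient. In the sketch below (illustrative dimensions only), a K built from the row space of X passes the test, while a generic K fails:

```python
# Numeric check of the estimability test K X^+ X = K from Theorem 2.4 (v),
# for a hypothetical rank-deficient model matrix X.
import numpy as np

rng = np.random.default_rng(3)
n, p = 8, 5
X = rng.standard_normal((n, 3)) @ rng.standard_normal((3, p))  # r(X) = 3 < p
Xp = np.linalg.pinv(X)

K_est = rng.standard_normal((2, n)) @ X   # rows lie in the row space of X
K_not = rng.standard_normal((2, p))       # generic rows: not estimable

est_ok = np.allclose(K_est @ Xp @ X, K_est)   # condition (v) holds
not_ok = np.allclose(K_not @ Xp @ X, K_not)   # fails almost surely
```

Since X+X is the orthogonal projector onto R(XT), the test simply asks whether each row of K already lies in the row space of X, which is condition (ⅳ).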
Concerning the estimability of XiΘi in (1.1), i=1,2,…,k, we have the following facts.
Theorem 2.5. Let M and ˆM be as given in (1.1) and (1.2), respectively. Then, the following five statements are equivalent:
(ⅰ) XiΘi=ViΘ is estimable in M, i=1,2,…,k.
(ⅱ) (Im⊗Vi)→Θ is estimable under ˆM, i=1,2,…,k.
(ⅲ) R(VTi)⊆R(XT), i=1,2,…,k.
(ⅳ) R(Vi)∩R(Wi)=R(Xi)∩R(Wi)={0}, i=1,2,…,k.
(ⅴ) r(X)=r(Vi)+r(Wi)=r(Xi)+r(Wi), i=1,2,…,k.
Proof. Setting K=Vi in Theorem 2.4, we obtain (Im⊗K)→Θ=(Im⊗Vi)→Θ=(Im⊗Xi)→Θi. In this case, applying Theorem 2.4 to it and simplifying yield the equivalences of (ⅰ)–(ⅲ). Further, (ⅲ) is equivalent to r[VTi,XT]=r(X), where r[VTi,XT]=r[VTi,WTi]=r(Vi)+r(Wi)=r(Xi)+r(Wi) by (2.4) for i=1,2,…,k, thus establishing the equivalences of (ⅲ)–(ⅴ).
Theorem 2.6. Let M be as given in (1.1). Then, the following two statements are equivalent:
(ⅰ) All X1Θ1,…,XkΘk are estimable in M.
(ⅱ) r(X)=r(X1)+⋯+r(Xk).
Proof. From (2.5),
r(X)⩽r(Xi)+r(Wi)⩽r(X1)+⋯+r(Xk), i=1,2,…,k. | (2.10) |
If (ⅱ) holds, we see from (2.10) that
r(X)=r(X1)+r(W1)=⋯=r(Xk)+r(Wk). | (2.11) |
Hence, (i) holds by Theorem 2.5. The equivalence of (ⅱ) and (2.11) can be established by induction.
Theorem 2.5 (ⅳ) and (ⅴ) and Theorem 2.6 (ⅱ) can be easily verified for the given model matrix in M. In particular, they are all satisfied under the condition r(X)=p.
Below, we present a new derivation for the analytical formula of the OLSE of Θ in (1.1).
Theorem 2.7. Let M be as given in (1.1), and suppose that KΘ is estimable under M for K∈Rk×p. Then, the OLSE of KΘ under M can be uniquely expressed as
OLSE(KΘ)=KX+Y | (2.12) |
with
E(OLSE(KΘ))=KΘ, Cov(→OLSE(KΘ))=σ2ΣΣ2⊗((KX+)ΣΣ1(KX+)T). | (2.13) |
In particular, the following results hold.
(ⅰ) XΘ in M is always estimable, and
OLSE(XΘ)=XX+Y, | (2.14) |
E(OLSE(XΘ))=XΘ, Cov(→OLSE(XΘ))=σ2ΣΣ2⊗(XX+ΣΣ1XX+). | (2.15) |
(ⅱ) If XiΘi is estimable under M, then,
OLSE(XiΘi)=ViX+Y, i=1,2,…,k, | (2.16) |
E(OLSE(XiΘi))=XiΘi, Cov(→OLSE(XiΘi))=σ2ΣΣ2⊗(ViX+ΣΣ1(ViX+)T), i=1,2,…,k. | (2.17) |
(ⅲ) If all X1Θ1,…,XkΘk are estimable under M, then,
OLSE(XΘ)=OLSE(X1Θ1)+⋯+OLSE(XkΘk). | (2.18) |
Proof. We first decompose the matrix product (Y−XΘ)T(Y−XΘ) as
(Y−XΘ)T(Y−XΘ)=(Y−XX+Y)T(Y−XX+Y)+(XX+Y−XΘ)T(XX+Y−XΘ)=YTEXY+(PXY−XΘ)T(PXY−XΘ). |
Thus,
tr((Y−XΘ)T(Y−XΘ))=tr(YTEXY)+tr((PXY−XΘ)T(PXY−XΘ)). |
Minimizing both sides of equality with respect to Θ, we obtain the following formula
minΘtr((Y−XΘ)T(Y−XΘ))=tr(YTEXY)+minΘtr((PXY−XΘ)T(PXY−XΘ)). |
It is obvious that XΘ=PXY is always solvable for Θ, and the general solution is given by Θ=X+Y+(Ip−X+X)U by Lemma 2.2. In this case, we obtain the following three fundamental formulas
$$\begin{aligned} \operatorname{OLSE}(\Theta) &= \operatorname*{arg\,min}_{\Theta}\operatorname{tr}\!\big((Y-X\Theta)^{T}(Y-X\Theta)\big) = X^{+}Y+(I_p-X^{+}X)U, \\ \min_{\Theta}\operatorname{tr}\!\big((Y-X\Theta)^{T}(Y-X\Theta)\big) &= \operatorname{tr}(Y^{T}E_XY), \\ \operatorname{OLSE}(K\Theta) &= KX^{+}Y+K(I_p-X^{+}X)U = KX^{+}Y, \end{aligned}$$
where U∈Rp×m is arbitrary. The expectation and covariance matrix of OLSE(KΘ) are
$$\begin{aligned} E(\operatorname{OLSE}(K\Theta)) &= E(KX^{+}Y)=KX^{+}X\Theta=K\Theta, \\ \operatorname{Cov}\!\big(\overrightarrow{\operatorname{OLSE}(K\Theta)}\big) &= \operatorname{Cov}\!\big(\overrightarrow{KX^{+}Y}\big) = \operatorname{Cov}\!\big((I_m\otimes KX^{+})\overrightarrow{Y}\big) \\ &= \sigma^{2}(I_m\otimes KX^{+})(\Sigma_2\otimes\Sigma_1)(I_m\otimes KX^{+})^{T} = \sigma^{2}\Sigma_2\otimes\big((KX^{+})\Sigma_1(KX^{+})^{T}\big), \end{aligned}$$
by Lemma 2.3. Consequently, we are able to obtain (ⅰ)–(ⅲ) from (2.12) and (2.13).
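Formula (2.12) can be illustrated numerically: even when X is singular, so that Θ itself is not identifiable, an estimable KΘ is recovered exactly by KX+Y on noiseless data. A sketch with hypothetical dimensions:

```python
# Sketch of OLSE(K Theta) = K X^+ Y from Theorem 2.7: with Psi = 0 the
# estimator must return K Theta exactly, even for a singular X.
import numpy as np

rng = np.random.default_rng(4)
n, p, m = 10, 4, 3
X = rng.standard_normal((n, 2)) @ rng.standard_normal((2, p))  # r(X) = 2 < p
Theta = rng.standard_normal((p, m))
Y = X @ Theta                               # noiseless responses

K = rng.standard_normal((3, n)) @ X         # K Theta is estimable (Theorem 2.4)
olse = K @ np.linalg.pinv(X) @ Y            # the OLSE formula (2.12)
unbiased_ok = np.allclose(olse, K @ Theta)
```

The point of the check is that the arbitrary term K(Ip−X+X)U vanishes for estimable K, so the value KX+Y is unique despite the non-uniqueness of OLSE(Θ).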
With regard to the exact expression of the BLUE of KΘ, the following general results were established in [12,13,37] by employing the analytical solution of a constrained quadratic matrix-valued optimization problem given in [34].
Theorem 2.8. Let M be as given in (1.1), and suppose that KΘ is estimable under M for K∈Rk×p. Then,
Cov(→LY−KΘ)=min s.t. E(LY−KΘ)=0⇔L[X,ΣΣ1X⊥]=[K,0]. | (2.19) |
The matrix equation in (2.19), called the BLUE equation, is solvable for L, namely,
[K,0][X,ΣΣ1X⊥]+[X,ΣΣ1X⊥]=[K,0] | (2.20) |
holds. The general solution of L and the corresponding BLUE of KΘ under M can be written as
BLUE(KΘ)=LY=([K,0][X,ΣΣ1X⊥]++U[X,ΣΣ1X⊥]⊥)Y, | (2.21) |
where U∈Rk×n is arbitrary. The expectation and covariance matrix of BLUE(KΘ) are
E(BLUE(KΘ))=KΘ, | (2.22) |
Cov(→BLUE(KΘ))=σ2ΣΣ2⊗([K,0][X,ΣΣ1X⊥]+ΣΣ1([K,0][X,ΣΣ1X⊥]+)T), | (2.23) |
where
r[X,ΣΣ1X⊥]=r[X,X⊥ΣΣ1]=r[X,ΣΣ1]. | (2.24) |
Furthermore, the following statements hold:
(ⅰ) XΘ in M is always estimable, and
BLUE(XΘ)=([X,0][X,ΣΣ1X⊥]++U[X,ΣΣ1X⊥]⊥)Y, | (2.25) |
E(BLUE(XΘ))=XΘ, | (2.26) |
Cov(→BLUE(XΘ))=σ2ΣΣ2⊗([X,0][X,ΣΣ1X⊥]+ΣΣ1([X,0][X,ΣΣ1X⊥]+)T), | (2.27) |
where U∈Rn×n is arbitrary.
(ⅱ) If XiΘi in M is estimable, then,
BLUE(XiΘi)=([Vi,0][X,ΣΣ1X⊥]++Ui[X,ΣΣ1X⊥]⊥)Y, | (2.28) |
E(BLUE(XiΘi))=XiΘi, | (2.29) |
Cov(→BLUE(XiΘi))=σ2ΣΣ2⊗([Vi,0][X,ΣΣ1X⊥]+ΣΣ1([Vi,0][X,ΣΣ1X⊥]+)T), | (2.30) |
where Ui∈Rn×n is arbitrary, i=1,2,…,k.
(ⅲ) If all XiΘi in (1.1) are estimable, then,
BLUE(XΘ)=BLUE(X1Θ1)+⋯+BLUE(XkΘk). | (2.31) |
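The particular BLUE solution L=[K,0][X,ΣΣ1X⊥]+ in (2.21) can be checked numerically against the BLUE equation in (2.19). The following sketch (illustrative sizes; a full-column-rank X and a positive definite ΣΣ1 are assumed for simplicity) verifies L[X,ΣΣ1X⊥]=[K,0] and unbiasedness on noiseless data:

```python
# Sketch of the particular BLUE coefficient L = [K, 0][X, Sigma1 X^perp]^+
# from Theorem 2.8, checked against the BLUE equation (2.19).
import numpy as np

rng = np.random.default_rng(5)
n, p, m = 9, 4, 2
X = rng.standard_normal((n, p))             # full column rank here
R = rng.standard_normal((n, n))
Sigma1 = R @ R.T                            # positive definite Sigma1
K = rng.standard_normal((3, p))             # estimable since r(X) = p

Xp = np.linalg.pinv(X)
Xperp = np.eye(n) - X @ Xp                  # X^perp = I - X X^+
W = np.hstack([X, Sigma1 @ Xperp])          # the block matrix [X, Sigma1 X^perp]
K0 = np.hstack([K, np.zeros((3, n))])       # the block matrix [K, 0]

L = K0 @ np.linalg.pinv(W)
blue_eq_ok = np.allclose(L @ W, K0)         # the BLUE equation holds

Theta = rng.standard_normal((p, m))
blue_ok = np.allclose(L @ (X @ Theta), K @ Theta)   # L X = K, so unbiased
```

Because L[X,ΣΣ1X⊥]=[K,0] forces LX=K in its first block, the unbiasedness check is an immediate consequence of the BLUE equation itself.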
In this section, we solve the estimation equality problems outlined in (1.5) and (1.6), and provide a variety of algebraic and statistical descriptions of the proposed equivalent facts. For the purpose of characterizing equalities between L1Y and L2Y, we need the following three criteria for comparing linear statistics:
Definition 3.1. Let Y be as given in (1.1), and let L1,L2∈Rk×n.
(ⅰ) The equality L1Y=L2Y is said to hold definitely if L1=L2.
(ⅱ) The equality L1Y=L2Y is said to hold with probability 1 if both E(L1Y−L2Y)=0, and Cov((Im⊗L1)→Y−(Im⊗L2)→Y)=0.
(ⅲ) L1Y and L2Y are said to have the same expectation matrices and dispersion matrices if both E(L1Y)=E(L2Y), and Cov((Im⊗L1)→Y)=Cov((Im⊗L2)→Y) hold.
Our main results in the paper are presented below.
Theorem 3.2. Let M be as given in (1.1), and suppose that KΘ is estimable under M for K∈Rk×p. Also, let OLSE(KΘ) and BLUE(KΘ) be as given in (2.12) and (2.21), respectively. Then, the following 16 statements are equivalent:
(ⅰ) OLSE(KΘ)=BLUE(KΘ) holds definitely.
(ⅱ) OLSE(KΘ)=BLUE(KΘ) holds with probability 1.
(ⅲ) Cov(→OLSE(KΘ))=Cov(→BLUE(KΘ)).
(ⅳ) Cov{→OLSE(KΘ),→Y}=Cov{→BLUE(KΘ),→Y}.
(ⅴ) Cov{→OLSE(KΘ),→Y}=Cov{→BLUE(KΘ),→OLSE(XΘ)}.
(ⅵ) Cov{→OLSE(KΘ),→Y−→OLSE(XΘ)}=0.
(ⅶ) KX+=[K,0][X,ΣΣ1X⊥]+.
(ⅷ) KX+ΣΣ1=[K,0][X,ΣΣ1X⊥]+ΣΣ1.
(ⅸ) KX+ΣΣ1=[K,0][X,ΣΣ1X⊥]+ΣΣ1XX+.
(ⅹ) KX+ΣΣ1(KX+)T=[K,0][X,ΣΣ1X⊥]+ΣΣ1([K,0][X,ΣΣ1X⊥]+)T.
(xi) KX+ΣΣ1X⊥=0.
(xii) X⊥[ΣΣ1X,0]F[XTX,KT]=0.
(xiii) $r\begin{bmatrix} \Sigma_1X & X & 0 \\ X^{T}X & 0 & K^{T} \end{bmatrix} = r\begin{bmatrix} \Sigma_1X & X \\ X^{T}X & 0 \end{bmatrix} = 2r(X)$.
(xiv) R((KX+ΣΣ1)T)⊆R(X).
(xv) $R\begin{bmatrix} X^{T}\Sigma_1X^{\perp} \\ 0 \end{bmatrix} \subseteq R\begin{bmatrix} X^{T}X \\ K \end{bmatrix}$.
(xvi) $R\begin{bmatrix} 0 \\ K^{T} \end{bmatrix} \subseteq R\begin{bmatrix} \Sigma_1X & X \\ X^{T}X & 0 \end{bmatrix}$.
Proof. The equivalences of the matrix equalities and relations in (vii)–(xvi) can be proved by the algebraic tools presented in Lemma 2.1, while the details of the proofs can be found in [38], see also [16,33].
By Definition 3.1 (ⅰ), the equality OLSE(KΘ)=BLUE(KΘ) in (ⅰ) holds definitely if and only if the coefficient matrices of Y in (2.12) and (2.21) are equal, i.e.,
KX+=[K,0][X,ΣΣ1X⊥]++U[X,ΣΣ1X⊥]⊥. | (3.1) |
By Lemma 2.2, there exists a matrix U such that (3.1) holds if and only if
r[KX+−[K,0][X,ΣΣ1X⊥]+[X,ΣΣ1X⊥]⊥]=r([X,ΣΣ1X⊥]⊥). | (3.2) |
We next simplify both sides of this rank equality. By (2.1),
$$\begin{aligned} r\begin{bmatrix} KX^{+}-[K,\,0][X,\,\Sigma_1X^{\perp}]^{+} \\ [X,\,\Sigma_1X^{\perp}]^{\perp} \end{bmatrix} &= r\begin{bmatrix} KX^{+}-[K,\,0][X,\,\Sigma_1X^{\perp}]^{+} & 0 \\ I_n & [X,\,\Sigma_1X^{\perp}] \end{bmatrix} - r[X,\,\Sigma_1X^{\perp}] \\ &= r\begin{bmatrix} KX^{+} & [K,\,0] \\ I_n & [X,\,\Sigma_1X^{\perp}] \end{bmatrix} - r[X,\,\Sigma_1] \quad (\text{by (2.24)}) \\ &= r\begin{bmatrix} 0 & 0 & KX^{+}\Sigma_1X^{\perp} \\ I_n & 0 & 0 \end{bmatrix} - r[X,\,\Sigma_1] \quad (\text{by Theorem 2.4(v)}) \\ &= r(KX^{+}\Sigma_1X^{\perp}) + n - r[X,\,\Sigma_1] \\ &= r\big(K(X^{T}X)^{+}X^{T}\Sigma_1X^{\perp}\big) + n - r[X,\,\Sigma_1] \quad (\text{by } X^{+}=(X^{T}X)^{+}X^{T}) \\ &= r\begin{bmatrix} X^{T}X & X^{T}\Sigma_1X^{\perp} \\ K & 0 \end{bmatrix} - r(X) + n - r[X,\,\Sigma_1] \quad (\text{by (2.3)}) \\ &= r\begin{bmatrix} X^{T}X & X^{T}\Sigma_1 \\ K & 0 \\ 0 & X^{T} \end{bmatrix} - 2r(X) + n - r[X,\,\Sigma_1] \quad (\text{by (2.2)}) \\ &= r\begin{bmatrix} \Sigma_1X & X & 0 \\ X^{T}X & 0 & K^{T} \end{bmatrix} - 2r(X) + n - r[X,\,\Sigma_1], \end{aligned}$$ | (3.3)
r([X,ΣΣ1X⊥]⊥)=n−r[X,ΣΣ1], | (3.4) |
$$r\begin{bmatrix} \Sigma_1X & X \\ X^{T}X & 0 \end{bmatrix} = r\begin{bmatrix} \Sigma_1X & X \\ X & 0 \end{bmatrix} = r\begin{bmatrix} 0 & X \\ X & 0 \end{bmatrix} = 2r(X).$$ | (3.5)
Substituting (3.3)–(3.5) into (3.2) and simplifying lead to the equivalence of (ⅰ) and (xiii).
Because E(OLSE(KΘ)−BLUE(KΘ))=0, we see from Definition 3.1 (ⅱ) that OLSE(KΘ)=BLUE(KΘ) holds with probability 1 if and only if
Cov(→OLSE(KΘ)−→BLUE(KΘ))=σ2(Im⊗(KX+−[K,0][X,ΣΣ1X⊥]+−U[X,ΣΣ1X⊥]⊥))(ΣΣ2⊗ΣΣ1) (Im⊗(KX+−[K,0][X,ΣΣ1X⊥]+−U[X,ΣΣ1X⊥]⊥))T=0. | (3.6) |
Since ΣΣ2⊗ΣΣ1 is nonnegative definite and ΣΣ2≠0, (3.6) is equivalent to
(KX+−[K,0][X,ΣΣ1X⊥]+)ΣΣ1=0 |
by Lemma 2.3, thus establishing the equivalence of (ⅱ) and (viii).
It follows from (2.13) and (2.23) that (ⅲ) is equivalent to
Cov(→OLSE(KΘ))−Cov(→BLUE(KΘ))=σ2ΣΣ2⊗(KX+ΣΣ1(KX+)T)−σ2ΣΣ2⊗([K,0][X,ΣΣ1X⊥]+ΣΣ1([K,0][X,ΣΣ1X⊥]+)T)=σ2ΣΣ2⊗(KX+ΣΣ1(KX+)T−[K,0][X,ΣΣ1X⊥]+ΣΣ1([K,0][X,ΣΣ1X⊥]+)T)=0, |
which is further equivalent to
KX+ΣΣ1(KX+)T−[K,0][X,ΣΣ1X⊥]+ΣΣ1([K,0][X,ΣΣ1X⊥]+)T=0 |
by Lemma 2.3, thus establishing the equivalence of (ⅲ) and (ⅹ).
By (2.12) and (2.21),
Cov{→OLSE(KΘ), →Y}=σ2(Im⊗KX+)(ΣΣ2⊗ΣΣ1)=σ2ΣΣ2⊗KX+ΣΣ1, | (3.7) |
Cov{→BLUE(KΘ), →Y}=σ2(Im⊗([K,0][X,ΣΣ1X⊥]++U[X,ΣΣ1X⊥]⊥))(ΣΣ2⊗ΣΣ1)=σ2ΣΣ2⊗([K,0][X,ΣΣ1X⊥]+ΣΣ1). | (3.8) |
Comparing the right-hand sides of (3.7) and (3.8) leads to the equivalence of (ⅳ) and (ⅷ).
By (2.14) and (2.21),
Cov{→BLUE(KΘ), →OLSE(XΘ)}=σ2(Im⊗([K,0][X,ΣΣ1X⊥]++U[X,ΣΣ1X⊥]⊥))(ΣΣ2⊗ΣΣ1)(Im⊗XX+)=σ2ΣΣ2⊗([K,0][X,ΣΣ1X⊥]+ΣΣ1XX+). | (3.9) |
Comparing the right-hand sides of (3.7) and (3.9) leads to the equivalence of (ⅴ) and (ⅸ).
By (2.12), (2.14), and (3.7),
Cov{→OLSE(KΘ), →Y−→OLSE(XΘ)}=Cov{→OLSE(KΘ), →Y}−Cov{→OLSE(KΘ), →OLSE(XΘ)}=σ2ΣΣ2⊗KX+ΣΣ1−σ2ΣΣ2⊗(KX+ΣΣ1XX+)=σ2ΣΣ2⊗(KX+ΣΣ1X⊥). | (3.10) |
Setting both sides of (3.10) equal to zero yields the equivalence of (vi) and (xi) by Lemma 2.3.
More results and facts associated with the equalities of OLSEs and BLUEs can be established from algebraic and statistical considerations. In particular, the matrix equalities in Theorem 3.2 (ⅷ)–(xi) correspond directly to the estimation equalities and the covariance matrix equalities. In other words, they have clear statistical interpretations and can be utilized in the corresponding statistical inference.
Let K=X in Theorem 3.2. We obtain the following results:
Corollary 3.3. Let OLSE(XΘ) and BLUE(XΘ) be as given in (2.14) and (2.25), respectively. Then, the following 31 statements are equivalent:
(ⅰ) OLSE(XΘ)=BLUE(XΘ) holds definitely (with probability 1).
(ⅱ) Cov(→OLSE(XΘ))=Cov(→BLUE(XΘ)).
(ⅲ) Cov(→Y−→OLSE(XΘ))=Cov(→Y−→BLUE(XΘ)).
(ⅳ) Cov(→OLSE(XΘ))=Cov{→BLUE(XΘ),→Y}.
(ⅴ) Cov(→OLSE(XΘ))=Cov{→BLUE(XΘ),→OLSE(XΘ)}.
(ⅵ) Cov{→OLSE(XΘ),→Y}=Cov{→BLUE(XΘ),→Y}.
(ⅶ) Cov{→OLSE(XΘ),→Y}=Cov{→BLUE(XΘ),→OLSE(XΘ)}.
(ⅷ) Cov{→OLSE(XΘ),→Y}=Cov{→Y,→OLSE(XΘ)}.
(ⅸ) Cov{→Y−→OLSE(XΘ),→Y}=Cov{→Y,→Y−→OLSE(XΘ)}.
(ⅹ) Cov{→Y−→OLSE(XΘ),→OLSE(XΘ)}=Cov{→OLSE(XΘ),→Y−→OLSE(XΘ)}=0.
(xi) Cov{→Y−→OLSE(XΘ),→OLSE(XΘ)}+Cov{→OLSE(XΘ),→Y−→OLSE(XΘ)}=0.
(xii) Cov{→Y−→OLSE(XΘ),→OLSE(XΘ)}=0.
(xiii) Cov(→Y)=Cov(→OLSE(XΘ))+Cov(→Y−→OLSE(XΘ)).
(xiv) PX=[X,0][X,ΣΣ1X⊥]+.
(xv) PXΣΣ1=[X,0][X,ΣΣ1X⊥]+ΣΣ1.
(xvi) PXΣΣ1=[X,0][X,ΣΣ1X⊥]+ΣΣ1PX.
(xvii) PXΣΣ1PX=[X,0][X,ΣΣ1X⊥]+ΣΣ1([X,0][X,ΣΣ1X⊥]+)T.
(xviii) X⊥ΣΣ1X⊥=(In−[X,0][X,ΣΣ1X⊥]+)ΣΣ1(In−[X,0][X,ΣΣ1X⊥]+)T.
(xix) PXΣΣ1PX=[X,0][X,ΣΣ1X⊥]+ΣΣ1.
(xx) PXΣΣ1PX=[X,0][X,ΣΣ1X⊥]+ΣΣ1PX.
(xxi) PXΣΣ1X⊥=X⊥ΣΣ1PX=0.
(xxii) PXΣΣ1X⊥+X⊥ΣΣ1PX=0.
(xxiii) X⊥ΣΣ1X=0.
(xxiv) PXΣΣ1=ΣΣ1PX.
(xxv) X⊥ΣΣ1=ΣΣ1X⊥.
(xxvi) r[X,ΣΣ1X]=r(X).
(xxvii) r[X⊥,ΣΣ1X⊥]=r(X⊥).
(xxviii) R(ΣΣ1X)⊆R(X).
(xxix) R(ΣΣ1X⊥)⊆R(X⊥).
(xxx) R(ΣΣ1X)=R(ΣΣ1)∩R(X).
(xxxi) R(ΣΣ1X⊥)=R(ΣΣ1)∩R(X⊥).
Proof. The equivalences of (ⅰ), (ⅱ), (ⅵ), (ⅶ), (xii), (xiv)–(xvii), and (xxi) follow from Theorem 3.2 (ⅰ)–(xi) via setting K=X. The equivalences of the matrix equalities and relations in (xviii)–(xx), (xxiii), (xxiv), (xxvi), and (xxviii) were collected and proved in [38].
From (2.14) and (2.25),
Cov(→Y−→OLSE(XΘ))=σ2(Im⊗X⊥)(ΣΣ2⊗ΣΣ1)(Im⊗X⊥)=σ2ΣΣ2⊗X⊥ΣΣ1X⊥, | (3.11) |
Cov(→Y−→BLUE(XΘ))=σ2(Im⊗(In−[X,0][X,ΣΣ1X⊥]+−U[X,ΣΣ1X⊥]⊥))(ΣΣ2⊗ΣΣ1) (Im⊗(In−[X,0][X,ΣΣ1X⊥]+−U[X,ΣΣ1X⊥]⊥))T=σ2ΣΣ2⊗(In−[X,0][X,ΣΣ1X⊥]+)ΣΣ1(In−[X,0][X,ΣΣ1X⊥]+)T. | (3.12) |
Comparing the right-hand sides of (3.11) and (3.12) leads to the equivalence of (ⅲ) and (xviii).
From (2.15) and (3.8),
Cov(→OLSE(XΘ))−Cov{→BLUE(XΘ), →Y}=σ2ΣΣ2⊗(PXΣΣ1PX)−σ2ΣΣ2⊗([X,0][X,ΣΣ1X⊥]+ΣΣ1)=σ2ΣΣ2⊗(PXΣΣ1PX−[X,0][X,ΣΣ1X⊥]+ΣΣ1). | (3.13) |
Setting both sides equal to zero yields the equivalence of (ⅳ) and (xix).
From (2.15) and (3.9),
Cov(→OLSE(XΘ))−Cov{→BLUE(XΘ), →OLSE(XΘ)}=σ2ΣΣ2⊗(PXΣΣ1PX)−σ2ΣΣ2⊗([X,0][X,ΣΣ1X⊥]+ΣΣ1PX)=σ2ΣΣ2⊗(PXΣΣ1PX−[X,0][X,ΣΣ1X⊥]+ΣΣ1PX). | (3.14) |
Thus, setting both sides equal to zero yields the equivalence of (ⅴ) and (xx).
From (3.7),
Cov{→OLSE(XΘ), →Y}−Cov{→Y, →OLSE(XΘ)}=σ2ΣΣ2⊗PXΣΣ1−σ2ΣΣ2⊗ΣΣ1PX=σ2ΣΣ2⊗(PXΣΣ1−ΣΣ1PX). | (3.15) |
Setting both sides equal to zero yields the equivalence of (ⅷ) and (xxiv).
From (1.1) and (3.7),
Cov{→Y−→OLSE(XΘ),→Y}−Cov{→Y,→Y−→OLSE(XΘ)}=σ2ΣΣ2⊗X⊥ΣΣ1−σ2ΣΣ2⊗ΣΣ1X⊥=σ2ΣΣ2⊗(X⊥ΣΣ1−ΣΣ1X⊥). | (3.16) |
Setting both sides equal to zero yields the equivalence of (ⅸ) and (xxv).
From (xii), we obtain the equivalence of (ⅹ) and (xxi).
From (1.1), (2.15), (3.10), and (3.11),
Cov{→OLSE(XΘ),→Y−→OLSE(XΘ)}+Cov{→Y−→OLSE(XΘ),→OLSE(XΘ)}=σ2ΣΣ2⊗PXΣΣ1X⊥+σ2ΣΣ2⊗X⊥ΣΣ1PX=σ2ΣΣ2⊗(PXΣΣ1X⊥+X⊥ΣΣ1PX), | (3.17) |
and
Cov(→Y)−Cov(→OLSE(XΘ))−Cov(→Y−→OLSE(XΘ))=σ2ΣΣ2⊗ΣΣ1−σ2ΣΣ2⊗PXΣΣ1PX−σ2ΣΣ2⊗X⊥ΣΣ1X⊥=σ2ΣΣ2⊗(PXΣΣ1X⊥+X⊥ΣΣ1PX), | (3.18) |
where by (2.1) and Lemma 2.1(ⅴ),
r(PXΣΣ1X⊥+X⊥ΣΣ1PX)=r(PXΣΣ1X⊥)+r(X⊥ΣΣ1PX)=2r(X⊥ΣΣ1PX)=2r[X,ΣΣ1X]−2r(X). | (3.19) |
Setting both sides of (3.19) equal to zero and combining it with (3.17) and (3.18) yields the equivalences of (xi), (xiii), (xxii), (xxiii), (xxvi), and (xxviii).
The equivalences of (xxiii)–(xxxi) on matrix equalities and range equalities are well known [25].
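Condition (xxviii), R(ΣΣ1X)⊆R(X), is easy to realize constructively: any ΣΣ1=PXS1PX+X⊥S2X⊥ with S1 and S2 positive semi-definite satisfies it, and then the OLSE and BLUE coefficient matrices of XΘ coincide, as in condition (xiv). A numerical sketch (all matrices are made up for illustration):

```python
# Sketch of Corollary 3.3 in action: a Sigma1 with R(Sigma1 X) in R(X)
# makes the OLSE and BLUE coefficients of X Theta coincide, i.e.
# P_X = [X, 0][X, Sigma1 X^perp]^+ (condition (xiv)).
import numpy as np

rng = np.random.default_rng(6)
n, p = 7, 3
X = rng.standard_normal((n, p))
PX = X @ np.linalg.pinv(X)                  # orthogonal projector onto R(X)
Xperp = np.eye(n) - PX

S1 = rng.standard_normal((n, n)); S1 = S1 @ S1.T
S2 = rng.standard_normal((n, n)); S2 = S2 @ S2.T
Sigma1 = PX @ S1 @ PX + Xperp @ S2 @ Xperp  # then Sigma1 X = PX S1 X in R(X)

W = np.hstack([X, Sigma1 @ Xperp])          # [X, Sigma1 X^perp]
blue_coef = np.hstack([X, np.zeros((n, n))]) @ np.linalg.pinv(W)
equal_ok = np.allclose(PX, blue_coef)
```

Here ΣΣ1X⊥ collapses to X⊥S2X⊥, whose columns are orthogonal to those of X, so the Moore–Penrose inverse of W splits blockwise and the BLUE coefficient reduces to XX+=PX, in agreement with condition (xiv).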
Observing from (2.7) that XiΘi=ViΘ, and letting K=Vi in Theorem 3.2, we obtain the following results:
Corollary 3.4. Suppose that the partial mean matrix XiΘi is estimable under (1.1), and let OLSE(XiΘi) and BLUE(XiΘi) be as given in (2.16) and (2.28), respectively, i=1,…,k. Then, the following 18 statistical and algebraic statements are equivalent:
(ⅰ) OLSE(XiΘi)=BLUE(XiΘi) holds definitely (with probability 1), i=1,2,…,k.
(ⅱ) Cov(→OLSE(XiΘi))=Cov(→BLUE(XiΘi)), i=1,2,…,k.
(ⅲ) Cov{→OLSE(XiΘi), →Y}=Cov{→BLUE(XiΘi), →Y}, i=1,2,…,k.
(ⅳ) Cov{→OLSE(XiΘi), →Y}=Cov{→BLUE(XiΘi), →OLSE(XΘ)}, i=1,2,…,k.
(ⅴ) Cov{→OLSE(XiΘi), →Y−→OLSE(XΘ)}=0, i=1,2,…,k.
(ⅵ) ViX+=[Vi,0][X,ΣΣ1X⊥]+, i=1,2,…,k.
(ⅶ) ViX+ΣΣ1=[Vi,0][X,ΣΣ1X⊥]+ΣΣ1, i=1,2,…,k.
(ⅷ) ViX+ΣΣ1=[Vi,0][X,ΣΣ1X⊥]+ΣΣ1PX, i=1,2,…,k.
(ⅸ) ViX+ΣΣ1(ViX+)T=[Vi,0][X,ΣΣ1X⊥]+ΣΣ1([Vi,0][X,ΣΣ1X⊥]+)T, i=1,2,…,k.
(ⅹ) ViX+ΣΣ1X⊥=0, i=1,2,…,k.
(xi) X⊥[ΣΣ1X,0]F[XTX,VTi]=0, i=1,2,…,k.
(xii) X⊥ΣΣ1W⊥iXi=0, i=1,2,…,k.
(xiii) r[ΣΣ1XX0XTX0VTi]=r[ΣΣ1XXXTX0]=2r(X), i=1,2,…,k.
(xiv) r[ΣΣ1XXWTiX0]=r(X)+r(Wi), i=1,2,…,k.
(xv) R[(ViX+ΣΣ1)T]⊆R(X), i=1,2,…,k.
(xvi) R[XTΣΣ1X⊥0]⊆R[XTXVi], i=1,2,…,k.
(xvii) R[0VTi]⊆R[ΣΣ1XXXTX0], i=1,2,…,k.
(xviii) R(ΣΣ1W⊥iXi)⊆R(X), i=1,2,…,k.
Proof. The equivalences of (ⅰ)–(ⅴ) follow from Theorem 3.2 (ⅰ)–(ⅵ). The equivalences of the matrix equalities and relations in (ⅵ)–(xviii) were collected and proved in [38]; see also [33].
Concerning the equalities between the OLSEs and the BLUEs of the whole and partial mean parameter matrices in (1.1), we have the following results:
Theorem 3.5. Suppose that all XiΘi in (1.1) are estimable, i=1,…,k. Then, the following 18 statistical statements are equivalent:
(i) OLSE(XΘ) = BLUE(XΘ) holds definitely (with probability 1).
(ii) Cov(→OLSE(XΘ)) = Cov(→BLUE(XΘ)).
(iii) Cov(→Y − →OLSE(XΘ)) = Cov(→Y − →BLUE(XΘ)).
(iv) Cov(→Y) = Cov(→OLSE(XΘ)) + Cov(→Y − →OLSE(XΘ)).
(v) Cov(→OLSE(XΘ)) = Cov{→BLUE(XΘ), →Y}.
(vi) Cov(→OLSE(XΘ)) = Cov{→BLUE(XΘ), →OLSE(XΘ)}.
(vii) Cov{→OLSE(XΘ), →Y} = Cov{→BLUE(XΘ), →Y}.
(viii) Cov{→OLSE(XΘ), →Y} = Cov{→BLUE(XΘ), →OLSE(XΘ)}.
(ix) Cov{→OLSE(XΘ), →Y} = Cov{→Y, →OLSE(XΘ)}.
(x) Cov{→Y − →OLSE(XΘ), →Y} = Cov{→Y, →Y − →OLSE(XΘ)}.
(xi) Cov{→Y − →OLSE(XΘ), →OLSE(XΘ)} = Cov{→OLSE(XΘ), →Y − →OLSE(XΘ)} = 0.
(xii) Cov{→Y − →OLSE(XΘ), →OLSE(XΘ)} + Cov{→OLSE(XΘ), →Y − →OLSE(XΘ)} = 0.
(xiii) Cov{→OLSE(XΘ), →Y − →OLSE(XΘ)} = 0.
(xiv) All OLSE(XᵢΘᵢ) = BLUE(XᵢΘᵢ) hold definitely (with probability 1), i = 1, 2, …, k.
(xv) All Cov(→OLSE(XᵢΘᵢ)) = Cov(→BLUE(XᵢΘᵢ)) hold, i = 1, 2, …, k.
(xvi) All Cov{→OLSE(XᵢΘᵢ), →Y} = Cov{→BLUE(XᵢΘᵢ), →Y} hold, i = 1, 2, …, k.
(xvii) All Cov{→OLSE(XᵢΘᵢ), →Y} = Cov{→BLUE(XᵢΘᵢ), →OLSE(XΘ)} hold, i = 1, 2, …, k.
(xviii) All Cov{→Y − →OLSE(XΘ), →OLSE(XᵢΘᵢ)} = 0 hold, i = 1, 2, …, k.
Proof. The equivalences of (i)–(xiii) follow from Corollary 3.3, and the equivalences of (xiv)–(xviii) follow from Corollary 3.4. If (i) holds, then R(Σ₁X) ⊆ R(X) holds by Corollary 3.3 (xxviii), and therefore the rank equality in Corollary 3.4 (xiii) holds as well, so that (i) implies (xiv)–(xviii). Conversely, summing the k equalities in (xiv) and combining the resulting equality with (2.18) and (2.31), we obtain (i).
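Statement (i) of Theorem 3.5 can also be illustrated numerically via the projector representations behind conditions (vi)–(ix) of Corollary 3.4: OLSE(XΘ) = PX·Y, while the BLUE can be computed as [X, 0][X, Σ₁X⊥]⁺Y. The sketch below (an illustrative construction of ours, with Σ₁ chosen so that R(Σ₁X) ⊆ R(X)) confirms that the two estimators coincide in that case.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, m = 8, 3, 4                    # n observations, p regressors, m responses
X = rng.standard_normal((n, p))
Y = rng.standard_normal((n, m))

PX = X @ np.linalg.pinv(X)           # orthogonal projector onto R(X)
Xperp = np.eye(n) - PX
Sigma1 = 3.0 * PX + 1.0 * Xperp      # row covariance with R(Sigma1 X) ⊆ R(X)

olse = PX @ Y                        # OLSE(XΘ) = PX Y
# BLUE(XΘ) via the projector [X, 0][X, Sigma1 X⊥]⁺
A = np.hstack([X, Sigma1 @ Xperp])
B = np.hstack([X, np.zeros((n, n))])
blue = B @ np.linalg.pinv(A) @ Y
print(np.linalg.norm(olse - blue))   # numerically zero
```

Here Σ₁X⊥ = X⊥, whose columns are orthogonal to those of X, so [X, 0][X, X⊥]⁺ collapses to XX⁺ = PX and the two estimators agree exactly; with a generic Σ₁ the difference would be nonzero.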
Because the OLSEs and BLUEs of unknown parameters in MGLMs can all be represented by exact analytical formulas, many clear and significant facts about these two fundamental types of estimators can be obtained with standard tools of matrix analysis. In light of this, the preceding sections studied a group of problems on establishing connections between the OLSEs and BLUEs of unknown parameters in MGLMs through the organized use of exact algebraic methods and techniques from matrix theory, and derived a variety of algebraic and statistical characterizations of the equivalence of OLSEs and BLUEs under MGLMs. This study shows that such equivalence problems are not isolated facts, but possess diverse intrinsic links from both algebraic and statistical points of view. Moreover, since the results in the previous sections are stated in explicit analytical form, they are easy to interpret under various assumptions, which makes these equivalence facts usable in many different situations when estimating parameter spaces in MGLMs and describing the mathematical and statistical properties of the estimators. It is therefore natural to collect these equivalent algebraic and statistical facts together as a theoretical foundation for OLSE and BLUE problems in MGLMs. This work also illustrates that, although OLSEs and BLUEs are classic objects of study, new and deep statistical inference problems can still be posed about them, and novel results derived under general assumptions, by means of effective matrix analysis tools and skilled partitioned matrix calculations.
Hence, the contributions of this paper are closely related to current research on linear statistical inference in its broadest sense. Finally, we believe that the resulting approaches to the equivalences of OLSEs and BLUEs advance the algebraic methodology of the statistical analysis of MGLMs and will enable further methodological improvements in the field of multivariate analysis.
Bo Jiang: conducted the formal research and investigation; Yongge Tian: formulated the ideas and methodology for the research goals of the article and wrote the initial draft. All authors have read and approved the final version of the manuscript for publication.
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
We would like to express our sincere thanks to the anonymous reviewers for their helpful comments and suggestions on an earlier version of this paper.
The authors declare no conflict of interest.
[1] I. S. Alalouf, G. P. H. Styan, Characterizations of estimability in the general linear model, Ann. Statist., 7 (1979), 194–200. doi: 10.1214/aos/1176344564
[2] T. W. Anderson, An introduction to multivariate statistical analysis, 2 Eds., New York: Wiley, 1984.
[3] A. Basilevsky, Applied matrix algebra in the statistical sciences, New York: Dover Publications, 2013.
[4] D. Bertsimas, M. S. Copenhaver, Characterization of the equivalence of robustification and regularization in linear and matrix regression, Euro. J. Oper. Res., 270 (2018), 931–942. doi: 10.1016/j.ejor.2017.03.051
[5] N. H. Bingham, W. J. Krzanowski, Linear algebra and multivariate analysis in statistics: development and interconnections in the twentieth century, British Journal for the History of Mathematics, 37 (2022), 43–63. doi: 10.1080/26375451.2022.2045811
[6] R. Christensen, Linear models for multivariate, time series, and spatial data, New York: Springer, 1991. doi: 10.1007/978-1-4757-4103-2
[7] M. H. Ding, H. Y. Liu, G. H. Zheng, On inverse problems for several coupled PDE systems arising in mathematical biology, J. Math. Biol., 87 (2023), 86. doi: 10.1007/s00285-023-02021-4
[8] R. W. Farebrother, A. C. Aitken and the consolidation of matrix theory, Linear Algebra Appl., 264 (1997), 3–12. doi: 10.1016/S0024-3795(96)00398-9
[9] J. E. Gentle, Matrix algebra: theory, computations, and applications in statistics, 2 Eds., New York: Springer, 2017. doi: 10.1007/978-0-387-70873-7
[10] R. Gnanadesikan, Methods for statistical data analysis of multivariate observations, 2 Eds., New York: Wiley, 1997. doi: 10.1002/9781118032671
[11] D. A. Harville, Matrix algebra from a statistician's perspective, New York: Springer, 1997. doi: 10.1007/b98818
[12] B. Jiang, Y. G. Tian, On additive decompositions of estimators under a multivariate general linear model and its two submodels, J. Multivariate Anal., 162 (2017), 193–214. doi: 10.1016/j.jmva.2017.09.007
[13] B. Jiang, Y. G. Tian, On equivalence of predictors/estimators under a multivariate general linear model with augmentation, J. Korean Stat. Soc., 46 (2017), 551–561. doi: 10.1016/j.jkss.2017.04.001
[14] K. Kim, N. Timm, Univariate and multivariate general linear models: theory and applications with SAS, 2 Eds., New York: CRC Press, 2006.
[15] H. Y. Liu, C. W. K. Lo, Determining a parabolic system by boundary observation of its non-negative solutions with biological applications, Inverse Probl., 40 (2024), 025009. doi: 10.1088/1361-6420/ad149f
[16] R. Ma, Y. G. Tian, A matrix approach to a general partitioned linear model with partial parameter restrictions, Linear Multilinear A., 70 (2022), 2513–2532. doi: 10.1080/03081087.2020.1804521
[17] A. Markiewicz, S. Puntanen, All about the ⊥ with its applications in the linear statistical models, Open Math., 13 (2015), 33–50. doi: 10.1515/math-2015-0005
[18] A. Markiewicz, S. Puntanen, G. P. H. Styan, The legend of the equality of OLSE and BLUE: highlighted by C. R. Rao in 1967, In: Methodology and applications of statistics, Cham: Springer, 2021, 51–76. doi: 10.1007/978-3-030-83670-2_3
[19] G. Marsaglia, G. P. H. Styan, Equalities and inequalities for ranks of matrices, Linear Multilinear A., 2 (1974), 269–292. doi: 10.1080/03081087408817070
[20] S. K. Mitra, Generalized inverse of matrices and applications to linear models, Handbook of Statistics, 1 (1980), 471–512. doi: 10.1016/S0169-7161(80)80045-9
[21] K. E. Muller, P. W. Stewart, Linear model theory: univariate, multivariate, and mixed models, New York: Wiley, 2006. doi: 10.1002/0470052147
[22] S. C. Narula, P. J. Korhonen, Multivariate multiple linear regression based on the minimum sum of absolute errors criterion, Euro. J. Oper. Res., 73 (1994), 70–75. doi: 10.1016/0377-2217(94)90144-9
[23] S. C. Narula, J. F. Wellington, Multiple criteria linear regression, Euro. J. Oper. Res., 181 (2007), 767–772. doi: 10.1016/j.ejor.2006.06.026
[24] R. Penrose, A generalized inverse for matrices, Math. Proc. Cambridge, 51 (1955), 406–413. doi: 10.1017/S0305004100030401
[25] S. Puntanen, G. P. H. Styan, The equality of the ordinary least squares estimator and the best linear unbiased estimator, with comments by O. Kempthorne, S. R. Searle, and a reply by the authors, Am. Stat., 43 (1989), 153–161. doi: 10.1080/00031305.1989.10475644
[26] S. Puntanen, G. P. H. Styan, J. Isotalo, Matrix tricks for linear statistical models: our personal top twenty, Berlin: Springer, 2011. doi: 10.1007/978-3-642-10473-2
[27] C. R. Rao, S. K. Mitra, Generalized inverse of matrices and its applications, New York: Wiley, 1972.
[28] C. R. Rao, M. B. Rao, Matrix algebra and its applications to statistics and econometrics, Singapore: World Scientific, 1998. doi: 10.1142/9789812779281
[29] G. C. Reinsel, R. P. Velu, Multivariate reduced-rank regression: theory and applications, New York: Springer, 1998. doi: 10.1007/978-1-4757-2853-8
[30] J. S. Respondek, Matrix black box algorithms – a survey, B. Pol. Acad. Sci.-Tech., 70 (2022), e140535. doi: 10.24425/bpasts.2022.140535
[31] S. R. Searle, A. I. Khuri, Matrix algebra useful for statistics, 2 Eds., Hoboken: Wiley, 2017.
[32] G. A. F. Seber, Multivariate observations, Hoboken: Wiley, 2004. doi: 10.1002/9780470316641
[33] Y. G. Tian, On equalities of estimations of parametric functions under a general linear model and its restricted models, Metrika, 72 (2010), 313–330. doi: 10.1007/s00184-009-0255-2
[34] Y. G. Tian, A new derivation of BLUPs under random-effects model, Metrika, 78 (2015), 905–918. doi: 10.1007/s00184-015-0533-0
[35] Y. G. Tian, Matrix rank and inertia formulas in the analysis of general linear models, Open Math., 15 (2017), 126–150. doi: 10.1515/math-2017-0013
[36] Y. G. Tian, B. Jiang, Matrix rank/inertia formulas for least-squares solutions with statistical applications, Spec. Matrices, 4 (2016), 130–140. doi: 10.1515/spma-2016-0013
[37] Y. G. Tian, C. Wang, On simultaneous prediction in a multivariate general linear model with future observations, Stat. Probabil. Lett., 128 (2017), 52–59. doi: 10.1016/j.spl.2017.04.007
[38] Y. G. Tian, X. Zhang, On connections among OLSEs and BLUEs of whole and partial parameters under a general linear model, Stat. Probabil. Lett., 112 (2016), 105–112. doi: 10.1016/j.spl.2016.01.019
[39] Y. W. Yin, W. S. Yin, P. C. Meng, H. Y. Liu, The interior inverse scattering problem for a two-layered cavity using the Bayesian method, Inverse Probl. Imag., 16 (2022), 673–690. doi: 10.3934/ipi.2021069