1. Introduction
Consider the least squares (LS) problem involving Kronecker products
min_{x∈Rnq} ‖(A⊗B)x−b‖2,   (1.1)
where A∈Rm×n, B∈Rp×q, b∈Rmp, and where the Kronecker product [13] is defined by
A⊗B=[aijB]∈Rmp×nq, with aij the (i,j) entry of A.
LS problems involving Kronecker products of the type (1.1) arise in many areas of application, including signal and image processing, photogrammetry, fast transform algorithms, the Lyapunov approach to stability theory, circuits and systems, and stochastic matrices [8,19]. Consequently, LS problems involving Kronecker products have attracted many researchers to the study of their algorithms [2,9,10,18].
For a given problem, a condition number measures the worst-case sensitivity of its solution to small perturbations in the input data, whereas backward errors reveal the stability of a numerical method. Combined with the backward error, a condition number provides a first-order upper bound on the error in the computed solution [15]. In [22], the authors gave upper bounds on the normwise, mixed and componentwise condition numbers for the Kronecker product linear system (A⊗B)x=b. The level-2 condition numbers for Kronecker product linear systems have been studied by Kang and Xiang [17]. Xiang and Wei [23] derived explicit expressions for the normwise, mixed and componentwise condition numbers of Kronecker product linear systems with multiple right-hand sides. For the LS problem (1.1) involving Kronecker products, Diao et al. [7] studied its conditioning theory when A and B are of full column rank. Chen and Li [5] presented explicit expressions for the normwise, mixed and componentwise condition numbers of the weighted least squares problem involving Kronecker products when A and B are of full column rank. In this paper, we present upper bounds for the normwise, mixed and componentwise condition numbers of the LS problem (1.1) when A or B is rank deficient.
Since rank(A⊗B)=rank(A)rank(B), the matrix A⊗B is rank deficient whenever A or B is rank deficient. In this case, the LS solution to (1.1) always exists but is not unique. Therefore we consider the unique minimum norm LS solution xLS=(A⊗B)†b=(A†⊗B†)b, where A† denotes the Moore-Penrose inverse of A. Moreover, when A⊗B is rank deficient, small changes to A or B can produce large changes to xLS=(A†⊗B†)b; see Example 1. In other words, a condition number of xLS with respect to unrestricted perturbations of a rank deficient A⊗B does not exist or is "infinite". Hence, in this paper, we present the normwise, mixed and componentwise condition numbers of the LS problem (1.1) by restricting the perturbation matrices ΔA or ΔB, i.e. ΔA∈S or ΔB∈T, where
S={ΔA∈Rm×n: R(ΔA)⊆R(A), R((ΔA)T)⊆R(AT)}
and
T={ΔB∈Rp×q: R(ΔB)⊆R(B), R((ΔB)T)⊆R(BT)}.
Here R(A) denotes the range (column space) of A.
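As a quick numerical illustration (a Python/NumPy sketch, not part of the original development), one can check that the Moore-Penrose inverse distributes over the Kronecker product and that xLS=(A†⊗B†)b is indeed the minimum norm LS solution, even when A is rank deficient:

```python
import numpy as np

rng = np.random.default_rng(0)

# Rank-deficient A (rank 1) and full-rank B; then rank(A ⊗ B) = rank(A) * rank(B).
A = np.outer(rng.standard_normal(3), rng.standard_normal(2))   # 3x2, rank 1
B = rng.standard_normal((2, 2))                                # 2x2, rank 2
b = rng.standard_normal(6)                                     # b in R^{mp} = R^6

K = np.kron(A, B)
# The Moore-Penrose inverse distributes over the Kronecker product:
assert np.allclose(np.linalg.pinv(K), np.kron(np.linalg.pinv(A), np.linalg.pinv(B)))

# The minimum-norm LS solution x_LS = (A† ⊗ B†) b agrees with lstsq applied to A ⊗ B,
# which also returns the minimum-norm solution in the rank-deficient case.
x_ls = np.kron(np.linalg.pinv(A), np.linalg.pinv(B)) @ b
x_ref = np.linalg.lstsq(K, b, rcond=None)[0]
assert np.allclose(x_ls, x_ref)
print("ok")
```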
Example 1. Let
By simple computations, we have
and
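The discontinuity described above can be reproduced with a small hypothetical example (the matrices below are illustrative stand-ins, not the data of Example 1): an unrestricted perturbation of a rank deficient A changes its rank and blows up the pseudoinverse, while a perturbation satisfying the range conditions of S does not.

```python
import numpy as np

delta = 1e-8
A = np.array([[1.0, 0.0],
              [0.0, 0.0]])          # rank deficient

# Unrestricted perturbation (not in S): the rank jumps from 1 to 2,
# and the pseudoinverse changes by O(1/delta).
dA_bad = np.array([[0.0, 0.0],
                   [0.0, delta]])
change_bad = np.linalg.norm(np.linalg.pinv(A + dA_bad) - np.linalg.pinv(A), 2)

# Perturbation restricted to S, i.e. R(dA) ⊆ R(A) and R(dA^T) ⊆ R(A^T):
# the change in the pseudoinverse stays of order delta.
dA_good = np.array([[delta, 0.0],
                    [0.0, 0.0]])
change_good = np.linalg.norm(np.linalg.pinv(A + dA_good) - np.linalg.pinv(A), 2)

print(change_bad, change_good)   # ~1e8 versus ~1e-8
assert change_bad > 1e7 and change_good < 1e-7
```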
Throughout the paper, for given positive integers m and n, denote by Rn the space of n-dimensional real column vectors, by Rm×n the space of all m×n real matrices, and by ‖⋅‖2 and ‖⋅‖F the 2-norm and the Frobenius norm of their arguments, respectively. Given a matrix X=[xij]∈Rm×n, ‖X‖max, X† and XT denote the max norm, given by ‖X‖max=maxi,j|xij|, the Moore-Penrose inverse and the transpose of X, respectively, and |X| is the matrix whose elements are |xij|. For matrices X=[xij], Y=[yij]∈Rm×n, X≤Y means xij≤yij for all i,j, and we define the entrywise quotient X/Y=[zij]∈Rm×n by zij=xij/yij, with the convention that zij=0 whenever yij=0.
The following lemmas will be used in the later discussion.
Lemma 1.1. [13] For any matrices X=(xij)∈Rm×n, Y∈Rp×q and Z∈Rn×p, we have
vec(XZY)=(YT⊗X)vec(Z).
Furthermore, with vec(⋅) stacking columns, we have
a⊗c=Kmn(c⊗a),
where a∈Rm, c∈Rn, and Kmn∈Rmn×mn is the permutation matrix defined by
Kmn=Σi=1..m Σj=1..n Eij⊗EijT.
Here each Eij∈Rm×n has entry 1 in position (i,j) and all other entries are zero.
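The identities of Lemma 1.1 can be verified numerically. The sketch below (in Python/NumPy) builds Kmn directly from its defining action Kmn vec(X)=vec(XT) and checks it against the sum of Kronecker products of the Eij, the vec identity, and the swap of Kronecker factors of vectors:

```python
import numpy as np

def vec(X):
    # Stack the columns of X into one long vector (column-major order).
    return X.reshape(-1, order="F")

def commutation(m, n):
    # K_{mn} in R^{mn x mn} with K_{mn} vec(X) = vec(X^T) for X in R^{m x n}.
    K = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            K[i * n + j, j * m + i] = 1.0
    return K

rng = np.random.default_rng(1)
m, n, p, q = 3, 4, 4, 2
X = rng.standard_normal((m, n))
Z = rng.standard_normal((n, p))
Y = rng.standard_normal((p, q))
K = commutation(m, n)

# vec(XZY) = (Y^T ⊗ X) vec(Z)
assert np.allclose(vec(X @ Z @ Y), np.kron(Y.T, X) @ vec(Z))

# K_{mn} vec(X) = vec(X^T) and a ⊗ c = K_{mn}(c ⊗ a) for a in R^m, c in R^n
assert np.allclose(K @ vec(X), vec(X.T))
a, c = rng.standard_normal(m), rng.standard_normal(n)
assert np.allclose(np.kron(a, c), K @ np.kron(c, a))

# K_{mn} agrees with the sum of Kronecker products of the unit matrices E_ij.
Ksum = np.zeros((m * n, m * n))
for i in range(m):
    for j in range(n):
        Eij = np.zeros((m, n))
        Eij[i, j] = 1.0
        Ksum += np.kron(Eij, Eij.T)
assert np.allclose(K, Ksum)
print("identities verified")
```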
Lemma 1.2. [12] If E∈Rn×n and ‖E‖2<1, then In−E is nonsingular and
(In−E)−1=Σk=0..∞ Ek,  ‖(In−E)−1‖2 ≤ 1/(1−‖E‖2).
Lemma 1.3. [3] If A, ΔA∈Rm×n satisfy ‖A†ΔA‖2<1, R(ΔA)⊆R(A) and R((ΔA)T)⊆R(AT), then
(A+ΔA)†=(In+A†ΔA)−1A†.
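A numerical sanity check of Lemma 1.3 (a sketch; the projection construction of ΔA below is our own choice, used only to enforce the range conditions):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 5, 3
A = rng.standard_normal((m, 2)) @ rng.standard_normal((2, n))  # rank 2 < min(m, n)
Ap = np.linalg.pinv(A)

# Build dA with R(dA) ⊆ R(A) and R(dA^T) ⊆ R(A^T) by projecting a random E
# onto the ranges of A and A^T (A@Ap and Ap@A are the orthogonal projectors).
E = rng.standard_normal((m, n))
dA = (A @ Ap) @ E @ (Ap @ A)
dA *= 0.1 / np.linalg.norm(Ap @ dA, 2)      # enforce ||A† dA||_2 = 0.1 < 1

# (A + dA)† = (I_n + A† dA)^{-1} A†
lhs = np.linalg.pinv(A + dA)
rhs = np.linalg.solve(np.eye(n) + Ap @ dA, Ap)
assert np.allclose(lhs, rhs)
print("lemma verified")
```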
2. Normwise condition numbers
The following weighted Frobenius norm
‖[αA, βb]‖F=(α2‖A‖F2+β2‖b‖22)1/2
was first used by Gratton for deriving the normwise condition number for the linear least squares problem [14]. Here ‖⋄‖F denotes the Frobenius norm of a matrix, and ‖⋄‖2 denotes the spectral norm of a matrix or the Euclidean norm of a vector; we refer to the latter uniformly as the 2-norm in the rest of this paper. Subsequently, this kind of norm was used for the partial condition number for the linear least squares problem [1] and the normwise condition number of the truncated singular value solution of a linear ill-posed problem [4]. A more general weighted Frobenius norm ‖[AT, βb]‖F, where T is a positive diagonal matrix, is sometimes chosen. This is the case, for instance, in [6], which gives explicit expressions for the normwise, mixed and componentwise condition numbers of Tikhonov regularization using this norm. In this paper, we use the weighted Frobenius norm and weighted 2-norm defined by
‖(A,B,b)‖F=(α2‖A‖F2+β2‖B‖F2+γ2‖b‖22)1/2
and
‖(A,B,b)‖2=(α2‖A‖22+β2‖B‖22+γ2‖b‖22)1/2
with α,β,γ>0. These norms are very flexible, since they allow us to monitor the perturbations on A, B and b separately. For instance, large values of α yield condition numbers for problems where mainly B and b are perturbed.
Consider the perturbed LS problem of (1.1),
min_{x∈Rnq} ‖((A+ΔA)⊗(B+ΔB))x−(b+Δb)‖2,   (2.1)
where ΔA∈S, ΔB∈T and Δb are the perturbations of the input data A, B and b, respectively. The unique minimum norm LS solution to (2.1) is x̃LS=((A+ΔA)†⊗(B+ΔB)†)(b+Δb). We let the change in the solution be Δx=x̃LS−xLS.
Theorem 2.1. When A and B are rank deficient, the condition number
satisfies
where P=(xTLS⊗A†⊗B†)(In⊗Kqm⊗Ip)(Imn⊗vec(B)) and Q=(xTLS⊗A†⊗B†)(In⊗Kqm⊗Ip)(vec(A)⊗Ipq).
Proof. When ‖ΔA‖2 is sufficiently small, we may assume that ‖A†ΔA‖2<1. Then, from Lemmas 1.2 and 1.3 with R(ΔA)⊆R(A) and R((ΔA)T)⊆R(AT), neglecting the second-order terms gives
(A+ΔA)†=A†−A†ΔAA†.
Similarly, we have (B+ΔB)†=B†−B†ΔBB† when ‖ΔB‖2 is sufficiently small and ΔB∈T. Thus, for small ΔA and ΔB, the linear term in Δx=((A+ΔA)†⊗(B+ΔB)†)(b+Δb)−(A†⊗B†)b is
Applying the operator vec to (2.2) and using Lemma 1.1, we have
Consequently, we have
Since (A†⊗B†)Δb−(A†⊗B†)(A⊗ΔB+ΔA⊗B)xLS is the linear term in Δx, Theorem 2.1 holds.
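The first-order expression obtained in the proof can be checked numerically: for small admissible perturbations, Δx agrees with its linear term (A†⊗B†)Δb−(A†⊗B†)(A⊗ΔB+ΔA⊗B)xLS up to second order. A sketch (the projections used to generate ΔA∈S and ΔB∈T are our own choice):

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, p, q = 4, 3, 3, 2
A = rng.standard_normal((m, 2)) @ rng.standard_normal((2, n))   # rank 2: rank deficient
B = np.outer(rng.standard_normal(p), rng.standard_normal(q))    # rank 1: rank deficient
Ap, Bp = np.linalg.pinv(A), np.linalg.pinv(B)
b = rng.standard_normal(m * p)
x_ls = np.kron(Ap, Bp) @ b

eps = 1e-7
# Admissible directions: project random matrices so that dA ∈ S and dB ∈ T.
dA = eps * (A @ Ap) @ rng.standard_normal((m, n)) @ (Ap @ A)
dB = eps * (B @ Bp) @ rng.standard_normal((p, q)) @ (Bp @ B)
db = eps * rng.standard_normal(m * p)

# Exact change in the minimum-norm solution versus its linear term.
dx_exact = np.kron(np.linalg.pinv(A + dA), np.linalg.pinv(B + dB)) @ (b + db) - x_ls
dx_lin = np.kron(Ap, Bp) @ db \
       - np.kron(Ap, Bp) @ (np.kron(A, dB) + np.kron(dA, B)) @ x_ls

rel = np.linalg.norm(dx_exact - dx_lin) / np.linalg.norm(dx_exact)
print(rel)            # the linearization error is of second order in eps
assert rel < 1e-3
```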
The formula for κ(F)(A,B,b)upper1 in Theorem 2.1 involves the permutation matrix Kqm and Kronecker products, which makes it inefficient to compute. To overcome this shortcoming, the next corollary provides an easily computable upper bound.
Corollary 2.1. For the estimate κ(F)(A,B,b)upper1 in Theorem 2.1, we have
Proof. It follows from Lemma 1.1 that
which together with Theorem 2.1 completes the proof of this corollary.
The normwise condition number under the weighted 2-norm is given in the following theorem.
Theorem 2.2. When A and B are rank deficient, the condition number
satisfies
Proof. By taking the 2-norm of (2.2) and using Lemma 1.1, we obtain
which together with the definition of κ(2)(A,B,b) completes the proof of this theorem.
We now give the upper bound estimates of the normwise condition numbers for the LS problem (1.1) under the weighted Frobenius norm and the weighted 2-norm when only A and b are perturbed. These results can be proved in the same manner as Theorems 2.1 and 2.2, and hence we state them in the following theorem without proof.
Theorem 2.3. When A is rank deficient, we have
and
where P=(xTLS⊗A†⊗B†)(In⊗Kqm⊗Ip)(Imn⊗vec(B)) and
When B=1, the problem (1.1) reduces to the classical LS problem minx∈Rn‖Ax−b‖2, and it follows from Theorem 2.3 that
The upper bound estimate in the above inequality was proved to be the exact expression of κ(F)(A,b) in [21, Corollary 2.2]. Therefore, we conjecture that the upper bound estimates in Theorems 2.1, 2.2 and 2.3 are the exact expressions of the corresponding normwise condition numbers. This needs further study in the future.
3. Mixed and componentwise condition numbers
The normwise condition number measures both the input and output errors by norms. Norms can tell us the overall size of a perturbation but not how that size is distributed among the elements it perturbs, and this information can be important when the data is badly scaled or contains many zeros [20]. To take into account the relative size of each data component and, in particular, possible data sparseness, componentwise condition numbers have been increasingly considered. These are mostly of two kinds: mixed and componentwise. The terminology of mixed and componentwise condition numbers was perhaps first used by Gohberg and Koltracht [11]. We adopt their terminology and define the mixed and componentwise condition numbers for the LS problem (1.1) as follows:
and
We assume that xLS≠0 for m(A,B,b) and xLS has no zero entries for c(A,B,b).
The following theorem gives the upper bounds for the mixed and componentwise condition numbers of the LS problem (1.1).
Theorem 3.1. When A and B are rank deficient, we have
and
where P=(xTLS⊗A†⊗B†)(In⊗Kqm⊗Ip)(Imn⊗vec(B)) and Q=(xTLS⊗A†⊗B†)(In⊗Kqm⊗Ip)(vec(A)⊗Ipq).
Proof. According to |ΔA|≤ε|A|, we know that the zero elements of A are not permitted to be perturbed. Therefore,
vec(ΔA)=DAD†Avec(ΔA),
where DA=diag(vec(A)). Similarly, we have vec(ΔB)=DBD†Bvec(ΔB) and Δb=DbD†bΔb with DB=diag(vec(B)) and Db=diag(b). Thus the linear term (A†⊗B†)Δb−(A†⊗B†)(A⊗ΔB+ΔA⊗B)xLS of Δx can be rewritten as
Taking the infinity norm and using the assumption |ΔA|≤ε|A|, |ΔB|≤ε|B| and |Δb|≤ε|b|, we have
Since (A†⊗B†)Δb−(A†⊗B†)(A⊗ΔB+ΔA⊗B)xLS is the linear term of Δx, m(A,B,b) is bounded by
where e is an mn+mp+pq dimensional vector with all entries equal to one.
Recall that in the definition of c(A,B,b), we assume that xLS has no zero entries. Hence, it follows from (3.1) and the assumption |ΔA|≤ε|A|, |ΔB|≤ε|B| and |Δb|≤ε|b| that
where DxLS=diag(vec(xLS)). Hence, we have
Using Lemma 1.1, we can get
Similarly, we can deduce that
Because of the monotonicity of the infinity norm, the upper bounds m(A,B,b)upper2 and c(A,B,b)upper2 can be obtained by applying the above two inequalities to m(A,B,b)upper1 and c(A,B,b)upper1 and using the triangle inequality for matrix norms.
The following theorem gives the upper bound estimates of the mixed and componentwise condition numbers for the LS problem (1.1) when only A and b are perturbed, which can be proved in the same way as Theorem 3.1.
Theorem 3.2. When A is rank deficient, we have
and
where P=(xTLS⊗A†⊗B†)(In⊗Kqm⊗Ip)(Imn⊗vec(B)).
When B=1, it follows from Theorem 3.2 that
and
which have been obtained in [16].
4. Numerical experiments
We consider the LS problem (1.1) with
We first compare κ(F)(A,B,b)upper1, κ(F)(A,B,b)upper2 and κ(2)(A,B,b)upper with the upper bounds of the mixed and componentwise condition numbers given in Theorem 3.1. The computations were carried out in MATLAB R2015b with unit roundoff 2.2204×10−16, and the results are listed in Table 1. From Table 1, we find that as the (1,1)-element of A increases, the upper bounds of the normwise condition numbers become larger and larger, while the upper bounds of the mixed and componentwise condition numbers do not change. This is mainly because the mixed and componentwise condition numbers take the scaling of the entries of A into account, whereas the normwise condition numbers ignore it.
For i=0, suppose the perturbations are ΔA=10−j×A, ΔB=10−j×B and Δb=10−j×rand(9,1), where rand(⋅) is the MATLAB function. Obviously, ΔA∈S and ΔB∈T. Define ε1=‖(ΔA,ΔB,Δb)‖F/‖(A,B,b)‖F, ε2=‖(ΔA,ΔB,Δb)‖2/‖(A,B,b)‖2 and ε3=max{ε: |ΔA|≤ε|A|, |ΔB|≤ε|B|, |Δb|≤ε|b|}. It then follows from the definitions of κ(F)(A,B,b), κ(2)(A,B,b), m(A,B,b) and c(A,B,b) that
and
for small ε1, ε2 and ε3. As shown in Table 2, the error bounds given by the upper bounds of the condition numbers are at most two orders of magnitude larger than the actual errors. This illustrates that the estimates κ(F)(A,B,b)upper1, κ(2)(A,B,b)upper, m(A,B,b)upper1 and c(A,B,b)upper1 estimate their corresponding condition numbers well.
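A sketch of such an experiment in Python/NumPy (with hypothetical stand-in data, since the test matrices of this section are not reproduced here) computes ε3 and the actual normwise and componentwise forward errors for the scaled perturbations ΔA=10−j×A, ΔB=10−j×B, Δb=10−j×rand:

```python
import numpy as np

rng = np.random.default_rng(4)
# Hypothetical stand-in data; the paper's test matrices are not reproduced here.
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [1.0, 2.0]])            # rank 1, so A ⊗ B is rank deficient
B = rng.standard_normal((3, 3))
b = rng.standard_normal(9)
Ap, Bp = np.linalg.pinv(A), np.linalg.pinv(B)
x = np.kron(Ap, Bp) @ b

j = 8
dA, dB = 10.0**-j * A, 10.0**-j * B   # dA ∈ S and dB ∈ T automatically
db = 10.0**-j * rng.random(9)         # mimics 10^-j * rand(9,1)

# Componentwise perturbation level: the smallest eps with |dA| <= eps|A|, etc.
ratios = np.concatenate([(np.abs(dA) / np.abs(A)).ravel(),
                         (np.abs(dB) / np.abs(B)).ravel(),
                         np.abs(db) / np.abs(b)])
eps3 = ratios.max()

# Actual normwise and componentwise forward errors in the minimum-norm solution.
x_pert = np.kron(np.linalg.pinv(A + dA), np.linalg.pinv(B + dB)) @ (b + db)
err_norm = np.linalg.norm(x_pert - x) / np.linalg.norm(x)
err_comp = np.max(np.abs(x_pert - x) / np.abs(x))
print(eps3, err_norm, err_comp)
assert eps3 >= 0.99 * 10.0**-j
assert err_norm < 1e-3
```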
Acknowledgments
This work was supported by the Foundation for Distinguished Young Scholars of Gansu Province (Grant No.20JR5RA540).
Conflict of interest
The authors declare there is no conflict of interest.