1. Introduction
Low-rank matrix recovery, also known as robust PCA or sparse and low-rank matrix decomposition, aims to find the lowest-rank matrix from relatively few linear measurements. It arises in many fields such as collaborative filtering [1,2,3], machine learning [4], image alignment [5,6], signal processing [7], quantum state tomography [8], and more.
The low-rank matrix recovery problem was first introduced in [9,10,11,12] and can be regarded as a matrix analogue of compressed sensing [13]: it seeks to automatically identify the corrupted entries and recover the original matrix when some of its entries are severely corrupted or missing. Wright et al. [14] proved that the low-rank matrix A can be exactly recovered from the observation matrix D by solving the following convex optimization problem
$$\min_{A,E}\ \|A\|_*+\lambda\|E\|_1 \quad \text{s.t.}\quad D=A+E, \tag{1.1}$$
where $\|A\|_*:=\sum_{k=1}^{r}\sigma_k(A)$ is called the nuclear norm of $A$, $\sigma_k(A)$ denotes the $k$-th largest singular value of $A\in\mathbb{R}^{n_1\times n_2}$ with $\mathrm{rank}(A)=r$, $\|E\|_1$ denotes the sum of the absolute values of the entries of $E$, and $\lambda$ is a positive weighting parameter.
There are many studies on the solution of model (1.1), from both theoretical and algorithmic points of view. For example, the iterative thresholding (IT) algorithm [14] solves model (1.1) using a soft thresholding operator and linearized Bregman iteration. The iterative hard thresholding (IHT) algorithm [15] was proposed, motivated by the effectiveness of the hard thresholding operator in compressed sensing. In addition, the accelerated proximal gradient (APG) method [16], the singular value thresholding (SVT) algorithm [17], the dual algorithm [18], the augmented Lagrange multiplier (ALM) algorithm [19], and the hybrid augmented Lagrange multiplier (HALM) algorithm [20] were later presented to deal with the convex problem (1.1). For further details and derivations on the low-rank matrix recovery problem, we refer the reader to these works and the references therein.
This paper aims mainly at establishing a new hybrid thresholding iterative algorithm with a diagonal-modify technique, based on the ALM algorithm, for recovering a low-rank matrix. Thanks to the diagonal-modify technique, the sequence of matrices generated by the new algorithm approximates the true solution well, which saves significant computational cost.
The rest of the paper is organized as follows. In Section 2, the notations used throughout this paper and related preliminaries are introduced. In Section 3, the proposed matrix recovery algorithm is presented in detail. The convergence analysis of the new algorithm is given in Section 4. Numerical experiments are reported in Section 5. Finally, we end the paper with concluding remarks in Section 6.
2. Notations and preliminaries
In this section, we provide some basic notations and preliminaries that are used in our analysis. $\mathbb{R}^{n_1\times n_2}$ denotes the set of $n_1\times n_2$ real matrices, and $A^T$ denotes the transpose of a matrix $A$. $\mathrm{diag}(d_{11},d_{22},\ldots,d_{nn})$ stands for the diagonal matrix with diagonal elements $d_{11},d_{22},\ldots,d_{nn}$. $\|A\|_1$ denotes the sum of the absolute values of the entries of $A$. $\|A\|_2$ denotes the spectral norm, i.e., the square root of the maximum eigenvalue of $A^TA$, and the Frobenius norm is $\|A\|_F=\sqrt{\sum_{j=1}^{n_2}\sum_{i=1}^{n_1}a_{ij}^2}$. $\mathrm{tr}(A)$ denotes the trace of $A$, and the standard inner product between two matrices is $\langle X,Y\rangle=\mathrm{tr}(X^TY)$.
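For illustration only, the following NumPy sketch evaluates these quantities for small arbitrary example matrices; it is not part of the proposed method.

```python
import numpy as np

A = np.array([[1.0, -2.0], [3.0, 4.0]])
B = np.array([[0.5, 1.0], [2.0, -1.0]])

entrywise_l1 = np.abs(A).sum()        # ||A||_1: sum of absolute values of the entries
spectral     = np.linalg.norm(A, 2)   # ||A||_2: largest singular value of A
frobenius    = np.linalg.norm(A, 'fro')  # ||A||_F: square root of the sum of squared entries
inner        = np.trace(A.T @ B)      # <A, B> = tr(A^T B)
```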
Definition 1. (Singular Value Decomposition (SVD) [21]) The singular value decomposition (SVD) of a rank-$r$ matrix $A\in\mathbb{R}^{n_1\times n_2}$ is defined by
$$A=U\Sigma_r V^T,\qquad \Sigma_r=\mathrm{diag}(\sigma_1,\sigma_2,\ldots,\sigma_r),$$
where $U\in\mathbb{R}^{n_1\times r}$ and $V\in\mathbb{R}^{n_2\times r}$ are both column-orthogonal matrices, and $\sigma_1\ge\sigma_2\ge\cdots\ge\sigma_r>0$ are the positive singular values of $A$. Accordingly, $\|A\|_*:=\sum_{k=1}^{r}\sigma_k(A)$ denotes the nuclear norm of $A$.
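As a concrete example, the reduced SVD and the nuclear norm can be computed as follows (a minimal NumPy sketch; the test matrix is arbitrary):

```python
import numpy as np

A = np.outer(np.arange(1.0, 5.0), np.arange(1.0, 4.0))  # a 4x3 rank-1 example matrix

U, s, Vt = np.linalg.svd(A, full_matrices=False)  # A = U diag(s) V^T, s in decreasing order
r = int(np.sum(s > 1e-12))                        # numerical rank
nuclear_norm = s[:r].sum()                        # ||A||_* = sum of the positive singular values
```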
Definition 2. For each $\tau\ge0$, the soft thresholding (shrinkage) operator $\mathcal{S}_\tau$ (see [17]) is defined by
$$\mathcal{S}_\tau(\sigma)=\mathrm{sgn}(\sigma)\max(|\sigma|-\tau,\,0),$$
and the hard thresholding operator $\eta_\tau$ (see [22]) is defined by
$$\eta_\tau(\sigma)=\begin{cases}\sigma, & |\sigma|>\tau,\\ 0, & |\sigma|\le\tau,\end{cases}$$
where $\sigma\in\mathbb{R}$ and $\tau$ is called a threshold.
Definition 3. (Hybrid Singular Value Thresholding Operator [20]) Let $X=U\Sigma_r V^T\in\mathbb{R}^{n_1\times n_2}$ be the SVD of $X$ mentioned above. For each $\tau\ge0$ and $z>1$, the hybrid thresholding operator $\mathcal{H}_{\tau,z}$ is defined by
Moreover, based on Definition 2, $\mathcal{S}_\tau(A):=U\mathcal{S}_\tau(\Sigma)V^T$ and $\eta_\tau(A):=U\eta_\tau(\Sigma)V^T$.
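The sketch below illustrates the three operators acting on the singular values of a matrix. Since the displayed formulas of [20] are not reproduced here, the hybrid rule is written from its verbal description in Section 3.2 (keep large values, shrink middle values by $\tau$, zero small values); the breakpoints at $z\tau$ and $\tau$ are an assumption and may differ from the exact piecewise form in [20].

```python
import numpy as np

def soft(sigma, tau):
    # soft thresholding: shrink each value toward zero by tau
    return np.maximum(sigma - tau, 0.0)

def hard(sigma, tau):
    # hard thresholding: keep values above tau, set the rest to zero
    return np.where(sigma > tau, sigma, 0.0)

def hybrid(sigma, tau, z):
    # assumed H_{tau,z}: keep large values, shrink middle values by tau, zero small values
    return np.where(sigma >= z * tau, sigma, np.maximum(sigma - tau, 0.0))

def apply_to_singular_values(A, op, *args):
    # X |-> U op(Sigma) V^T, as in S_tau(A) and eta_tau(A)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U @ np.diag(op(s, *args)) @ Vt
```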
3. Description of algorithms
In this section, a new fast algorithm is proposed after recalling a related algorithm for solving problem (1.1).
3.1. The augmented Lagrange multipliers (ALM) algorithm
It is well known that the partial augmented Lagrange function of (1.1) is
$$L(A,E,Y,\mu)=\|A\|_*+\lambda\|E\|_1+\langle Y,\,D-A-E\rangle+\frac{\mu}{2}\|D-A-E\|_F^2,$$
where $A,E,Y\in\mathbb{R}^{n_1\times n_2}$ and $\mu>0$ is the penalty parameter. The augmented Lagrange multiplier method has been successfully applied to the low-rank matrix recovery problem; it exhibits much better numerical behavior and much higher accuracy. The method is summarized in the following Algorithm 3.1.
For convenience, $[U_k,\Sigma_k,V_k]=\mathrm{svd}(\cdot)$ denotes the SVD of the corresponding matrix.
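For readers who wish to experiment, a minimal NumPy sketch of an inexact ALM iteration in the spirit of Algorithm 3.1 and [19] is given below. The default values of the penalty parameter and its growth factor are common choices from the RPCA literature and are assumptions, not necessarily those of Algorithm 3.1.

```python
import numpy as np

def soft(x, tau):
    # entrywise soft thresholding
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def alm_rpca(D, lam, mu=None, rho=1.5, max_iter=500, tol=1e-7):
    """Sketch of inexact ALM for min ||A||_* + lam*||E||_1 s.t. D = A + E."""
    Y = np.zeros_like(D)
    E = np.zeros_like(D)
    mu = mu if mu is not None else 1.25 / np.linalg.norm(D, 2)  # assumed default
    for _ in range(max_iter):
        # A-step: singular value soft thresholding
        U, s, Vt = np.linalg.svd(D - E + Y / mu, full_matrices=False)
        A = U @ np.diag(soft(s, 1.0 / mu)) @ Vt
        # E-step: entrywise soft thresholding
        E = soft(D - A + Y / mu, lam / mu)
        # multiplier and penalty updates
        Y = Y + mu * (D - A - E)
        mu = rho * mu
        if np.linalg.norm(D - A - E, 'fro') <= tol * np.linalg.norm(D, 'fro'):
            break
    return A, E
```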
3.2. The hybrid augmented Lagrange multiplier algorithm with diagonal-modify
The soft thresholding operator compresses the singular values by a constant and may therefore lose some effective large singular values, while the hard thresholding operator is discontinuous, which reduces the smoothness of the recovered matrix. To combine the advantages of the two operators, the hybrid singular value thresholding operator was designed. Based on the augmented Lagrange multiplier algorithm, the new algorithm employs the hybrid singular value thresholding operator $\mathcal{H}$ together with a diagonal-modify to deal with the singular values in each iteration.
The hybrid singular value thresholding operator works as follows: it keeps the large singular values unchanged, shrinks the middle-sized singular values by a constant, and sets the small singular values to 0. This operator compensates for the deficiencies of a single soft or hard thresholding operator and partially improves the accuracy of the recovered matrix. Further, a diagonal-modify $\mathcal{W}(A)=AW$ is used to improve the recovery efficiency at the $k$-th iteration, where the diagonal matrix $W_k=\mathrm{diag}(w_{11}^{(k)},w_{22}^{(k)},\ldots,w_{nn}^{(k)})\in\mathbb{R}^{n\times n}$ is obtained by
In fact, Eq (3.1) is easy to compute, since only a few simple arithmetic operations are required, without extra cost. The exact solution of (3.1) is given by
The new algorithm can be designed as follows:
The difference from the classical ALM algorithm lies in Steps 2 and 3. Moreover, Algorithm 3.2 includes the HALM algorithm in [20] as a special case when $W=I$.
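To make the difference concrete, the sketch below modifies the A-step of the ALM loop shown earlier: the singular values pass through the hybrid operator, and the diagonal-modify $W_k$ is then applied as $A_{k+1}\leftarrow A_{k+1}W_k$. Since Eq (3.1) is not reproduced above, `compute_W` is only a hypothetical placeholder (returning the identity, which reduces to HALM), and the choice of threshold level passed to the hybrid operator is an assumption.

```python
import numpy as np

def hybrid(sigma, tau, z):
    # assumed hybrid rule: keep large values, shrink middle values by tau, zero small values
    return np.where(sigma >= z * tau, sigma, np.maximum(sigma - tau, 0.0))

def compute_W(A_bar, n):
    # placeholder for the diagonal matrix W_k defined by Eq (3.1); identity recovers HALM
    return np.eye(n)

def dhalm_step(D, E, Y, mu, tau, z, lam):
    """One D-HALM-style iteration; Steps 2 and 3 differ from the classical ALM."""
    U, s, Vt = np.linalg.svd(D - E + Y / mu, full_matrices=False)
    A_bar = U @ np.diag(hybrid(s, tau, z)) @ Vt          # hybrid singular value thresholding
    A = A_bar @ compute_W(A_bar, D.shape[1])             # diagonal-modify W(A) = A W
    R = D - A + Y / mu
    E = np.sign(R) * np.maximum(np.abs(R) - lam / mu, 0.0)  # entrywise soft thresholding
    Y = Y + mu * (D - A - E)
    return A, E, Y
```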
4. Convergence analysis
In this section, we briefly present the convergence of Algorithm 3.2 by analyzing the properties of the sequences $\{A_k\}$, $\{E_k\}$, and $\{Y_k\}$. Since the D-HALM algorithm is a development of the classical ALM algorithm, its proof is a straightforward generalization. For the details of the proofs and techniques, we refer to [19] and the references therein.
Lemma 4.1. (see [17]) Let $A\in\mathbb{R}^{n_1\times n_2}$ be a matrix and $U\Sigma V^T$ be its SVD. Then the set of subgradients of the nuclear norm of $A$ is given by
$$\partial\|A\|_*=\{UV^T+W:\ W\in\mathbb{R}^{n_1\times n_2},\ U^TW=0,\ WV=0,\ \|W\|_2\le1\}.$$
Lemma 4.2. (see [19])
where $(A^*,E^*)$ and $Y^*$ are the optimal solutions to problem (1.1) and its dual problem, respectively.
Lemma 4.3. Let $\bar{Y}_k=Y_{k-1}+\mu(D-A_k-E_{k-1})$; then the sequences $\{\bar{Y}_k\}$ and $\{Y_k\}$ are bounded.
Proof. Let $\bar{A}_{k+1}=U_k\,\mathcal{D}_{\mu_k^{-1}}(\Sigma)V_k^T$; then we have
Thus,
Now we can obtain the following result:
From the update of $E_{k+1}$ in Algorithm 3.2, we can obtain
Hence,
The boundedness of the sequences $\{\bar{Y}_k\}$ and $\{Y_k\}$ is thus obtained.
Lemma 4.4. (see [19]) The subgradient of a convex function is a monotone operator; namely,
$$\langle x-y,\ u-v\rangle\ge0\quad\text{for all }u\in\partial f(x),\ v\in\partial f(y),$$
where $f$ is a convex function.
Lemma 4.5. (see [19]) The terms of the series
are nonnegative, and the series is convergent provided that $\mu_k$ is nondecreasing.
In Algorithm 3.2, we do not have to solve the sub-problem
exactly. Rather, updating $A_k$ and $E_k$ once each when solving this sub-problem is sufficient for them to converge to a local optimal solution of problem (1.1). This leads to an inexact ALM with diagonal-modify, similar to the IALM introduced in [19]. After the analysis of $A_k$, $E_k$, and $Y_k$ via the lemmas above, the validity and optimality of Algorithm 3.2 are guaranteed by the following theorem (see [19]).
Theorem 4.1. Suppose that $\mu_k$ is nondecreasing and the series $\sum_{k=1}^{\infty}\mu_k^{-1}$ diverges; then the sequence $\{(A_k,E_k)\}$ converges to the optimal solution $(A^*,E^*)$ of problem (1.1).
Proof. Because $Y^*\in\partial\|A^*\|_*$, $Y^*\in\partial(\lambda\|E^*\|_1)$, $Y_{k+1}\in\partial(\lambda\|E_{k+1}\|_1)$, and $\bar{Y}_{k+1}\in\partial\|A_{k+1}\|_*$, which follows from the optimality condition $0\in\partial_A L(A_{k+1},E_k,Y_k,\mu_k)$, we have
By analysis, we obtain
Obviously, we can obtain
Thus, $\|E_k-E^*\|^2+\mu_k^{-2}\|Y_k-Y^*\|^2$ is non-increasing, as shown in [19]. From Theorem 2 in [19], it holds that
we obtain that $(A^*,E^*)$ is the solution of the low-rank matrix recovery problem (1.1), which completes the proof.
5. Numerical experiments
In this section, some random data and video sequences are used to demonstrate that the proposed algorithm is feasible and effective for solving the low-rank matrix recovery problem (1.1). All the experiments were performed under Windows 11 and MATLAB R2019a running on a DELL laptop.
5.1. Random data
Results of four algorithms (APG, ALM, HALM, and D-HALM) are provided for $n\times n$ matrices with different ranks for problem (1.1). All numerical experiments were conducted on the same workstation. We analyze and compare the number of iterations (denoted by IT), the computing time in seconds (denoted by CPU(s)), and the relative errors (Error 1, Error 2), defined by
In the experiments, $p=m/n^2$ represents the sampling density, where $m$ is the number of observed entries. We denote the optimal solution by the ordered pair $(A^*,E^*)$ and the output by $(\bar{A},\bar{E})$. For Algorithm 3.2 (D-HALM), we choose the fixed weighting parameter $\lambda=1/\sqrt{n}$, set $Y_0=D/J(D)$ with $J(D)=\max(\|D\|_2,\lambda^{-1}\|D\|_\infty)$ [19], and empirically set the parameters $\mu_0=1.25\|D\|_2$, $\tau_0=0.5\|D\|_2$, $\epsilon_1=1\times10^{-8}$, $\epsilon_2=1\times10^{-7}$, and $z=1.4$. The ALM, HALM, and D-HALM algorithms share the same parameters.
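In code, this initialization reads roughly as follows (a NumPy sketch; `D` is the observed data matrix, and $\|D\|_\infty$ is taken as the largest entry in absolute value, as in [19]):

```python
import numpy as np

def init_dhalm(D):
    n = D.shape[1]                                  # here n1 = n2 = n
    lam  = 1.0 / np.sqrt(n)                         # weighting parameter lambda = 1/sqrt(n)
    J    = max(np.linalg.norm(D, 2),                # J(D) = max(||D||_2, ||D||_inf / lam)
               np.abs(D).max() / lam)
    Y0   = D / J                                    # Y_0 = D / J(D)
    mu0  = 1.25 * np.linalg.norm(D, 2)              # mu_0 = 1.25 ||D||_2
    tau0 = 0.5 * np.linalg.norm(D, 2)               # tau_0 = 0.5 ||D||_2
    eps1, eps2, z = 1e-8, 1e-7, 1.4                 # tolerances and hybrid parameter z
    return lam, Y0, mu0, tau0, eps1, eps2, z
```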
Tests were conducted for $n_1=n_2=n$, $p\in\{0.1,0.2,0.3,0.4\}$, with rank $r=0.005n$ for the underlying matrix to be reconstructed. The detailed comparison results are shown in Tables 1–4, and the computing-time curves of the different algorithms are plotted in Figures 1 and 2.
From Figures 1 and 2, it can be seen that the computing time of the D-HALM algorithm is always less than that of the other algorithms; the larger the matrix scale, the more obvious the effect.
As shown in Tables 1–4, the proposed algorithm and the ALM and HALM algorithms achieve almost the same numbers of iterations and relative errors. However, the CPU time of our algorithm is clearly less than that of the APG, ALM, and HALM algorithms.
By introducing the speed-up (denoted by SP) as follows,
it is noted that the maximum SP values over the APG, ALM, and HALM algorithms are approximately 7.3, 2.6, and 1.6 when the matrix size satisfies $n\le10000$, and approximately 6.1, 5.9, and 1.7 for larger matrices. The computational cost of the proposed algorithm is thus significantly lower than that of the other algorithms.
5.2. Video sequences
In this subsection, we use four surveillance video sequences (video 1: Hall of a business building; video 2: Mall; video 3: Bootstrap; video 4: Fountain) with different resolutions to compare the separation results of three ADMM-type algorithms (ALM, HALM, and D-HALM). One of 200 images of the four video sequences was randomly selected for testing. Figures 3–5 show the results of foreground and background separation of the four surveillance video sequences under the $D=A+E$ model. The errors and running times are reported in Table 5, where
From Table 5, it can be seen that the three algorithms achieve similar separation accuracy, while D-HALM requires noticeably less running time.
6. Conclusions
To solve the low-rank matrix recovery problem, we have proposed a hybrid singular value thresholding algorithm with diagonal-modify (D-HALM) based on the classical ALM algorithm. In the proposed algorithm, a hybrid singular value thresholding is used and a diagonal-modify is carried out in each iteration. The iterative sequence generated by Algorithm 3.2 converges to the optimal solution of the optimization model, which is guaranteed by the traditional convergence theory. In the experiments, the proposed algorithm was compared with the APG, ALM, and HALM algorithms. The experiments on randomly simulated data and video sequence separation show that the proposed algorithm takes less time and achieves higher precision for solving the low-rank matrix recovery problem; that is, the D-HALM algorithm is more efficient than the other algorithms. In addition, it is worth mentioning that the new algorithm can be further developed to solve tensor completion problems.
Use of AI tools declaration
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
Acknowledgments
The authors are very much indebted to the anonymous referees for their helpful comments and suggestions, which greatly improved the original manuscript of this paper. The authors gratefully acknowledge the support of the NSF of China (12371381), the special fund for science and technology innovation teams of Shanxi province (202204051002018), and the Shanxi higher education science and technology innovation project (2023L243).
Conflict of interest
The authors declare that they have no conflict of interest.