
Hard Thresholding Pursuit (HTP) is an important and efficient algorithm for reconstructing sparse signals. Unfortunately, the hard thresholding operator is independent of the objective function and hence can lead to numerical oscillation in the course of the iterations. To alleviate this drawback, the hard thresholding operator should be applied to a compressible vector. Motivated by this idea, we propose a new algorithm called Compressive Hard Thresholding Pursuit (CHTP) by first adding a compressive step to the standard HTP. Convergence analysis and stability of CHTP are established in terms of the restricted isometry property of a sensing matrix. Numerical experiments show that CHTP is competitive with mainstream algorithms such as HTP, Orthogonal Matching Pursuit (OMP) and Subspace Pursuit (SP) in both sparse signal reconstruction ability and average recovery runtime.
Citation: Liping Geng, Jinchuan Zhou, Zhongfeng Sun, Jingyong Tang. Compressive hard thresholding pursuit algorithm for sparse signal recovery[J]. AIMS Mathematics, 2022, 7(9): 16811-16831. doi: 10.3934/math.2022923
One of the important problems in signal processing is to recover an unknown sparse signal from a few measurements, which can be expressed as follows:
$$\min_x\ \frac{1}{2}\|y - Ax\|_2^2 \quad \text{s.t.}\ \|x\|_0\le s, \qquad (1.1)$$
where $\|x\|_0$ denotes the number of nonzero entries of $x\in\mathbb{R}^N$, $A\in\mathbb{R}^{m\times N}$ is the measurement matrix with $m\ll N$, and $y\in\mathbb{R}^m$ is the measurement vector. This model has been widely applied in many important areas, including machine learning, compressed sensing, signal processing, pattern recognition and wireless communication. Note that the $\ell_0$-norm is discontinuous and combinatorial in nature; hence, problem (1.1) is NP-hard.
Over the past decades, many efficient algorithms have been proposed for solving the model (1.1), including convex relaxation methods, greedy methods and thresholding-based methods, to name a few. For example, basis pursuit [1,2], $\ell_p$ algorithms [3,4], alternating projections [5] and conjugate gradient adaptive filtering [6] have been developed using convex optimization techniques. Greedy methods include orthogonal matching pursuit (OMP) [7,8,9], compressive sampling matching pursuit (CoSaMP) [10] and subspace pursuit (SP) [11,12]. Thresholding-based methods provide a simple way to maintain the feasibility of the iterates; they include iterative hard thresholding [13,14,15], hard thresholding pursuit [16,17,18], soft thresholding [19] and Newton-step hard thresholding [20]. Recently, Zhao [21,22] proposed a new thresholding operator, called optimal $s$-thresholding, which connects the thresholding step directly to the reduction of the objective function. For more information on the theoretical analysis and applications of algorithms for sparse signal recovery, see [23,24,25,26].
In this paper, we mainly focus on the hard thresholding algorithms due to their simple structure and low computational cost. One of the key steps in a hard thresholding algorithm is
$$x^{n+1} = H_s\big(x^n + A^T(y - Ax^n)\big),$$
where $H_s(\cdot)$ stands for the hard thresholding operator, which keeps the $s$ largest absolute components of a vector and zeroes out the rest. The main aim of this step is to ensure the feasibility of the iterates. However, three issues arise. First, the objective function can increase in some iterations, i.e., $\{\|y - Ax^n\|_2\}$ is not necessarily a decreasing sequence. Second, if $x^n + A^T(y - Ax^n)$ is a dense vector, keeping only its $s$ largest components may lose too much important information. Third, even for a non-dense vector, some useful indices may be missed when the difference between the $s$ largest and the $(s+j)$ largest (with $j>0$) components of $x^n + A^T(y - Ax^n)$ is very small. To the best of our knowledge, the first issue can be resolved by the optimal $s$-thresholding operator introduced by Zhao in [21], and the second has recently been addressed in [27] by partial gradient techniques, i.e., the full gradient $A^T(y - Ax^n)$ is replaced by the partial one $H_q\big(A^T(y - Ax^n)\big)$ with $q\ge s$.
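To make the operator concrete, here is a minimal NumPy sketch of $H_s(\cdot)$ and of one such gradient-projection step; the function names `hard_threshold` and `iht_step` are our own illustrative choices, not notation from the cited works.

```python
import numpy as np

def hard_threshold(z, s):
    """H_s(z): keep the s entries of z with largest absolute value, zero the rest."""
    out = np.zeros_like(z)
    keep = np.argsort(np.abs(z))[-s:]  # indices of the s largest |z_i|
    out[keep] = z[keep]
    return out

def iht_step(x, A, y, s):
    """One hard thresholding iteration: x^{n+1} = H_s(x^n + A^T(y - A x^n))."""
    return hard_threshold(x + A.T @ (y - A @ x), s)
```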
Inspired by these works, we study the aforementioned third issue in this paper. Precisely, a slightly larger support size $\beta$, greater than $s$, is selected first in each iteration. This increases the chance of correctly identifying the support set of a signal, particularly when the $s$ largest components are close to the $\beta$ largest components. A $\beta$-sparse vector $u^{n+1}$ is obtained by solving a least-squares problem over the subspace determined by the index set $U^{n+1} := L_\beta\big(x^n + A^T(y - Ax^n)\big)$, where $L_\beta(x)$ stands for the index set of the $\beta$ largest absolute entries of $x$. Subsequently, the hard thresholding operator $H_s(\cdot)$ is applied to $u^{n+1}$ to reconstruct an $s$-sparse vector. Finally, to further improve numerical performance, each iteration closes by seeking the vector that best fits the measurements over the prescribed support set via a pursuit step, also referred to as debiasing or orthogonal projection. This leads to the Compressive Hard Thresholding Pursuit (CHTP) algorithm. Numerical experiments demonstrate that CHTP performs best when $\beta$ is slightly greater than $s$.
The notation used in this paper is standard. For a given set $S\subseteq\{1,2,\ldots,N\}$, we denote by $|S|$ the cardinality of $S$ and by $\bar S := \{1,2,\ldots,N\}\setminus S$ the complement of $S$. For a fixed vector $x$, $x_S$ is obtained by retaining the elements of $x$ indexed by $S$ and zeroing out the rest. The support of $x$ is defined as $\mathrm{supp}(x) := \{i \mid x_i\ne0\}$. Given two sets $S_1$ and $S_2$, their symmetric difference is denoted by $S_1\Delta S_2 := (S_1\setminus S_2)\cup(S_2\setminus S_1)$. Since $S_1\setminus S_2$ and $S_2\setminus S_1$ are disjoint, the elementary inequality $a+b\le\sqrt{2(a^2+b^2)}$ gives
$$\|x_{S_1\setminus S_2}\|_2 + \|x_{S_2\setminus S_1}\|_2 \le \sqrt{2}\,\|x_{S_1\Delta S_2}\|_2, \quad \forall x\in\mathbb{R}^N.$$
The paper is organized as follows. The CHTP algorithm is proposed in Section 2. Error estimation and convergence analysis of CHTP are given in Section 3. Numerical results are reported in Section 4.
In order to solve problem (1.1), Blumensath and Davies [13] proposed the following Iterative Hard Thresholding (IHT) algorithm:
$$x^{n+1} := H_s\big(x^n + A^T(y - Ax^n)\big).$$
The negative gradient is used as the search direction, and the hard thresholding operator is employed to ensure sparse feasibility. Furthermore, by combining the IHT and CoSaMP algorithms, Foucart [17] proposed an algorithm called Hard Thresholding Pursuit (HTP), which has stronger numerical performance than IHT.
Algorithm: Hard Thresholding Pursuit (HTP).
Input: a measurement matrix A, a measurement vector y and a sparsity level s. Perform the following steps:
S1. Start with an $s$-sparse $x^0\in\mathbb{R}^N$, typically $x^0 = 0$;
S2. Repeat
$$U^{n+1} := L_s\big(x^n + A^T(y - Ax^n)\big), \qquad x^{n+1} := \arg\min\big\{\|y - Az\|_2 : \mathrm{supp}(z)\subseteq U^{n+1}\big\}, \qquad (2.1)$$
until a stopping criterion is met.
Output: the $s$-sparse vector $x^\ast$.
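The pursuit step in (2.1), a least-squares fit over a candidate support, is the workhorse shared by HTP and by CHTP below. A minimal sketch follows; the helper name `ls_over_support` is our own.

```python
import numpy as np

def ls_over_support(A, y, support):
    """argmin { ||y - Az||_2 : supp(z) in support }: least squares on the selected columns."""
    z = np.zeros(A.shape[1])
    z[support] = np.linalg.lstsq(A[:, support], y, rcond=None)[0]
    return z
```

HTP then simply alternates the support selection $U^{n+1} = L_s\big(x^n + A^T(y - Ax^n)\big)$ with this projection.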
In step (2.1) of HTP, $|U^{n+1}| = s$, so at most $s$ positions are taken as the basis at each iteration in the process of restoring the vector. However, a basis of only $s$ positions may not accurately capture the full gradient information. From a theoretical viewpoint, the less gradient information the selected basis carries, the harder it is for HTP to achieve good numerical performance. To further improve the efficiency of HTP, we propose the following CHTP algorithm, which allows more elements of gradient information to be retained at each iteration. In particular, if $\beta = s$, then $x^{n+1} = u^{n+1}$ in CHTP, so CHTP reduces to HTP in this special case.
Algorithm: Compressive Hard Thresholding Pursuit (CHTP).
Input: a measurement matrix A, a measurement vector y and a sparsity level s. Perform the following steps:
S1. Start with an $s$-sparse $x^0\in\mathbb{R}^N$, typically $x^0 = 0$;
S2. Repeat
$$\begin{aligned}
U^{n+1} &:= L_\beta\big(x^n + A^T(y - Ax^n)\big), \\
u^{n+1} &:= \arg\min\big\{\|y - Az\|_2 : \mathrm{supp}(z)\subseteq U^{n+1}\big\}, \\
T^{n+1} &:= L_s(u^{n+1}), \\
x^{n+1} &:= \arg\min\big\{\|y - Az\|_2 : \mathrm{supp}(z)\subseteq T^{n+1}\big\},
\end{aligned}$$
where $\beta\ge s$, until a stopping criterion is met.
Output: the $s$-sparse vector $x^\ast$.
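A minimal NumPy rendering of the full CHTP loop is sketched below, reusing the `ls_over_support` helper from the sketch above. The tolerance-based stopping rule here is a generic placeholder of our own; the criteria actually used in the experiments are described in Section 4.

```python
import numpy as np

def chtp(A, y, s, beta, max_iter=50, tol=1e-8):
    """Compressive Hard Thresholding Pursuit; beta = s recovers HTP."""
    x = np.zeros(A.shape[1])
    for _ in range(max_iter):
        g = x + A.T @ (y - A @ x)                 # gradient step
        U = np.argsort(np.abs(g))[-beta:]         # U^{n+1} = L_beta(...), beta >= s
        u = ls_over_support(A, y, U)              # beta-sparse compressive step
        T = np.argsort(np.abs(u))[-s:]            # T^{n+1} = L_s(u^{n+1})
        x_new = ls_over_support(A, y, T)          # s-sparse pursuit (debiasing) step
        if np.linalg.norm(x_new - x) <= tol * max(np.linalg.norm(x_new), 1.0):
            return x_new
        x = x_new
    return x
```

When `beta == s`, the set $T^{n+1}$ coincides with $U^{n+1}$ and the two least-squares steps collapse into one, which is exactly the HTP iteration.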
A detailed discussion of the numerical experiments, including the specific stopping criteria, is given in Section 4.
Theoretical analysis of CHTP is carried out in terms of the restricted isometry property (RIP) of a measurement matrix. First, recall that for a given matrix $A$ and two positive integers $p, q$, the norm $\|A\|_{p\to q}$ is defined as
$$\|A\|_{p\to q} := \max_{\|x\|_p\le1}\|Ax\|_q.$$
Definition 3.1. [1] Let $A\in\mathbb{R}^{m\times N}$ be a matrix with $m<N$. The $s$-th restricted isometry constant (RIC) of $A$, denoted by $\delta_s$, is defined as
$$\delta_s := \max_{S\subseteq\{1,2,\ldots,N\},\ |S|\le s}\big\|A_S^TA_S - I\big\|_{2\to2},$$
where $A_S$ is the submatrix of $A$ obtained by deleting the columns whose indices are not in $S$.
The $s$-th restricted isometry constant can be defined equivalently as the smallest number $\delta>0$ such that
$$(1-\delta)\|x\|_2^2 \le \|Ax\|_2^2 \le (1+\delta)\|x\|_2^2 \quad \text{for all } x \text{ with } \|x\|_0\le s. \qquad (3.1)$$
In particular, if $\delta_s<1$, we say that the matrix $A$ has the restricted isometry property (RIP) of order $s$.
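Computing $\delta_s$ exactly requires maximizing over all $\binom{N}{s}$ supports, which is intractable; however, Definition 3.1 suggests a simple empirical probe: sampling random supports $S$ and taking the largest $\|A_S^TA_S - I\|_{2\to2}$ yields a lower bound on $\delta_s$. A sketch, where the sampling scheme and trial count are arbitrary choices of ours:

```python
import numpy as np

def ric_lower_bound(A, s, trials=500, seed=0):
    """Monte Carlo lower bound on delta_s: max over sampled supports S of
    the spectral norm ||A_S^T A_S - I||_{2->2}."""
    rng = np.random.default_rng(seed)
    N = A.shape[1]
    best = 0.0
    for _ in range(trials):
        S = rng.choice(N, size=s, replace=False)
        G = A[:, S].T @ A[:, S] - np.eye(s)
        best = max(best, np.linalg.norm(G, 2))  # largest singular value of symmetric G
    return best
```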
The following inequalities follow from the definition of RIC and play an important role in the theoretical analysis of CHTP.
Lemma 3.1. [23] Given a vector $v\in\mathbb{R}^N$ and a set $S\subseteq\{1,2,\ldots,N\}$, one has the following statements:
(i) $\big\|\big((I - A^TA)v\big)_S\big\|_2 \le \delta_t\|v\|_2$ if $|S\cup\mathrm{supp}(v)|\le t$;
(ii) $\|(A^Tv)_S\|_2 \le \sqrt{1+\delta_t}\,\|v\|_2$ if $|S|\le t$.
The following lemma comes from [28], providing an estimate of the Lipschitz constant of the hard thresholding operator. Moreover, [28, Example 2.3] indicates that the constant $(\sqrt{5}+1)/2$ is tight.
Lemma 3.2. For any vector $z\in\mathbb{R}^N$ and any $s$-sparse vector $x\in\mathbb{R}^N$, one has
$$\|x - H_s(z)\|_2 \le \frac{\sqrt{5}+1}{2}\,\|(x-z)_{S\cup Z}\|_2,$$
where $S := \mathrm{supp}(x)$ and $Z := \mathrm{supp}(H_s(z))$.
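Lemma 3.2 is easy to sanity-check numerically: over random draws of $z$ and $s$-sparse $x$, the ratio $\|x - H_s(z)\|_2 / \|(x-z)_{S\cup Z}\|_2$ should never exceed $(\sqrt{5}+1)/2\approx1.618$. A quick Monte Carlo check, reusing `hard_threshold` from the earlier sketch (problem sizes are arbitrary choices of ours):

```python
import numpy as np

rng = np.random.default_rng(1)
N, s, worst = 20, 3, 0.0
for _ in range(10_000):
    z = rng.standard_normal(N)
    x = np.zeros(N)
    x[rng.choice(N, size=s, replace=False)] = rng.standard_normal(s)
    Hz = hard_threshold(z, s)
    mask = (x != 0) | (Hz != 0)            # indicator of S union Z
    denom = np.linalg.norm((x - z)[mask])
    if denom > 0:
        worst = max(worst, np.linalg.norm(x - Hz) / denom)
print(worst, "<=", (np.sqrt(5) + 1) / 2)   # the ratio stays below (sqrt(5)+1)/2
```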
The following result is inspired by [28, Lemma 3.3].
Lemma 3.3. Let $y = Ax + e$. Given an $s$-sparse vector $z$ and $\beta\ge s$, define
$$V := L_\beta\big(z + A^T(y - Az)\big).$$
Then,
$$\|x_{S\setminus V}\|_2 \le \sqrt{2}\,\big(\delta_{2s+\beta}\|x_S - z\|_2 + \sqrt{1+\delta_{s+\beta}}\,\|e'\|_2\big),$$
where $S := L_s(x)$ and $e' := Ax_{\bar S} + e$.
Proof. The case $S\subseteq V$ holds trivially, since then $x_{S\setminus V} = 0$. Now consider the remaining case $S\not\subseteq V$. Define
$$\Phi_{S\setminus V} := \big\|[z + A^T(y - Az)]_{S\setminus V}\big\|_2 \quad \text{and} \quad \Phi_{V\setminus S} := \big\|[z + A^T(y - Az)]_{V\setminus S}\big\|_2. \qquad (3.2)$$
Since $|V| = \beta \ge s = |S|$,
$$|S\setminus V| = |S| - |S\cap V| \le |V| - |S\cap V| = |V\setminus S|. \qquad (3.3)$$
Meanwhile, from the definition of $V$, the entries of $z + A^T(y - Az)$ supported on $S\setminus V$ are not greater in absolute value than its $\beta$ largest absolute entries, that is,
$$\big|\big(z + A^T(y - Az)\big)_i\big| \le \big|\big(z + A^T(y - Az)\big)_j\big| \quad \text{for any } i\in S\setminus V,\ j\in V\setminus S.$$
Together with (3.3), this leads to
$$\Phi_{S\setminus V} = \big\|\big(z + A^T(y - Az)\big)_{S\setminus V}\big\|_2 \le \big\|\big(z + A^T(y - Az)\big)_{V\setminus S}\big\|_2 = \Phi_{V\setminus S}. \qquad (3.4)$$
Define
$$\Phi_{S\Delta V} := \big\|[(x_S - z) - A^T(y - Az)]_{S\Delta V}\big\|_2.$$
Since $y = Ax_S + e'$ with $e' = Ax_{\bar S} + e$,
$$\begin{aligned}
\Phi_{S\Delta V} &= \big\|\big((x_S - z) - A^T(Ax_S + e' - Az)\big)_{S\Delta V}\big\|_2
= \big\|\big((I - A^TA)(x_S - z) - A^Te'\big)_{S\Delta V}\big\|_2 \\
&\le \big\|\big((I - A^TA)(x_S - z)\big)_{S\Delta V}\big\|_2 + \big\|(A^Te')_{S\Delta V}\big\|_2
\le \delta_{2s+\beta}\|x_S - z\|_2 + \sqrt{1+\delta_{s+\beta}}\,\|e'\|_2,
\end{aligned} \qquad (3.5)$$
where the last inequality follows from Lemma 3.1 together with $|S\Delta V|\le s+\beta$ and the fact that
$$\big|\mathrm{supp}(x_S - z)\cup(S\Delta V)\big| \le |\mathrm{supp}(z)\cup S\cup V| \le 2s+\beta.$$
Since $(S\setminus V)\cap(V\setminus S) = \emptyset$, we have
$$\begin{aligned}
\Phi_{S\Delta V}^2 &= \big\|\big((x_S - z) - A^T(y - Az)\big)_{S\Delta V}\big\|_2^2 \\
&= \big\|\big((x_S - z) - A^T(y - Az)\big)_{S\setminus V}\big\|_2^2 + \big\|\big((x_S - z) - A^T(y - Az)\big)_{V\setminus S}\big\|_2^2 \\
&= \big\|x_{S\setminus V} - \big(z + A^T(y - Az)\big)_{S\setminus V}\big\|_2^2 + \big\|\big(z + A^T(y - Az)\big)_{V\setminus S}\big\|_2^2 \\
&= \big\|x_{S\setminus V} - \big(z + A^T(y - Az)\big)_{S\setminus V}\big\|_2^2 + \Phi_{V\setminus S}^2.
\end{aligned} \qquad (3.6)$$
Let us discuss the following two cases separately.
Case 1. $\Phi_{V\setminus S} = 0$. Then $\Phi_{S\setminus V} = 0$ by (3.4). It follows from (3.2) that $\big(z + A^T(y - Az)\big)_{S\Delta V} = 0$. So,
$$\|x_{S\setminus V}\|_2 = \big\|x_{S\setminus V} - \big(z + A^T(y - Az)\big)_{S\Delta V}\big\|_2 = \Phi_{S\Delta V} \le \delta_{2s+\beta}\|x_S - z\|_2 + \sqrt{1+\delta_{s+\beta}}\,\|e'\|_2,$$
where the last inequality results from (3.5).
Case 2. $\Phi_{V\setminus S} > 0$. Let
$$\alpha := \frac{\big\|x_{S\setminus V} - \big(z + A^T(y - Az)\big)_{S\setminus V}\big\|_2}{\Phi_{V\setminus S}}. \qquad (3.7)$$
It follows from (3.6) that
$$\Phi_{S\Delta V} = \sqrt{\big\|x_{S\setminus V} - \big(z + A^T(y - Az)\big)_{S\setminus V}\big\|_2^2 + \Phi_{V\setminus S}^2} = \sqrt{1+\alpha^2}\,\Phi_{V\setminus S}. \qquad (3.8)$$
Combining (3.7) and (3.8) yields
$$\big\|x_{S\setminus V} - \big(z + A^T(y - Az)\big)_{S\setminus V}\big\|_2 = \alpha\,\Phi_{V\setminus S} = \frac{\alpha}{\sqrt{1+\alpha^2}}\,\Phi_{S\Delta V}.$$
Furthermore,
$$\begin{aligned}
\Phi_{S\setminus V}^2 &= \big\|\big(z + A^T(y - Az)\big)_{S\setminus V}\big\|_2^2 = \big\|x_{S\setminus V} - \big(x_S - z - A^T(y - Az)\big)_{S\setminus V}\big\|_2^2 \\
&= \|x_{S\setminus V}\|_2^2 - 2\big\langle x_{S\setminus V},\ x_{S\setminus V} - \big(z + A^T(y - Az)\big)_{S\setminus V}\big\rangle + \big\|x_{S\setminus V} - \big(z + A^T(y - Az)\big)_{S\setminus V}\big\|_2^2 \\
&\ge \|x_{S\setminus V}\|_2^2 - 2\|x_{S\setminus V}\|_2\,\big\|x_{S\setminus V} - \big(z + A^T(y - Az)\big)_{S\setminus V}\big\|_2 + \big\|x_{S\setminus V} - \big(z + A^T(y - Az)\big)_{S\setminus V}\big\|_2^2 \\
&= \|x_{S\setminus V}\|_2^2 - \frac{2\alpha}{\sqrt{1+\alpha^2}}\,\Phi_{S\Delta V}\,\|x_{S\setminus V}\|_2 + \frac{\alpha^2}{1+\alpha^2}\,\Phi_{S\Delta V}^2.
\end{aligned} \qquad (3.9)$$
On the other hand, it follows from (3.4) and (3.8) that
$$\Phi_{S\setminus V}^2 \le \Phi_{V\setminus S}^2 = \frac{1}{1+\alpha^2}\,\Phi_{S\Delta V}^2. \qquad (3.10)$$
Putting (3.9) and (3.10) together yields
$$\|x_{S\setminus V}\|_2^2 - \frac{2\alpha\,\Phi_{S\Delta V}}{\sqrt{1+\alpha^2}}\,\|x_{S\setminus V}\|_2 + \frac{(\alpha^2-1)\,\Phi_{S\Delta V}^2}{1+\alpha^2} \le 0.$$
Thus,
$$\|x_{S\setminus V}\|_2 \le \frac{1}{2}\left(\frac{2\alpha\,\Phi_{S\Delta V}}{\sqrt{1+\alpha^2}} + \sqrt{\frac{4\alpha^2\,\Phi_{S\Delta V}^2}{1+\alpha^2} - \frac{4(\alpha^2-1)\,\Phi_{S\Delta V}^2}{1+\alpha^2}}\right) = \frac{1+\alpha}{\sqrt{1+\alpha^2}}\,\Phi_{S\Delta V} \le \sqrt{2}\,\Phi_{S\Delta V}, \qquad (3.11)$$
where the last inequality results from
$$\left(\frac{1+\alpha}{\sqrt{1+\alpha^2}}\right)^2 = 1 + \frac{2\alpha}{1+\alpha^2} \le 2.$$
Taking (3.5) and (3.11) into account yields
$$\|x_{S\setminus V}\|_2 \le \sqrt{2}\,\Phi_{S\Delta V} \le \sqrt{2}\,\big(\delta_{2s+\beta}\|x_S - z\|_2 + \sqrt{1+\delta_{s+\beta}}\,\|e'\|_2\big).$$
Lemma 3.4. Let $v$ be a vector satisfying $\mathrm{supp}(v)\subseteq T$ and $|T|\le s$. For any vector $x\in\mathbb{R}^N$, let $y = Ax_S + e'$ with $e' = Ax_{\bar S} + e$ and $S = L_s(x)$. If $z^\ast$ satisfies
$$z^\ast = \arg\min_z\big\{\|y - Az\|_2 : \mathrm{supp}(z)\subseteq T\big\},$$
then
$$\|x_S - z^\ast\|_2 \le \frac{\|(x_S - v)_{\bar T}\|_2}{\sqrt{1-\delta_{2s}^2}} + \frac{\sqrt{1+\delta_s}\,\|e'\|_2}{1-\delta_{2s}}.$$
Proof. According to the definition of $z^\ast$, we have
$$\big[A^T(y - Az^\ast)\big]_T = 0.$$
Hence,
$$\big[A^T(Ax_S + e' - Az^\ast)\big]_T = \big[A^TA(x_S - z^\ast)\big]_T + (A^Te')_T = 0.$$
By the triangle inequality and Lemma 3.1, we get
$$\begin{aligned}
\|(x_S - z^\ast)_T\|_2 &= \big\|(x_S - z^\ast)_T - \big[A^TA(x_S - z^\ast)\big]_T - (A^Te')_T\big\|_2 \\
&\le \big\|\big[(I - A^TA)(x_S - z^\ast)\big]_T\big\|_2 + \|(A^Te')_T\|_2 \\
&\le \delta_{2s}\|x_S - z^\ast\|_2 + \|(A^Te')_T\|_2,
\end{aligned}$$
where the last inequality is due to $|\mathrm{supp}(x_S - z^\ast)\cup T|\le 2s$. This means that
$$\begin{aligned}
\|x_S - z^\ast\|_2^2 &= \|(x_S - z^\ast)_{\bar T}\|_2^2 + \|(x_S - z^\ast)_T\|_2^2 \\
&\le \|(x_S - z^\ast)_{\bar T}\|_2^2 + \big(\delta_{2s}\|x_S - z^\ast\|_2 + \|(A^Te')_T\|_2\big)^2 \\
&= \|(x_S - z^\ast)_{\bar T}\|_2^2 + \delta_{2s}^2\|x_S - z^\ast\|_2^2 + 2\delta_{2s}\|(A^Te')_T\|_2\,\|x_S - z^\ast\|_2 + \|(A^Te')_T\|_2^2,
\end{aligned}$$
i.e.,
$$(1-\delta_{2s}^2)\|x_S - z^\ast\|_2^2 - 2\delta_{2s}\|(A^Te')_T\|_2\,\|x_S - z^\ast\|_2 - \|(x_S - z^\ast)_{\bar T}\|_2^2 - \|(A^Te')_T\|_2^2 \le 0.$$
This is a quadratic inequality in $\|x_S - z^\ast\|_2$. Hence,
$$\begin{aligned}
\|x_S - z^\ast\|_2 &\le \frac{2\delta_{2s}\|(A^Te')_T\|_2 + \sqrt{4\delta_{2s}^2\|(A^Te')_T\|_2^2 + 4(1-\delta_{2s}^2)\big(\|(x_S - z^\ast)_{\bar T}\|_2^2 + \|(A^Te')_T\|_2^2\big)}}{2(1-\delta_{2s}^2)} \\
&= \frac{\delta_{2s}\|(A^Te')_T\|_2 + \sqrt{(1-\delta_{2s}^2)\|(x_S - z^\ast)_{\bar T}\|_2^2 + \|(A^Te')_T\|_2^2}}{1-\delta_{2s}^2} \\
&\le \frac{\delta_{2s}\|(A^Te')_T\|_2 + \sqrt{1-\delta_{2s}^2}\,\|(x_S - z^\ast)_{\bar T}\|_2 + \|(A^Te')_T\|_2}{1-\delta_{2s}^2} \\
&= \frac{\|(x_S - z^\ast)_{\bar T}\|_2}{\sqrt{1-\delta_{2s}^2}} + \frac{\|(A^Te')_T\|_2}{1-\delta_{2s}},
\end{aligned} \qquad (3.12)$$
where the second inequality comes from the fact that $\sqrt{a^2+b^2}\le a+b$ for all $a,b\ge0$. By Lemma 3.1 and $|T|\le s$, one has
$$\|(A^Te')_T\|_2 \le \sqrt{1+\delta_s}\,\|e'\|_2. \qquad (3.13)$$
Note that $z^\ast_{\bar T} = v_{\bar T} = 0$, so $(x_S - z^\ast)_{\bar T} = (x_S - v)_{\bar T}$. It follows from (3.12) and (3.13) that
$$\|x_S - z^\ast\|_2 \le \frac{\|(x_S - z^\ast)_{\bar T}\|_2}{\sqrt{1-\delta_{2s}^2}} + \frac{\sqrt{1+\delta_s}\,\|e'\|_2}{1-\delta_{2s}} = \frac{\|(x_S - v)_{\bar T}\|_2}{\sqrt{1-\delta_{2s}^2}} + \frac{\sqrt{1+\delta_s}\,\|e'\|_2}{1-\delta_{2s}}.$$
This completes the proof.
Error bounds and convergence analysis of CHTP are summarized as follows.
Theorem 3.1. Let $y = Ax + e$ be the measurements of the signal $x$, where $e$ is the measurement error. If
$$\delta_{2s+\beta} < \sqrt{\frac{1}{\sqrt{4+\sqrt{5}}+2}} \approx 0.4716, \qquad (3.14)$$
then the sequence $\{x^n\}$ generated by CHTP approximates $x$ with
$$\|x^n - x_S\|_2 \le \rho^n\|x^0 - x_S\|_2 + \tau\,\frac{1-\rho^n}{1-\rho}\,\|Ax_{\bar S} + e\|_2,$$
where
$$\rho := \sqrt{\frac{\delta_{2s+\beta}^2\big(2+(\sqrt{5}+1)\delta_{s+\beta}^2\big)}{(1-\delta_{s+\beta}^2)(1-\delta_{2s}^2)}} < 1, \qquad (3.15)$$
$$\tau := \sqrt{\frac{2+(\sqrt{5}+1)\delta_{s+\beta}^2}{(1-\delta_{s+\beta})(1-\delta_{2s}^2)}} + \frac{(\sqrt{5}+1)\sqrt{1+\delta_\beta}}{2(1-\delta_{s+\beta})\sqrt{1-\delta_{2s}^2}} + \frac{\sqrt{1+\delta_s}}{1-\delta_{2s}}. \qquad (3.16)$$
Proof. For the convenience of discussion, define $t^{n+1} := H_s(u^{n+1})$. Then,
$$\mathrm{supp}(t^{n+1}) \subseteq T^{n+1} = L_s(u^{n+1}) \subseteq U^{n+1}.$$
It is clear that $(t^{n+1})_{U^{n+1}} = t^{n+1}$. Hence,
$$\begin{aligned}
\big\|(x_S - t^{n+1})_{U^{n+1}}\big\|_2 &= \big\|x_{S\cap U^{n+1}} - t^{n+1}\big\|_2 = \big\|x_{S\cap U^{n+1}} - H_s(u^{n+1})\big\|_2 \\
&\le \frac{\sqrt{5}+1}{2}\,\big\|\big(x_{S\cap U^{n+1}} - u^{n+1}\big)_{(S\cap U^{n+1})\cup T^{n+1}}\big\|_2
\le \frac{\sqrt{5}+1}{2}\,\big\|(x_S - u^{n+1})_{U^{n+1}}\big\|_2,
\end{aligned} \qquad (3.17)$$
where the first inequality comes from Lemma 3.2, and the second inequality follows from the fact that $(S\cap U^{n+1})\cup T^{n+1} \subseteq U^{n+1}$.
Since $\mathrm{supp}(t^{n+1})\subseteq U^{n+1}$ and $\mathrm{supp}(u^{n+1})\subseteq U^{n+1}$, we have $(t^{n+1})_{\overline{U^{n+1}}} = (u^{n+1})_{\overline{U^{n+1}}} = 0$. Thus,
$$\begin{aligned}
\|x_S - t^{n+1}\|_2^2 &= \big\|(x_S - t^{n+1})_{\overline{U^{n+1}}}\big\|_2^2 + \big\|(x_S - t^{n+1})_{U^{n+1}}\big\|_2^2 \\
&= \big\|(x_S - u^{n+1})_{\overline{U^{n+1}}}\big\|_2^2 + \big\|(x_S - t^{n+1})_{U^{n+1}}\big\|_2^2 \\
&\le \big\|(x_S - u^{n+1})_{\overline{U^{n+1}}}\big\|_2^2 + \big(\xi\,\big\|(x_S - u^{n+1})_{U^{n+1}}\big\|_2\big)^2,
\end{aligned} \qquad (3.18)$$
where $\xi := (\sqrt{5}+1)/2$, and the last step is due to (3.17).
From Step 2 of CHTP, the following orthogonality condition holds:
$$\langle Au^{n+1} - y,\ Az\rangle = 0 \quad \text{for any } z \text{ with } \mathrm{supp}(z)\subseteq U^{n+1},$$
where $y = Ax_S + e'$ and $e' = Ax_{\bar S} + e$. Hence, for any $z$ satisfying $\mathrm{supp}(z)\subseteq U^{n+1}$, we get
$$\begin{aligned}
\langle u^{n+1} - x_S,\ A^TAz\rangle &= \langle Au^{n+1} - Ax_S,\ Az\rangle = \langle Au^{n+1} - (y - e'),\ Az\rangle \\
&= \langle Au^{n+1} - y,\ Az\rangle + \langle e',\ Az\rangle = \langle e',\ Az\rangle = \langle A^Te',\ z\rangle.
\end{aligned} \qquad (3.19)$$
Furthermore,
$$\begin{aligned}
\big\|(x_S - u^{n+1})_{U^{n+1}}\big\|_2^2 &= \big\langle (u^{n+1} - x_S)_{U^{n+1}},\ (u^{n+1} - x_S)_{U^{n+1}}\big\rangle \\
&= \big\langle (u^{n+1} - x_S)_{U^{n+1}},\ \big[(I - A^TA)(u^{n+1} - x_S)\big]_{U^{n+1}}\big\rangle \\
&\quad + \big\langle (u^{n+1} - x_S)_{U^{n+1}},\ \big[A^TA(u^{n+1} - x_S)\big]_{U^{n+1}}\big\rangle.
\end{aligned} \qquad (3.20)$$
Lemma 3.1 and $|\mathrm{supp}(u^{n+1} - x_S)\cup U^{n+1}|\le s+\beta$ lead to
$$\big\langle (u^{n+1} - x_S)_{U^{n+1}},\ \big[(I - A^TA)(u^{n+1} - x_S)\big]_{U^{n+1}}\big\rangle \le \delta_{s+\beta}\,\|x_S - u^{n+1}\|_2\,\big\|(x_S - u^{n+1})_{U^{n+1}}\big\|_2. \qquad (3.21)$$
Note that
$$\begin{aligned}
\big\langle (u^{n+1} - x_S)_{U^{n+1}},\ \big[A^TA(u^{n+1} - x_S)\big]_{U^{n+1}}\big\rangle
&= \big\langle (u^{n+1} - x_S)_{U^{n+1}},\ A^TA(u^{n+1} - x_S)\big\rangle \\
&= \big\langle u^{n+1} - x_S,\ A^TA(u^{n+1} - x_S)_{U^{n+1}}\big\rangle \\
&= \big\langle A^Te',\ (u^{n+1} - x_S)_{U^{n+1}}\big\rangle \\
&= \big\langle (A^Te')_{U^{n+1}},\ (u^{n+1} - x_S)_{U^{n+1}}\big\rangle \\
&\le \big\|(A^Te')_{U^{n+1}}\big\|_2\,\big\|(u^{n+1} - x_S)_{U^{n+1}}\big\|_2,
\end{aligned} \qquad (3.22)$$
where the third equality comes from (3.19) since $\mathrm{supp}\big((u^{n+1} - x_S)_{U^{n+1}}\big)\subseteq U^{n+1}$. Putting (3.20)–(3.22) together yields
$$\big\|(x_S - u^{n+1})_{U^{n+1}}\big\|_2 \le \delta_{s+\beta}\,\|x_S - u^{n+1}\|_2 + \big\|(A^Te')_{U^{n+1}}\big\|_2. \qquad (3.23)$$
Therefore,
$$\begin{aligned}
\|x_S - u^{n+1}\|_2^2 &= \big\|(x_S - u^{n+1})_{\overline{U^{n+1}}}\big\|_2^2 + \big\|(x_S - u^{n+1})_{U^{n+1}}\big\|_2^2 \\
&\le \big\|(x_S - u^{n+1})_{\overline{U^{n+1}}}\big\|_2^2 + \delta_{s+\beta}^2\|x_S - u^{n+1}\|_2^2 + 2\delta_{s+\beta}\big\|(A^Te')_{U^{n+1}}\big\|_2\,\|x_S - u^{n+1}\|_2 + \big\|(A^Te')_{U^{n+1}}\big\|_2^2,
\end{aligned}$$
i.e.,
$$(1-\delta_{s+\beta}^2)\|x_S - u^{n+1}\|_2^2 - 2\delta_{s+\beta}\big\|(A^Te')_{U^{n+1}}\big\|_2\,\|x_S - u^{n+1}\|_2 - \Big(\big\|(x_S - u^{n+1})_{\overline{U^{n+1}}}\big\|_2^2 + \big\|(A^Te')_{U^{n+1}}\big\|_2^2\Big) \le 0.$$
This is a quadratic inequality in $\|x_S - u^{n+1}\|_2$. So,
$$\begin{aligned}
\|x_S - u^{n+1}\|_2 &\le \frac{2\delta_{s+\beta}\big\|(A^Te')_{U^{n+1}}\big\|_2 + \sqrt{4(1-\delta_{s+\beta}^2)\big\|(x_S - u^{n+1})_{\overline{U^{n+1}}}\big\|_2^2 + 4\big\|(A^Te')_{U^{n+1}}\big\|_2^2}}{2(1-\delta_{s+\beta}^2)} \\
&\le \frac{\sqrt{1-\delta_{s+\beta}^2}\,\big\|(x_S - u^{n+1})_{\overline{U^{n+1}}}\big\|_2 + (1+\delta_{s+\beta})\big\|(A^Te')_{U^{n+1}}\big\|_2}{1-\delta_{s+\beta}^2} \\
&= \frac{\big\|(x_S - u^{n+1})_{\overline{U^{n+1}}}\big\|_2}{\sqrt{1-\delta_{s+\beta}^2}} + \frac{\big\|(A^Te')_{U^{n+1}}\big\|_2}{1-\delta_{s+\beta}},
\end{aligned} \qquad (3.24)$$
where the second inequality comes from the fact that $\sqrt{a^2+b^2}\le a+b$ for all $a,b\ge0$.
Due to $\mathrm{supp}(u^{n+1})\subseteq U^{n+1}$, we obtain $(u^{n+1})_{\overline{U^{n+1}}} = 0$. According to Lemma 3.3 (applied with $z = x^n$, so that $V = U^{n+1}$), we have
$$\big\|(x_S - u^{n+1})_{\overline{U^{n+1}}}\big\|_2 = \big\|x_{S\setminus U^{n+1}}\big\|_2 \le \sqrt{2}\,\delta_{2s+\beta}\|x_S - x^n\|_2 + \sqrt{2(1+\delta_{s+\beta})}\,\|e'\|_2. \qquad (3.25)$$
Combining (3.23) and (3.24) yields
$$\begin{aligned}
\big\|(x_S - u^{n+1})_{U^{n+1}}\big\|_2 &\le \frac{\delta_{s+\beta}}{\sqrt{1-\delta_{s+\beta}^2}}\,\big\|(x_S - u^{n+1})_{\overline{U^{n+1}}}\big\|_2 + \frac{\delta_{s+\beta}}{1-\delta_{s+\beta}}\,\big\|(A^Te')_{U^{n+1}}\big\|_2 + \big\|(A^Te')_{U^{n+1}}\big\|_2 \\
&= \frac{\delta_{s+\beta}}{\sqrt{1-\delta_{s+\beta}^2}}\,\big\|(x_S - u^{n+1})_{\overline{U^{n+1}}}\big\|_2 + \frac{1}{1-\delta_{s+\beta}}\,\big\|(A^Te')_{U^{n+1}}\big\|_2.
\end{aligned} \qquad (3.26)$$
Hence, it follows from (3.18) and (3.26) that
$$\begin{aligned}
\|x_S - t^{n+1}\|_2^2 &\le \big\|(x_S - u^{n+1})_{\overline{U^{n+1}}}\big\|_2^2 + \left(\frac{\xi\delta_{s+\beta}}{\sqrt{1-\delta_{s+\beta}^2}}\,\big\|(x_S - u^{n+1})_{\overline{U^{n+1}}}\big\|_2 + \frac{\xi}{1-\delta_{s+\beta}}\,\big\|(A^Te')_{U^{n+1}}\big\|_2\right)^2 \\
&\le \left(\sqrt{\frac{1+\xi\delta_{s+\beta}^2}{1-\delta_{s+\beta}^2}}\,\big\|(x_S - u^{n+1})_{\overline{U^{n+1}}}\big\|_2 + \frac{\xi}{1-\delta_{s+\beta}}\,\big\|(A^Te')_{U^{n+1}}\big\|_2\right)^2,
\end{aligned}$$
where the last step is due to the fact that $a^2 + (b+c)^2 \le (\sqrt{a^2+b^2}+c)^2$ for all $a,b,c\ge0$, and $\xi^2 - 1 = \xi$ since $\xi = (\sqrt{5}+1)/2$. Taking the square root of both sides of this inequality and using (3.25) yields
$$\begin{aligned}
\|x_S - t^{n+1}\|_2 &\le \sqrt{\frac{1+\xi\delta_{s+\beta}^2}{1-\delta_{s+\beta}^2}}\,\big\|(x_S - u^{n+1})_{\overline{U^{n+1}}}\big\|_2 + \frac{\xi}{1-\delta_{s+\beta}}\,\big\|(A^Te')_{U^{n+1}}\big\|_2 \\
&\le \sqrt{\frac{1+\xi\delta_{s+\beta}^2}{1-\delta_{s+\beta}^2}}\,\Big(\sqrt{2}\,\delta_{2s+\beta}\|x_S - x^n\|_2 + \sqrt{2(1+\delta_{s+\beta})}\,\|e'\|_2\Big) + \frac{\xi}{1-\delta_{s+\beta}}\,\big\|(A^Te')_{U^{n+1}}\big\|_2 \\
&\le \sqrt{\frac{2\delta_{2s+\beta}^2(1+\xi\delta_{s+\beta}^2)}{1-\delta_{s+\beta}^2}}\,\|x_S - x^n\|_2 + \left(\sqrt{\frac{2(1+\xi\delta_{s+\beta}^2)}{1-\delta_{s+\beta}}} + \frac{\xi\sqrt{1+\delta_\beta}}{1-\delta_{s+\beta}}\right)\|e'\|_2,
\end{aligned} \qquad (3.27)$$
where the third inequality follows from the fact that $\|(A^Te')_{U^{n+1}}\|_2 \le \sqrt{1+\delta_\beta}\,\|e'\|_2$ by Lemma 3.1.
Note that $\mathrm{supp}(t^{n+1})\subseteq T^{n+1}$, $|T^{n+1}|\le s$, and
$$x^{n+1} = \arg\min_z\big\{\|y - Az\|_2 : \mathrm{supp}(z)\subseteq T^{n+1}\big\}.$$
According to Lemma 3.4 (applied with $v = t^{n+1}$ and $T = T^{n+1}$), we have
$$\|x^{n+1} - x_S\|_2 \le \frac{1}{\sqrt{1-\delta_{2s}^2}}\,\big\|(x_S - t^{n+1})_{\overline{T^{n+1}}}\big\|_2 + \frac{\sqrt{1+\delta_s}}{1-\delta_{2s}}\,\|e'\|_2 \le \frac{1}{\sqrt{1-\delta_{2s}^2}}\,\|x_S - t^{n+1}\|_2 + \frac{\sqrt{1+\delta_s}}{1-\delta_{2s}}\,\|e'\|_2.$$
Combining this with (3.27), we obtain
$$\begin{aligned}
\|x_S - x^{n+1}\|_2 &\le \sqrt{\frac{2\delta_{2s+\beta}^2(1+\xi\delta_{s+\beta}^2)}{(1-\delta_{s+\beta}^2)(1-\delta_{2s}^2)}}\,\|x_S - x^n\|_2 \\
&\quad + \left(\sqrt{\frac{2(1+\xi\delta_{s+\beta}^2)}{(1-\delta_{s+\beta})(1-\delta_{2s}^2)}} + \frac{\xi\sqrt{1+\delta_\beta}}{(1-\delta_{s+\beta})\sqrt{1-\delta_{2s}^2}} + \frac{\sqrt{1+\delta_s}}{1-\delta_{2s}}\right)\|e'\|_2 \\
&= \rho\,\|x_S - x^n\|_2 + \tau\,\|e'\|_2,
\end{aligned}$$
where $\rho$ and $\tau$ are given in (3.15) and (3.16), respectively. Hence,
$$\|x_S - x^{n+1}\|_2 \le \rho\|x_S - x^n\|_2 + \tau\|e'\|_2 \le \rho\big(\rho\|x_S - x^{n-1}\|_2 + \tau\|e'\|_2\big) + \tau\|e'\|_2 \le \cdots \le \rho^{n+1}\|x_S - x^0\|_2 + \tau\,\frac{1-\rho^{n+1}}{1-\rho}\,\|e'\|_2.$$
Since $\delta_{s+\beta}\le\delta_{2s+\beta}$ and $\delta_{2s}\le\delta_{2s+\beta}$, it is easy to see that
$$\rho = \sqrt{\frac{\delta_{2s+\beta}^2\big(2+(\sqrt{5}+1)\delta_{s+\beta}^2\big)}{(1-\delta_{s+\beta}^2)(1-\delta_{2s}^2)}} \le \sqrt{\frac{\delta_{2s+\beta}^2\big(2+(\sqrt{5}+1)\delta_{2s+\beta}^2\big)}{(1-\delta_{2s+\beta}^2)(1-\delta_{2s+\beta}^2)}}.$$
To ensure $\rho<1$, it suffices to make the right-hand side above less than 1. Writing $t := \delta_{2s+\beta}^2$, this amounts to $t\big(2+(\sqrt{5}+1)t\big) < (1-t)^2$, i.e., $\sqrt{5}\,t^2 + 4t - 1 < 0$, whose positive root is $t = 1/\big(\sqrt{4+\sqrt{5}}+2\big)$. This is exactly the RIP bound
$$\delta_{2s+\beta} < \sqrt{\frac{1}{\sqrt{4+\sqrt{5}}+2}}.$$
This completes the proof.
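To get a feel for condition (3.14), one can evaluate $\rho$ and $\tau$ numerically. The sketch below collapses all the restricted isometry constants in (3.15) and (3.16) into a single value $\delta$; this is our own simplification for illustration only (the theorem distinguishes $\delta_{s+\beta}$, $\delta_{2s}$, $\delta_\beta$, $\delta_s$ and $\delta_{2s+\beta}$). Under it, $\rho<1$ exactly when $\delta$ lies below the threshold in (3.14).

```python
import numpy as np

def rho_tau(delta):
    """Evaluate (3.15) and (3.16) with every RIC collapsed to one delta
    (illustrative simplification only)."""
    c = np.sqrt(5) + 1
    rho = np.sqrt(delta**2 * (2 + c * delta**2) / (1 - delta**2)**2)
    tau = (np.sqrt((2 + c * delta**2) / ((1 - delta) * (1 - delta**2)))
           + c * np.sqrt(1 + delta) / (2 * (1 - delta) * np.sqrt(1 - delta**2))
           + np.sqrt(1 + delta) / (1 - delta))
    return rho, tau

threshold = np.sqrt(1 / (np.sqrt(4 + np.sqrt(5)) + 2))  # ~0.4716, the bound in (3.14)
print(rho_tau(0.99 * threshold)[0])  # < 1: contraction just below the bound
print(rho_tau(1.01 * threshold)[0])  # >= 1: the guarantee fails just above it
```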
The corresponding result for the noiseless case can be obtained immediately.
Corollary 3.1. Let $y = Ax$ be the measurements of an $s$-sparse signal $x$. If
$$\delta_{2s+\beta} < \sqrt{\frac{1}{\sqrt{4+\sqrt{5}}+2}},$$
then the iterative sequence $\{x^n\}$ generated by CHTP approximates $x$ with
$$\|x^n - x\|_2 \le \rho^n\|x^0 - x\|_2,$$
where $\rho$ is given in (3.15).
In the noiseless setting, an even stronger statement holds: CHTP identifies an $s$-sparse signal exactly within a finite number of iterations.
Theorem 3.2. If
$$\delta_{2s+\beta} < \sqrt{\frac{1}{\sqrt{4+\sqrt{5}}+2}},$$
then any $s$-sparse vector $x\in\mathbb{R}^N$ can be recovered by CHTP with $y = Ax$ in at most
$$n = \left[\frac{\ln\big(\sqrt{2}\,\delta_{2s+\beta}\,\|x^0 - x\|_2/\theta\big)}{\ln(1/\rho)}\right] + 1$$
iterations, where $\rho$ is given by (3.15), $\theta := \min_{i\in S}|x_i|$ and $S := \mathrm{supp}(x)$.
Proof. It is sufficient to show that
$$\big|\big(x^n + A^TA(x - x^n)\big)_k\big| > \big|\big(x^n + A^TA(x - x^n)\big)_t\big|, \quad \forall k\in S,\ t\in\bar S. \qquad (3.28)$$
Indeed, if (3.28) holds, then $S\subseteq U^{n+1}$ by the definition of $U^{n+1}$ (recall $\beta\ge s$), which leads to $u^{n+1} = x$. Furthermore, $T^{n+1} = S$ and $x^{n+1} = x$ follow directly from the CHTP algorithm.
Next, we aim to prove (3.28). According to Lemma 3.1 and Corollary 3.1, we obtain, for any $k\in S$ and $t\in\bar S$,
$$\begin{aligned}
\big|\big((I - A^TA)(x^n - x)\big)_k\big| + \big|\big((I - A^TA)(x^n - x)\big)_t\big|
&\le \sqrt{2}\,\big\|\big((I - A^TA)(x^n - x)\big)_{\{k,t\}}\big\|_2 \\
&\le \sqrt{2}\,\delta_{2s+1}\|x^n - x\|_2 \le \sqrt{2}\,\delta_{2s+\beta}\|x^n - x\|_2 \\
&\le \sqrt{2}\,\delta_{2s+\beta}\,\rho^n\|x^0 - x\|_2.
\end{aligned}$$
It follows that
$$\begin{aligned}
\big|\big(x^n + A^TA(x - x^n)\big)_k\big| &= \big|x_k + \big((I - A^TA)(x^n - x)\big)_k\big| \\
&\ge \theta - \big|\big((I - A^TA)(x^n - x)\big)_k\big| \\
&\ge \theta - \sqrt{2}\,\delta_{2s+\beta}\,\rho^n\|x^0 - x\|_2 + \big|\big((I - A^TA)(x^n - x)\big)_t\big| \\
&= \theta - \sqrt{2}\,\delta_{2s+\beta}\,\rho^n\|x^0 - x\|_2 + \big|\big(x^n + A^TA(x - x^n)\big)_t\big|,
\end{aligned} \qquad (3.29)$$
where $\theta = \min_{i\in S}|x_i|$, and the last step is due to the fact that $x_t = 0$ for $t\in\bar S$. Taking
$$n = \left[\frac{\ln\big(\sqrt{2}\,\delta_{2s+\beta}\,\|x^0 - x\|_2/\theta\big)}{\ln(1/\rho)}\right] + 1,$$
we get $\sqrt{2}\,\delta_{2s+\beta}\,\rho^n\|x^0 - x\|_2 < \theta$. Combining this with (3.29), we conclude that (3.28) holds. This completes the proof.
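As a concrete illustration (with made-up numbers, not values from the paper), the iteration bound of Theorem 3.2 can be evaluated directly, reusing the `rho_tau` helper above together with its single-$\delta$ simplification:

```python
import numpy as np

def iteration_bound(delta, dist0, theta):
    """Bound of Theorem 3.2: [ln(sqrt(2)*delta*||x^0 - x||_2/theta)/ln(1/rho)] + 1."""
    rho, _ = rho_tau(delta)
    return int(np.log(np.sqrt(2) * delta * dist0 / theta) / np.log(1 / rho)) + 1

# e.g. delta = 0.4, ||x^0 - x||_2 = 10, smallest nonzero magnitude theta = 0.1
print(iteration_bound(0.4, 10.0, 0.1))  # prints 15
```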
Now, let us discuss the stability of CHTP. Recall first that the error of the best $s$-term approximation of a vector $x$ is defined as
$$\sigma_s(x)_p := \inf_v\big\{\|x - v\|_p : \|v\|_0\le s\big\}.$$
In particular, according to [23, Theorem 2.5], we have
$$\sigma_s(z)_2 \le \frac{1}{2\sqrt{s}}\,\|z\|_1 \quad \text{for all } z. \qquad (3.30)$$
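Inequality (3.30) is also easy to check numerically: $\sigma_s(z)_2$ is attained by keeping the $s$ largest-magnitude entries of $z$, so the error is the $\ell_2$ norm of the remaining tail. A quick check (problem sizes are arbitrary choices of ours):

```python
import numpy as np

rng = np.random.default_rng(2)
N, s = 50, 5
for _ in range(10_000):
    z = rng.standard_normal(N)
    tail = np.sort(np.abs(z))[:-s]  # what is left after the best s-term approximation
    assert np.linalg.norm(tail) <= np.linalg.norm(z, 1) / (2 * np.sqrt(s))
```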
Theorem 3.3. Let
$$\delta_{2s+\beta} < \sqrt{\frac{1}{\sqrt{4+\sqrt{5}}+2}}.$$
Then, for all $x\in\mathbb{R}^N$ and $e\in\mathbb{R}^m$, the sequence $\{x^n\}$ generated by CHTP with $y = Ax + e$ satisfies
$$\|x - x^n\|_2 \le \left(\frac{1}{2\sqrt{t}} + \tau\,\frac{1-\rho^n}{1-\rho}\,\sqrt{\frac{1+\delta_t}{t}}\right)\sigma_k(x)_1 + \tau\,\frac{1-\rho^n}{1-\rho}\,\|e\|_2 + \rho^n\|x_S - x^0\|_2,$$
where $\rho$ is given by (3.15), and
$$k := \begin{cases} s/2 & \text{if } s \text{ is even}, \\ [s/2]+1 & \text{if } s \text{ is odd}, \end{cases} \qquad t := s-k.$$
Moreover, every cluster point $x^\ast$ of the sequence $\{x^n\}$ satisfies
$$\|x - x^\ast\|_2 \le \left(\frac{1}{2\sqrt{t}} + \frac{\tau}{1-\rho}\,\sqrt{\frac{1+\delta_t}{t}}\right)\sigma_k(x)_1 + \frac{\tau}{1-\rho}\,\|e\|_2.$$
Proof. Let $T_1 := L_k(x)$, $T_2 := L_t(x_{\bar T_1})$, and
$$T_3 := L_t\big(x_{\overline{T_1\cup T_2}}\big),\ \ldots,\ T_{l-1} := L_t\big(x_{\overline{T_1\cup T_2\cup\cdots\cup T_{l-2}}}\big),\quad T_l := L_r\big(x_{\overline{T_1\cup T_2\cup\cdots\cup T_{l-1}}}\big),$$
where $l, r$ satisfy $(l-2)t + k + r = N$ and $r\le t$. According to the structure of the sets $T_j$, it follows from [23, Lemma 6.10] that
$$\|x_{T_j}\|_2 \le \frac{1}{\sqrt{t}}\,\|x_{T_{j-1}}\|_1, \quad j = 3,\ldots,l. \qquad (3.31)$$
Taking into account Theorem 3.1 (note that $T_1\cup T_2 = L_s(x) = S$ since $k+t = s$, so $x_{\bar S} = x_{T_3} + \cdots + x_{T_l}$), one has
$$\|x_S - x^n\|_2 \le \rho^n\|x_S - x^0\|_2 + \tau\,\frac{1-\rho^n}{1-\rho}\,\|Ax_{\bar S} + e\|_2 \le \rho^n\|x_S - x^0\|_2 + \tau\,\frac{1-\rho^n}{1-\rho}\big(\|Ax_{T_3}\|_2 + \|Ax_{T_4}\|_2 + \cdots + \|Ax_{T_l}\|_2 + \|e\|_2\big).$$
It follows from (3.1) and (3.31) that
$$\sum_{i=3}^{l}\|Ax_{T_i}\|_2 \le \sqrt{1+\delta_t}\,\sum_{i=3}^{l}\|x_{T_i}\|_2 \le \frac{\sqrt{1+\delta_t}}{\sqrt{t}}\,\sum_{i=2}^{l-1}\|x_{T_i}\|_1 \le \frac{\sqrt{1+\delta_t}}{\sqrt{t}}\,\|x_{\bar T_1}\|_1.$$
Hence,
$$\|x_S - x^n\|_2 \le \rho^n\|x_S - x^0\|_2 + \tau\,\frac{1-\rho^n}{1-\rho}\left(\sqrt{\frac{1+\delta_t}{t}}\,\|x_{\bar T_1}\|_1 + \|e\|_2\right) = \rho^n\|x_S - x^0\|_2 + \tau\,\frac{1-\rho^n}{1-\rho}\left(\sqrt{\frac{1+\delta_t}{t}}\,\sigma_k(x)_1 + \|e\|_2\right). \qquad (3.32)$$
Using (3.30), we get
$$\|x_{\bar S}\|_2 = \sigma_t(x_{\bar T_1})_2 \le \frac{1}{2\sqrt{t}}\,\|x_{\bar T_1}\|_1 = \frac{1}{2\sqrt{t}}\,\sigma_k(x)_1. \qquad (3.33)$$
Putting (3.32) and (3.33) together implies
$$\begin{aligned}
\|x - x^n\|_2 &= \|x_{\bar S} + x_S - x^n\|_2 \le \|x_{\bar S}\|_2 + \|x_S - x^n\|_2 \\
&\le \frac{1}{2\sqrt{t}}\,\sigma_k(x)_1 + \rho^n\|x_S - x^0\|_2 + \tau\,\frac{1-\rho^n}{1-\rho}\left(\sqrt{\frac{1+\delta_t}{t}}\,\sigma_k(x)_1 + \|e\|_2\right) \\
&= \left(\frac{1}{2\sqrt{t}} + \tau\,\frac{1-\rho^n}{1-\rho}\,\sqrt{\frac{1+\delta_t}{t}}\right)\sigma_k(x)_1 + \tau\,\frac{1-\rho^n}{1-\rho}\,\|e\|_2 + \rho^n\|x_S - x^0\|_2.
\end{aligned}$$
Since $\rho<1$, letting $n\to\infty$ in the above inequality yields
$$\|x - x^\ast\|_2 \le \left(\frac{1}{2\sqrt{t}} + \frac{\tau}{1-\rho}\,\sqrt{\frac{1+\delta_t}{t}}\right)\sigma_k(x)_1 + \frac{\tau}{1-\rho}\,\|e\|_2.$$
This completes the proof.
In this section, numerical experiments are carried out to test the effectiveness of CHTP. All experiments were conducted on a personal computer with an Intel(R) Core(TM) i5-7200U CPU @ 2.50 GHz and 4 GB of memory. The $s$-sparse standard Gaussian vector $x^\ast\in\mathbb{R}^N$ and the standard Gaussian matrix $A\in\mathbb{R}^{m\times N}$ with $(m,N) = (400,800)$ are randomly generated, and the positions of the nonzero elements of $x^\ast$ are uniformly distributed. According to [23, Theorem 9.27], if $0<\eta,\varepsilon<1$ and
$$m \ge 2\eta^{-2}\big[s\big(1+\ln(N/s)\big) + \ln(2\varepsilon^{-1})\big],$$
then the restricted isometry constant $\delta_s$ of $\frac{1}{\sqrt{m}}A$ satisfies
$$\delta_s \le \big[2 + \eta\,g(N,s)\big]\cdot\eta\,g(N,s)$$
with probability at least $1-\varepsilon$, where
$$g(N,s) := 1 + \big[2\big(1+\ln(N/s)\big)\big]^{-1/2}.$$
Hence, to satisfy the restricted isometry bound (3.14) with high probability, we replace (1.1) throughout the numerical experiments with the model
$$\min_x\Big\{\tfrac{1}{2}\|\tilde y - \tilde A x\|_2^2 : \|x\|_0\le s\Big\},$$
where $\tilde A := A/\sqrt{m}$ and $\tilde y := y/\sqrt{m}$. Moreover, the measurement vector $y$ is given by $y = Ax^\ast$ in the noiseless setting and by $y = Ax^\ast + 0.005e$ in the noisy setting, where $e\in\mathbb{R}^m$ is a standard Gaussian random vector. Unless otherwise stated, all algorithms take $x^0 = 0$ as the initial point and use the condition
$$\|x^n - x^\ast\|_2/\|x^\ast\|_2 \le 10^{-3}$$
as the criterion for successful reconstruction. OMP is performed with $s$ iterations, while the other algorithms are performed with at most 50 iterations. For each sparsity level $s$, 100 random problem instances are tested to investigate the recovery abilities of all algorithms, as shown in Figures 1–3.
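One trial of this setup can be sketched as follows, reusing the `chtp` function from the sketch in Section 2. The generic tolerance-based stopping rule inside `chtp` stands in for the criteria above, and the default parameter values are illustrative choices of ours.

```python
import numpy as np

def one_trial(m=400, N=800, s=150, noise=0.0, seed=None):
    """Generate a random instance as described above and test CHTP with beta = [1.1 s]."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, N))
    x_true = np.zeros(N)
    x_true[rng.choice(N, size=s, replace=False)] = rng.standard_normal(s)
    y = A @ x_true + noise * rng.standard_normal(m)   # noise = 0.005 in the noisy setting
    A_s, y_s = A / np.sqrt(m), y / np.sqrt(m)         # scaled model with A~ and y~
    x_hat = chtp(A_s, y_s, s, beta=int(round(1.1 * s)), max_iter=50)
    return np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true) <= 1e-3

success_rate = np.mean([one_trial(seed=i) for i in range(100)])
print(success_rate)  # fraction of the 100 random instances recovered successfully
```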
The first experiment illustrates the effect of the parameter $\beta$ on the reconstruction ability of CHTP. We pick $\beta\in\{s, [1.1s], [1.2s], [1.3s], 2s\}$ and $s\in\{120, 123, \ldots, 240\}$, where $[\cdot]$ denotes the rounding function. Numerical results with accurate and inaccurate measurements are shown in Figures 1(a) and 1(b), respectively. Since the results in Figure 1(b) are similar to those in Figure 1(a), CHTP is stable under weak noise. It is also observed that the recovery ability of CHTP is sensitive to the parameter $\beta$: it deteriorates as $\beta$ increases from $[1.1s]$ to $[1.3s]$. In particular, CHTP reduces to HTP when $\beta = s$, and CHTP with $\beta = [1.1s]$, $[1.2s]$ and $[1.3s]$ performs better than HTP in both the noiseless and noisy settings. In addition, it should be noted that CHTP with $\beta = 2s$ performs better than HTP for $s\le 190$, but its success rate drops abruptly to zero as the sparsity level $s$ approaches $m/2$; this phenomenon is similar to the behavior of SP in Figure 2. These simulations indicate that the performance of the hard thresholding operator can be improved by first introducing a compressive step.
To further investigate the effectiveness of CHTP, we compare it with four other mainstream algorithms: HTP, OMP, CoSaMP and SP. The corresponding results are displayed in Figures 2 and 3. Based on the above discussion, the parameter $\beta$ is set to $[1.1s]$ in this experiment. The sparsity level $s$ ranges from 120 to 258 with stepsize 3 in Figure 2(a), and from 1 to 298 with stepsize 3 in Figure 2(b). The recovery abilities of the different algorithms are compared in Figure 2(a) with accurate measurements and in Figure 2(b) with inaccurate measurements; Figure 2(c) is an enlarged view of part of Figure 2(b). Figure 2 indicates that CHTP is competitive with HTP, OMP and SP in both the noiseless and noisy scenarios, and its recovery ability remarkably exceeds that of CoSaMP. Figure 2(b) also shows that, for small sparsity levels ($s\le 10$), the reconstruction abilities of all algorithms are significantly weakened in the presence of noise.
The next question we examine is the average number of iterations and the average recovery runtime of the different algorithms with accurate measurements. Since the number of iterations of OMP is fixed at $s$, Figure 3 compares all algorithms except OMP; only $s\le 120$ is considered there, so that the success rates of all algorithms are 100%. Figure 3 shows that the larger $s$ is, the more iterations and computational time all algorithms require on average. From Figure 3(b), we see that the average recovery runtimes of all algorithms are close to each other for $s<50$. Although CHTP needs more iterations than the other algorithms in Figure 3(a), it consumes less time than CoSaMP and SP for $s\ge 50$ in Figure 3(b). From this point of view, CHTP is competitive with the other algorithms mentioned above.
In [21,27], the authors point out that the hard thresholding operator is independent of the objective function, and that this inherent drawback may cause the objective function to increase during the iterations. They further suggest that the hard thresholding operator should be applied to a compressible vector to overcome this drawback. Motivated by this idea, we proposed the CHTP algorithm in this paper, which reduces to the standard HTP when $\beta = s$. To minimize the negative effect of the hard thresholding operator on the objective function, the orthogonal projection is used twice in each CHTP iteration. The convergence analysis of CHTP was established by utilizing the restricted isometry property of the sensing matrix. Numerical experiments indicated that CHTP performs best with $\beta = [1.1s]$, and simulations showed that CHTP is competitive with other popular algorithms such as HTP, OMP and SP in both sparse signal reconstruction ability and average recovery runtime.
The second author's work is supported by the National Natural Science Foundation of China (11771255), Young Innovation Teams of Shandong Province (2019KJI013) and Shandong Province Natural Science Foundation (ZR2021MA066). The fourth author's work is supported by the Natural Science Foundation of Henan Province (222300420520) and Key Scientific Research Projects of Higher Education of Henan Province (22A110020).
All authors declare no conflicts of interest in this paper.
[1] E. J. Candes, T. Tao, Decoding by linear programming, IEEE T. Inform. Theory, 51 (2005), 4203–4215. http://dx.doi.org/10.1109/TIT.2005.858979
[2] D. L. Donoho, Y. Tsaig, Fast solution of $\ell_1$-norm minimization problems when the solution may be sparse, IEEE T. Inform. Theory, 54 (2008), 4789–4812. http://dx.doi.org/10.1109/TIT.2008.929958
[3] H. Ge, W. Chen, M. K. Ng, New RIP bounds for recovery of sparse signals with partial support information via weighted $\ell_p$-minimization, IEEE T. Inform. Theory, 66 (2020), 3914–3928. http://dx.doi.org/10.1109/TIT.2020.2966436
[4] A. H. Wan, Uniform RIP conditions for recovery of sparse signals by $\ell_p$ $(0<p\le1)$ minimization, IEEE T. Signal Proces., 68 (2020), 5379–5394. http://dx.doi.org/10.1109/TSP.2020.3022822
[5] H. Cai, J. F. Cai, T. Wang, G. Yin, Accelerated structured alternating projections for robust spectrally sparse signal recovery, IEEE T. Signal Proces., 69 (2021), 809–821. http://dx.doi.org/10.1109/TSP.2021.3049618
[6] C. H. Lee, B. D. Rao, H. Garudadri, A sparse conjugate gradient adaptive filter, IEEE Signal Proces. Let., 27 (2020), 1000–1004. http://dx.doi.org/10.1109/LSP.2020.3000459
[7] T. Cai, L. Wang, Orthogonal matching pursuit for sparse signal recovery with noise, IEEE T. Inform. Theory, 57 (2011), 4680–4688. http://dx.doi.org/10.1109/TIT.2011.2146090
[8] J. A. Tropp, A. C. Gilbert, Signal recovery from random measurements via orthogonal matching pursuit, IEEE T. Inform. Theory, 53 (2007), 4655–4666. http://dx.doi.org/10.1109/TIT.2007.909108
[9] J. Wen, R. Zhang, W. Yu, Signal-dependent performance analysis of orthogonal matching pursuit for exact sparse recovery, IEEE T. Signal Proces., 68 (2020), 5031–5046. http://dx.doi.org/10.1109/TSP.2020.3016571
[10] D. Needell, J. A. Tropp, CoSaMP: Iterative signal recovery from incomplete and inaccurate samples, Appl. Comput. Harmon. Anal., 26 (2009), 301–321. http://dx.doi.org/10.1016/j.acha.2008.07.002
[11] W. Dai, O. Milenkovic, Subspace pursuit for compressive sensing signal reconstruction, IEEE T. Inform. Theory, 55 (2009), 2230–2249. http://dx.doi.org/10.1109/TIT.2009.2016006
[12] N. Han, S. Li, J. Lu, Orthogonal subspace based fast iterative thresholding algorithms for joint sparsity recovery, IEEE Signal Proces. Let., 28 (2021), 1320–1324. http://dx.doi.org/10.1109/LSP.2021.3089434
[13] T. Blumensath, M. E. Davies, Iterative hard thresholding for compressed sensing, Appl. Comput. Harmon. Anal., 27 (2009), 265–274. http://dx.doi.org/10.1016/j.acha.2009.04.002
[14] M. Fornasier, H. Rauhut, Iterative thresholding algorithms, Appl. Comput. Harmon. Anal., 25 (2008), 187–208. http://dx.doi.org/10.1016/j.acha.2007.10.005
[15] S. Foucart, G. Lecue, An IHT algorithm for sparse recovery from subexponential measurements, IEEE Signal Proces. Let., 24 (2017), 1280–1283. http://dx.doi.org/10.1109/LSP.2017.2721500
[16] J. L. Bouchot, S. Foucart, P. Hitczenko, Hard thresholding pursuit algorithms: Number of iterations, Appl. Comput. Harmon. Anal., 41 (2016), 412–435. http://dx.doi.org/10.1016/j.acha.2016.03.002
[17] S. Foucart, Hard thresholding pursuit: An algorithm for compressive sensing, SIAM J. Numer. Anal., 49 (2011), 2543–2563. http://dx.doi.org/10.1137/100806278
[18] X. T. Yuan, P. Li, T. Zhang, Gradient hard thresholding pursuit, J. Mach. Learn. Res., 18 (2018), 1–43.
[19] D. L. Donoho, De-noising by soft-thresholding, IEEE T. Inform. Theory, 41 (1995), 613–627. http://dx.doi.org/10.1109/18.382009
[20] N. Meng, Y. B. Zhao, Newton-step-based hard thresholding algorithms for sparse signal recovery, IEEE T. Signal Proces., 68 (2020), 6594–6606. http://dx.doi.org/10.1109/TSP.2020.3037996
[21] Y. B. Zhao, Optimal k-thresholding algorithms for sparse optimization problems, SIAM J. Optimiz., 30 (2020), 31–55. https://doi.org/10.1137/18M1219187
[22] Y. B. Zhao, Z. Q. Luo, Analysis of optimal thresholding algorithms for compressed sensing, Signal Process., 187 (2021), 108148. https://doi.org/10.1016/j.sigpro.2021.108148
[23] S. Foucart, H. Rauhut, A Mathematical Introduction to Compressive Sensing, Springer, New York, 2013. https://doi.org/10.1007/978-0-8176-4948-7
[24] J. L. Shen, S. Mousavi, Exact support and vector recovery of constrained sparse vectors via constrained matching pursuit, arXiv preprint, 2020. Available from: https://arXiv.org/abs/1903.07236.
[25] Y. B. Zhao, Sparse Optimization Theory and Methods, CRC Press, Boca Raton, FL, 2018. https://doi.org/10.1201/9781315113142
[26] Y. B. Zhao, Z. Q. Luo, Constructing new weighted $\ell_1$ algorithms for the sparsest points of polyhedral sets, Math. Oper. Res., 42 (2016), 57–76. https://doi.org/10.1287/moor.2016.0791
[27] N. Meng, Y. B. Zhao, M. Kocvara, Partial gradient optimal thresholding algorithms for a class of sparse optimization problems, arXiv preprint, 2022. Available from: https://arXiv.org/abs/2107.04319v2.
[28] Y. B. Zhao, Z. Q. Luo, Improved RIP-based bounds for guaranteed performance of two compressed sensing algorithms, arXiv preprint, 2020. Available from: https://arXiv.org/abs/2007.01451.