
Oversmoothing regularization with $\ell^1$-penalty term

  • In Tikhonov-type regularization for ill-posed problems with noisy data, the penalty functional is typically interpreted to carry a-priori information about the unknown true solution. We consider in this paper the case that the corresponding a-priori information is too strong, such that the penalty functional is oversmoothing, which means that its value is infinite for the true solution. In the case of oversmoothing penalties, convergence and convergence rate assertions for the regularized solutions are difficult to derive; only for the Hilbert scale setting have convincing results been published. We attempt to extend this setting to $\ell^1$-regularization when the solutions are only in $\ell^2$. Unfortunately, we have to restrict our studies to the case of bounded linear operators with diagonal structure, mapping between $\ell^2$ and a separable Hilbert space. But for this subcase, we are able to formulate and prove a convergence theorem, which we support with numerical examples.

    Citation: Daniel Gerth, Bernd Hofmann. Oversmoothing regularization with $\ell^1$-penalty term[J]. AIMS Mathematics, 2019, 4(4): 1223-1247. doi: 10.3934/math.2019.4.1223



    In this paper we consider the problem of finding a stable approximate solution to the ill-posed linear operator equation

    $Ax = y$, (1.1)

    where $A: X := \ell^2 \to Y$ is a bounded linear operator with non-closed range, i.e., $\mathrm{range}(A) \ne \overline{\mathrm{range}(A)}$, in the separable Hilbert space $Y$ with norm $\|\cdot\|$. We recall that for $1 \le p \le \infty$ the spaces $\ell^p$ are Banach spaces consisting of (infinite) sequences $x = \{x_i\}_{i=1}^\infty$ with norms $\|x\|_{\ell^p} = \left(\sum_{i=1}^\infty |x_i|^p\right)^{1/p}$ for $1 \le p < \infty$ and $\|x\|_{\ell^\infty} = \sup_{i \in \mathbb{N}} |x_i|$. For $p = 2$ one obtains a Hilbert space. We assume that in (1.1) only noisy data $y^\delta$ are available satisfying the inequality

    $\|y - y^\delta\| \le \delta$ (1.2)

    with known noise level $\delta > 0$. In order to determine a stable solution to (1.1) from the noisy data, regularization is required. We are searching for approximations $x_\alpha^\delta$ of the exact solution $x^\dagger$ with $Ax^\dagger = y$, based on the noisy data $y^\delta$, which are calculated as $\ell^1$-regularized solutions

    $x_\alpha^\delta := \operatorname{argmin}_{x \in \ell^2} \left\{ \frac{1}{2}\|Ax - y^\delta\|^2 + \alpha\|x\|_{\ell^1} \right\}$ (1.3)

    with regularization parameters $\alpha > 0$. This approach was extensively studied in the last decade, and we refer to the brief discussion in Section 4. We remark that the Banach space $\ell^1$ is non-reflexive, which makes its analysis challenging. In particular, it was shown in [3, Prop. 3.3] that (1.1) is always ill-posed when the domain of $A$ is restricted to $\ell^1$ and $Y$ is a reflexive Banach space. On the other hand, $\ell^1$-regularization in the case $\|x^\dagger\|_{\ell^1} < \infty$ is well understood as a prominent version of sparsity-promoting regularization. Hence, it seems to be of great interest whether $\ell^1$-regularization also makes sense in the oversmoothing case, which is characterized by the fact that $x^\dagger \in \ell^2 \setminus \ell^1$ such that $\|x^\dagger\|_{\ell^1} = \infty$. The scenario of oversmoothing regularization has, to the best of the authors' knowledge, only been treated in the setting of regularization in Hilbert scales, see Section 2. There, the specific structure of the problem yields link conditions as tools to jump between different scale levels and thus to handle the analytical problems arising from the oversmoothing character of the penalty. For other space settings the corresponding tools are missing, and this paper is an attempt to overcome this deficit. We therefore stress that in this paper we are interested neither in specific applications nor in improving any sort of reconstruction quality. Equation (1.1) and procedure (1.3) merely constitute a new trial case for modeling oversmoothing regularization.

    Let us now discuss our motivation for investigating this regularization idea. As we will see in our model problem, oversmoothing $\ell^1$-regularization already has one advantage over classical Tikhonov regularization with penalty $\|x\|_{\ell^2}^2$, i.e.,

    $x_\alpha^\delta := \operatorname{argmin}_{x \in \ell^2} \left\{ \frac{1}{2}\|Ax - y^\delta\|^2 + \alpha\|x\|_{\ell^2}^2 \right\}$. (1.4)

    Namely, it does not suffer from the saturation effect of the $\ell^2$-regularized approach, and the proposed a-priori choice of the regularization parameter coincides with the discrepancy principle in all cases under consideration. Hence, one motivation for oversmoothing regularization is possibly improved convergence properties. But there are more potential benefits. An oversmoothing penalty might yield an easier optimization problem. For example, when $x^\dagger \in \ell^p \setminus \ell^2$ with $p > 2$, classical Tikhonov regularization (1.4) would be oversmoothing. However, since it is much easier to find the minimizer of the $\ell^2$-penalized Tikhonov functional (which corresponds to solving a linear system) than to minimize a functional with arbitrary $\ell^p$-penalty, the former might be preferred. A third aspect of oversmoothing regularization is feature extraction. In terms of our model problem, $\ell^1$-regularization is known to yield sparse regularized solutions, i.e., only finitely many components are non-zero. As we will see later, $\ell^1$-regularization in some sense selects the best (least) discretization level needed to achieve a certain residual. Therefore, the approximate solutions need less storage space than $\ell^2$-regularized solutions. In general, features other than sparsity could be of interest. Finally, the analysis of oversmoothing regularization is challenging and might provide new insights for general regularization approaches.

    The paper is organized as follows: In Section 2 we sketch the existing results on oversmoothing regularization in Hilbert scales. We proceed with properties of $\ell^1$-regularized solutions and discuss their implications in Section 3. After that we move to convergence rates for $\ell^1$-regularization, including the oversmoothing case, in Section 4. In this context, we restrict our studies to diagonal operators and present a specific simplified model problem, which is the basis for the subsequent numerical experiments. In Section 5 we compare our derived convergence rates with the ones known from classical Tikhonov regularization with $\ell^2$-penalty. Finally, numerical experiments supporting the theoretical results are presented in Section 6.

    To the best of the authors' knowledge, oversmoothing regularization has so far only been investigated in the setting of regularization in Hilbert scales, starting with the seminal paper by Natterer [6]. There, Natterer considers a linear problem (1.1) between Hilbert spaces $X$ and $Y$. The regularized solutions are obtained from the minimization problem

    $x_\alpha^\delta := \operatorname{argmin}_{x \in X} \left\{ \|Ax - y^\delta\|^2 + \alpha\|x\|_p^2 \right\}$, (2.1)

    where the penalty is a norm in a Hilbert scale. In this context, a family $\{X_s\}_{s \in \mathbb{R}}$ of Hilbert spaces with $X_0 := X$ is called a Hilbert scale if $X_t \subset X_s$ whenever $s < t$ and the inclusion is a continuous embedding, i.e., there exists $c_{s,t} > 0$ such that $\|x\|_s \le c_{s,t}\|x\|_t$. The Hilbert scale is induced by an unbounded, self-adjoint, and strictly positive definite operator $T$ in $X$ such that $\|x\|_s := \|T^s x\|_X$, $s \in \mathbb{R}$, defines a norm in $X_s$.

    Under the noise model (1.2), in combination with the smoothness assumption $\|x^\dagger\|_q \le \rho$ for some $q > 0$, and provided that there exists some $a > 0$ such that, with two constants $m, M > 0$ and for all $x \in X$, the inequality chain

    $m\|x\|_{-a} \le \|Ax\| \le M\|x\|_{-a}$ (2.2)

    holds true, Natterer shows in [6] the error estimate

    $\|x_\alpha^\delta - x^\dagger\|_X \le C\,\delta^{\frac{q}{a+q}}\,\rho^{\frac{a}{a+q}}$ (2.3)

    whenever the penalty smoothness $p$ in (2.1) is sufficiently high, in the sense that $p \ge (q-a)/2$, and for the a-priori parameter choice

    $\alpha(\delta) = c(\rho)\,\delta^{\frac{2(a+p)}{a+q}}$ (2.4)

    with an appropriately chosen constant $c(\rho) > 0$. It is interesting that here the index $p$, which characterizes the smoothness of the penalty in (2.1), is limited by a lower bound, but not by an upper bound, for obtaining the rate result (2.3). Indeed, Natterer states that ``there is nothing wrong with high order regularization, even well above the order of the smoothness of the exact solution''. Even though $x^\dagger$ may possess a finite $q$-norm only for arbitrarily small $q > 0$, one still obtains the order-optimal rate of convergence. The only adjustment to be made is an appropriate decrease of the regularization parameter.

    Recently, Natterer's results have been extended to ill-posed inverse problems with nonlinear forward operators by Mathé and Hofmann [7,8]. In the first paper, the regularization parameter is chosen according to the discrepancy principle, while the second paper shows convergence rates under, in principle, the same a-priori parameter choice as occurring in (2.4). In both papers the obtained convergence rate coincides with that in (2.3), i.e., $\|x_\alpha^\delta - x^\dagger\|_X = \mathcal{O}\big(\delta^{\frac{q}{a+q}}\big)$ as $\delta \to 0$.

    In this section we discuss some basic properties of oversmoothing $\ell^1$-regularization, or the lack thereof. The results are easily generalized to arbitrary penalty functionals in Tikhonov-type regularization, as long as certain standard assumptions are fulfilled. We refer to, e.g., [9, Assumption 3.11 and Assumption 3.22] or [10, Assumption 8.1] for details. Basic properties of $\ell^1$-regularization are well known. As the following lemma shows, existence and stability of regularized solutions with respect to small perturbations in the data are not influenced by the oversmoothing setting.

    Lemma 3.1. For all fixed $\alpha, \delta > 0$, a minimizer of (1.3) exists and has a finite $\ell^1$-norm. Furthermore, if $\{y_k\}$ is a sequence norm-converging in $Y$ to $y^\delta$, i.e., $y_k \to y^\delta$, then every associated sequence $\{x_k\}$ with

    $x_k \in \operatorname{argmin}_{x \in \ell^2} \left\{ \frac{1}{2}\|Ax - y_k\|^2 + \alpha\|x\|_{\ell^1} \right\}$

    has a subsequence that converges weakly. The weak limit of every such weakly convergent subsequence $\{\tilde{x}_k\}$ of $\{x_k\}$ is a minimizer $\tilde{x}$ of (1.3). Moreover, $\|\tilde{x}_k\|_{\ell^1}$ converges to $\|\tilde{x}\|_{\ell^1}$.

    Proof. Take $\alpha > 0$ arbitrary but fixed. Since $\ell^1 \subset \ell^2$, there exists $\tilde{x} \in \ell^2$ with finite values $\|A\tilde{x} - y^\delta\|$ and $\|\tilde{x}\|_{\ell^1}$. This implies

    $\|x_k\|_{\ell^1} \le \frac{1}{\alpha}\left(\frac{1}{2}\|Ax_k - y_k\|^2 + \alpha\|x_k\|_{\ell^1}\right) \le \frac{1}{\alpha}\left(\frac{1}{2}\|A\tilde{x} - y_k\|^2 + \alpha\|\tilde{x}\|_{\ell^1}\right) < \infty$

    by the optimality of the minimizer. The remaining part of the proof follows from standard arguments [9,10].

    We move on to identify some necessary conditions for the convergence of the regularized solutions as $\delta \to 0$ in the case of oversmoothing regularization.

    Theorem 3.1. Let $x^\dagger \in \ell^2$ with $\|x^\dagger\|_{\ell^1} = \infty$ denote a solution to (1.1). If the Tikhonov-regularized solutions (1.3) under consideration converge weakly to $x^\dagger$ for an a-priori or a-posteriori parameter choice rule $\alpha = \alpha(\delta, y^\delta)$, then the following items must hold for a sequence $x_k := x_{\alpha_k}^{\delta_k} \rightharpoonup x^\dagger$ as $\delta_k \to 0$:

    a) $\lim_{k\to\infty} \|x_k\|_{\ell^1} = \infty$,

    b) $\lim_{k\to\infty} \alpha_k = 0$,

    c) $\limsup_{k\to\infty}\, \alpha_k\|x_k\|_{\ell^1} \le C < \infty$.

    Proof. By Lemma 3.1, $\|x_k\|_{\ell^1} < \infty$ for all $k \in \mathbb{N}$. If, however, we assume that there is a subsequence $\{x_{k_j}\}$ with $\|x_{k_j}\|_{\ell^1} \le c < \infty$ uniformly for all $j \in \mathbb{N}$, then the assumed weak convergence $x_{k_j} \rightharpoonup x^\dagger$ in $X$ implies that $\|x^\dagger\|_{\ell^1} \le c$. This contradicts the assumption $\|x^\dagger\|_{\ell^1} = \infty$ and yields item a) of the theorem. Now take some fixed $\tilde{x} \in \ell^1$ and keep in mind the definition of the $x_k$ as minimizers of the functional (1.3). It is

    $\frac{1}{2}\|Ax_k - y^{\delta_k}\|^2 + \alpha_k\|x_k\|_{\ell^1} \le \frac{1}{2}\|A\tilde{x} - y^{\delta_k}\|^2 + \alpha_k\|\tilde{x}\|_{\ell^1} \le \delta_k^2 + \|A\tilde{x} - y\|^2 + \alpha_k\|\tilde{x}\|_{\ell^1}.$

    Therefore

    $\alpha_k\left(\|x_k\|_{\ell^1} - \|\tilde{x}\|_{\ell^1}\right) \le \delta_k^2 + \|A\tilde{x} - y\|^2 \le C < \infty.$

    Since $\|\tilde{x}\|_{\ell^1} < \infty$, we need $\lim_{k\to\infty}\alpha_k = 0$ in order to allow $\lim_{k\to\infty}\|x_k\|_{\ell^1} = \infty$ as required by a). Additionally, the product $\alpha_k\|x_k\|_{\ell^1}$ has to stay bounded, which together yields b) and c).

    The next step would be to show (weak) convergence of the regularized solutions to the exact solution as $\delta \to 0$. However, no such result is known. Even for the Hilbert scale setting of Section 2 no general convergence assertion appears to be available. In the standard setting $\|x^\dagger\|_{\ell^1} < \infty$, one has, due to the optimality of the $x_\alpha^\delta$,

    $\frac{1}{2}\|Ax_\alpha^\delta - y^\delta\|^2 + \alpha\|x_\alpha^\delta\|_{\ell^1} \le \frac{1}{2}\delta^2 + \alpha\|x^\dagger\|_{\ell^1} < \infty.$ (3.1)

    Requiring $\alpha \to 0$ and $\delta^2/\alpha \to 0$ as $\delta \to 0$ then ensures (weak) convergence of the regularized solutions to the exact one. In particular, the $\ell^1$-norm $\|x_\alpha^\delta\|_{\ell^1}$ of the regularized solutions remains bounded by a constant for all $\delta > 0$. In the oversmoothing setting, the right-hand side of (3.1) is infinite and hence provides no useful information. It appears natural to replace $x^\dagger$ by suitable auxiliary elements to bound the Tikhonov functional, but that is not enough. Let $\{x_\delta\}_{\delta>0}$ be a family of elements with

    $\|x_\delta\|_{\ell^1} = \inf_{x \in \ell^2,\ \|Ax - y\| \le c\delta} \|x\|_{\ell^1}$

    for fixed constant c>0. Using the {xδ}δ>0 to bound the Tikhonov functional, we obtain

    $\frac{1}{2}\|Ax_\alpha^\delta - y^\delta\|^2 + \alpha\|x_\alpha^\delta\|_{\ell^1} \le \frac{1}{2}\|Ax_\delta - y^\delta\|^2 + \alpha\|x_\delta\|_{\ell^1} \le \frac{(c+1)^2}{2}\delta^2 + \alpha\|x_\delta\|_{\ell^1}.$ (3.2)

    For any choice of $\alpha = \alpha(\delta)$ one obtains $\|x_\alpha^\delta\|_{\ell^1} \le c\,\frac{\delta^2}{\alpha} + \|x_\delta\|_{\ell^1}$. Even if $\delta^2/\alpha \to 0$, this does not yield a bound for $\|x_\alpha^\delta\|_{\ell^1}$ independent of $\delta$, since $\|x_\delta\|_{\ell^1} \to \infty$. Therefore one cannot infer the existence of a (weakly) convergent subsequence among the regularized solutions $x_\alpha^\delta$, as is the argument in the standard, non-oversmoothing, convergence results. For oversmoothing regularization one would need the $x_\alpha^\delta$ to be bounded in a norm weaker than the $\ell^1$-norm, for example in the $\ell^2$-norm. It is currently not clear how such a connection can be established. At this point we also mention that the oversmoothing approach prevents the use of state-of-the-art regularization theory. In recent years, variational inequalities (sometimes also called variational source conditions) have emerged as a powerful tool in the theory of Banach-space regularization; we refer, for example, to the papers [4,5]. For $\ell^1$-regularization, in [11] a variational inequality of the form

    $\|x - x^\dagger\|_{\ell^1} \le \|x\|_{\ell^1} - \|x^\dagger\|_{\ell^1} + \varphi(\|Ax - Ax^\dagger\|)$ (3.3)

    for a well-defined concave index function $\varphi$, valid for all $x \in \ell^1$, was established and used for deriving convergence rates in the case $x^\dagger \in \ell^1$. Clearly, this concept is inapplicable when $x^\dagger \in \ell^2 \setminus \ell^1$ and hence $\|x^\dagger\|_{\ell^1} = \infty$, but it could be an idea to measure the error in $\ell^2$-norms. Therefore, it seems interesting that, not even for $x^\dagger \in \ell^1$, a variational inequality of the form

    $\|x - x^\dagger\|_{\ell^2}^\kappa \le \|x\|_{\ell^1} - \|x^\dagger\|_{\ell^1} + \varphi(\|Ax - Ax^\dagger\|)$

    with κ=1 or κ=2 is known as an alternative to (3.3).

    In this section, we derive convergence rates for the $\ell^1$-penalized Tikhonov-type regularization (1.3) applied to (1.1). This method became a popular and powerful tool in the last decade, sparked by the seminal paper [13]. Since then, many authors have contributed to its theory and application; here we only mention the papers [14,15,16]. As is typical in $\ell^1$-regularization, we assume that $A$ is injective. For the non-injective case we refer to [17]. The vast majority of papers connected with $\ell^1$-regularization assumes sparsity in the sense that $x^\dagger \in \ell^0$, i.e., that $x^\dagger$ has only finitely many non-zero components. However, in [11] the situation that the exact solution $x^\dagger$ is not sparse, but only $x^\dagger \in \ell^1$, was explored for the first time. The results were later refined and extended in [3,12,18,19]. In some sense, this paper is a continuation of this development, as we are now interested in the case that $x^\dagger$ is not even in $\ell^1$, but $x^\dagger \in \ell^2 \setminus \ell^1$. Because of this we will employ the $\ell^2$-norm to measure the speed of convergence, and we seek an index function $\varphi$, i.e., a continuous, monotonically increasing, and concave function with $\varphi(0) = 0$, such that

    $\|x_\alpha^\delta - x^\dagger\|_{\ell^2} \le C\varphi(\delta)$

    holds with some constant $C > 0$. We will show that an appropriate choice of the regularization parameter yields such a function $\varphi$.

    It is well known that $\ell^1$-regularization yields sparse solutions. We will see that, for showing convergence rates when $x^\dagger \notin \ell^1$, it is essential to estimate the support of the regularized solutions. In order to do this, we rely on the explicit calculation of the minimizers. Therefore, we have to restrict ourselves to diagonal operators in this paper. We denote by $\{e^{(i)}\}_{i=1}^\infty$ the canonical orthonormal basis in $\ell^2$ with components $e^{(i)}_i = 1$ and $e^{(i)}_k = 0$ for $k \ne i$. Moreover, we denote by $\{v^{(i)}\}_{i=1}^\infty$ an orthonormal basis in the range closure $\overline{\mathrm{range}(A)} \subset Y$. Then we say that a compact operator $A: \ell^2 \to Y$ with decreasingly ordered singular values $\{\sigma_i\}_{i=1}^\infty$ is of diagonal structure if

    $Ae^{(i)} = \sigma_i v^{(i)} \qquad\text{and}\qquad e^{(i)} = \frac{1}{\sigma_i} A^* v^{(i)} \qquad (i = 1, 2, \dots).$

    This model includes compact linear operators mapping between general Hilbert spaces X and Y, since such operators can be diagonalized with respect to their singular system [1,2].
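    As a small numerical illustration (our own sketch, not part of the original analysis), such a diagonal operator can be realized on a truncated sequence space; the singular values $\sigma_i = 1/i$ below are a hypothetical choice, and both relations defining the diagonal structure can be checked directly.

```python
import numpy as np

def make_diagonal_operator(sigma):
    """Return A and A* for the diagonal model [Ax]_i = sigma_i * x_i.

    In the bases {e^(i)} and {v^(i)} both A and its adjoint act
    componentwise with the same diagonal of singular values."""
    A = lambda x: sigma * x
    A_adj = lambda y: sigma * y
    return A, A_adj

n = 1000                            # truncation level of the sequence space
sigma = 1.0 / np.arange(1, n + 1)   # hypothetical singular values sigma_i = 1/i
A, A_adj = make_diagonal_operator(sigma)

e1 = np.zeros(n); e1[0] = 1.0
assert np.allclose(A(e1), sigma[0] * e1)            # A e^(1) = sigma_1 v^(1)
assert np.allclose(A_adj(A(e1)) / sigma[0]**2, e1)  # e^(1) = sigma_1^{-1} A* v^(1)
```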

    We now present a way of constructing functions $\varphi$ that serve as prototypes for convergence rates. To this end, we define, for all $n \in \mathbb{N}$, the linear projectors

    $P_n: \ell^2 \to \ell^2, \qquad P_n x = (x_1, \dots, x_n, 0, 0, \dots).$ (4.1)

    Lemma 4.1. Let $A: \ell^2 \to Y$ be a compact linear operator with diagonal structure as introduced above, possessing the singular values

    $\|A\| = \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_n \ge \sigma_{n+1} \ge \dots \to 0 \quad\text{as } n \to \infty,$

    and let $x^\dagger \in \ell^2$ denote the uniquely determined solution to the operator equation (1.1), and set

    $T(x^\dagger, n) := \|(I - P_n)x^\dagger\|_{\ell^2} = \sqrt{\sum_{i=n+1}^\infty |x_i^\dagger|^2}.$ (4.2)

    Then for any $x \in \ell^2$ and all $n \in \mathbb{N}$ it holds that

    $\|x - x^\dagger\|_{\ell^2} \le \|(I - P_n)x\|_{\ell^2} + \sigma_n^{-1}\|Ax - Ax^\dagger\| + T(x^\dagger, n).$ (4.3)

    Proof. For any $x \in \ell^2$ we have, with the linear projectors from (4.1), the relation

    $\|x - x^\dagger\|_{\ell^2} \le \|(I - P_n)x\|_{\ell^2} + \|P_n(x - x^\dagger)\|_{\ell^2} + \|(I - P_n)x^\dagger\|_{\ell^2},$ (4.4)

    which holds for all $n \in \mathbb{N}$. We keep $n$ arbitrary but fixed and start by estimating the last term, which describes the decay of the tail of the solution and thus its smoothness. It is a fixed function of $n$ which goes to zero as $n \to \infty$, and we employ the convention (4.2).

    Next we estimate the middle term $\|P_n(x - x^\dagger)\|_{\ell^2}$ in (4.4). In order to do so, we recall the notion of the modulus of continuity, given by

    $\omega(M, \theta) := \sup\{\|x\| : x \in M,\ \|Ax\| \le \theta\},$

    where $M \subset \ell^2$. This quantity is essentially related to the minimal errors of any regularization method for noisy data. Since $P_n(x - x^\dagger) \in \ell^2_n := \operatorname{span}\{e^{(1)}, \dots, e^{(n)}\}$, we can use tools from approximation theory to estimate its norm. In [20, Proposition 3.9] it has been shown that for any finite-dimensional subspace $X_n$

    $\omega(X_n, \theta) = \frac{\theta}{\Theta(A, n)},$

    where the modulus of injectivity $\Theta(A, n)$ is defined as

    $\Theta(A, n) := \inf_{0 \ne x \in X_n} \frac{\|Ax\|}{\|x\|}.$

    For diagonal operators it is $\Theta(A, n) = \sigma_n$ and thus

    $\omega(\ell^2_n, \theta) = \sigma_n^{-1}\theta.$

    Noting that for diagonal operators it also holds that

    $\|AP_n(x - x^\dagger)\| \le \|A(x - x^\dagger)\|,$ (4.5)

    it is therefore

    $\|P_n(x - x^\dagger)\|_{\ell^2} \le \omega\big(\ell^2_n, \|AP_n(x - x^\dagger)\|\big) = \sigma_n^{-1}\|AP_n(x - x^\dagger)\| \le \sigma_n^{-1}\|A(x - x^\dagger)\|.$ (4.6)

    Inserting this and (4.2) into (4.4), we obtain (4.3).
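    Inequality (4.3) lends itself to a quick numerical sanity check. The following sketch (our own illustration, with hypothetical choices $\sigma_i = i^{-1}$, $x_i^\dagger = i^{-0.8}$, and a randomly perturbed comparison element $x$) verifies the estimate for several truncation levels $n$:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 2000
i = np.arange(1, N + 1)
sigma = i**-1.0                     # singular values of the diagonal operator
x_dag = i**-0.8                     # solution in l^2 (exponent > 1/2)
x = x_dag + rng.normal(scale=1e-3, size=N)   # arbitrary comparison element

for n in (10, 100, 500):
    P = np.zeros(N); P[:n] = 1.0                      # projector P_n
    lhs = np.linalg.norm(x - x_dag)                   # ||x - x^dag||_{l^2}
    tail_x = np.linalg.norm((1 - P) * x)              # ||(I - P_n) x||_{l^2}
    tail_dag = np.linalg.norm((1 - P) * x_dag)        # T(x^dag, n)
    residual = np.linalg.norm(sigma * (x - x_dag))    # ||Ax - Ax^dag||
    rhs = tail_x + residual / sigma[n - 1] + tail_dag
    assert lhs <= rhs + 1e-12                         # inequality (4.3)
```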

    In the best-case scenario the term $\|(I - P_n)x\|_{\ell^2}$ in (4.3) vanishes (note that this is a crucial point to be shown for the $\ell^1$-regularized approximations to $x^\dagger$) and only the two rightmost terms remain. The best possible convergence rate is therefore determined by taking the infimum of those two expressions with respect to $n$. This yields the prototype rate function

    $\varphi(t) = \inf_{n \in \mathbb{N}} \left\{ t\sigma_n^{-1} + \|(I - P_n)x^\dagger\|_{\ell^2} \right\}.$ (4.7)

    Note that, as an infimum of affine functions of $t$, $\varphi$ is concave, and it is also monotonically increasing with $\varphi(0) = 0$. Since for fixed $t \ge 0$ the infimum is taken over a countable set, and $\sigma_n^{-1}t \to \infty$ monotonically as $n \to \infty$ while $T(x^\dagger, n) \to 0$ monotonically, the infimum is attained as a minimum and the corresponding index

    $n_{\inf}(t) := \operatorname{argmin}_{n \in \mathbb{N}} \left\{ t\sigma_n^{-1} + \|(I - P_n)x^\dagger\|_{\ell^2} \right\}$ (4.8)

    is well defined. In order to show $\|x - x^\dagger\|_{\ell^2} \le \varphi(\|Ax - Ax^\dagger\|)$ for the $\ell^1$-regularized solutions $x = x_\alpha^\delta$ from (1.3), we need the following assumptions.

    Assumption 4.1. (a) The singular values of $A$ fulfill, for all $n \in \mathbb{N}$,

    $1 \le \frac{\sigma_n}{\sigma_{n+1}} \le C_\sigma$

    with a constant $1 \le C_\sigma < \infty$.

    (b) The tail is monotone, i.e., $|x_n^\dagger|$ is monotonically decreasing for all sufficiently large $n$.

    (c) The convergence rate $\varphi(t)$ is not dominated by the tail, i.e., there exists a constant $\tilde{C}$ such that

    $T(x^\dagger, n_{\inf}(t)) \le \tilde{C}\,\sigma_{n_{\inf}(t)}^{-1}\,t.$

    (d) For the true solution $x^\dagger$ it holds that

    $|x^\dagger_{n_{\inf}(t)+1}| \le C\,\frac{\sigma_{n_{\inf}(t)}^{-1}\,t}{\sqrt{n_{\inf}(t)}}$

    for a constant $0 \le C < \infty$ and sufficiently small $t > 0$.

    Part (a) limits the ill-posedness of the forward operators. A polynomial decay of the singular values fulfills the condition; even an exponential decay of order $\sigma_i = e^{-i}$ is permitted, but for any $\varepsilon > 0$ a decay $\sigma_i = e^{-i^{1+\varepsilon}}$ violates the condition. Note that the left-hand inequality is trivial, but will be used in a proof later. Part (b) is required purely for technical reasons. Parts (c) and (d) link the smoothness of the true solution with the ill-posedness of the forward operator. They will be discussed in examples below. We remark that similar restrictions are standing assumptions in the Hilbert-scale setting of Section 2: Condition (2.2) ensures that the singular values of $A$ decay asymptotically like $i^{-a}$, and the condition $p \ge (q-a)/2$ implies that the solution must not be significantly smoother than the penalty.
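    The borderline behaviour described in part (a) can be observed numerically: for polynomial decay and for $\sigma_i = e^{-i}$ the ratio $\sigma_n/\sigma_{n+1}$ stays bounded, while for $\sigma_i = e^{-i^{1.5}}$ (an instance of $e^{-i^{1+\varepsilon}}$) it blows up. A short sketch of ours illustrating this:

```python
import numpy as np

n = np.arange(1, 51, dtype=float)

def ratios(sigma):
    # successive ratios sigma_n / sigma_{n+1} from Assumption 4.1 (a)
    return sigma[:-1] / sigma[1:]

poly = n**-2.0                 # sigma_i = i^{-2}: ratio ((n+1)/n)^2 <= 2^2
expo = np.exp(-n)              # sigma_i = e^{-i}: constant ratio e
super_expo = np.exp(-n**1.5)   # sigma_i = e^{-i^{1.5}}: violates (a)

assert ratios(poly).max() <= 4.0         # bounded by C_sigma = 2^beta = 4
assert np.allclose(ratios(expo), np.e)   # constant ratio e, still admissible
assert ratios(super_expo)[-1] > 1e4      # ratio grows without bound
```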

    For later use we mention two consequences of Assumption 4.1.

    Lemma 4.2. Let $A$ be as in Lemma 4.1, and let Assumption 4.1 (a) and (c) hold. Then

    $\frac{1}{1 + \tilde{C}}\,\varphi(t) \le \sigma_{n_{\inf}(t)}^{-1}\,t \le \varphi(t).$ (4.9)

    Furthermore, it is

    $\|AP_{n_{\inf}(t)}x^\dagger - Ax^\dagger\| \le \tilde{C}\,t.$ (4.10)

    Proof. The right-hand inequality of (4.9) follows from the definition of $\varphi$ in (4.7). Similarly, it is

    $\varphi(t) = \sigma_{n_{\inf}(t)}^{-1}\,t + T(x^\dagger, n_{\inf}(t)) \le (1 + \tilde{C})\,\sigma_{n_{\inf}(t)}^{-1}\,t,$

    which yields the left-hand inequality. To obtain the estimate of the residual, we observe that

    $\|AP_{n_{\inf}(t)}x^\dagger - Ax^\dagger\|^2 = \sum_{i=n_{\inf}(t)+1}^\infty |\sigma_i x_i^\dagger|^2 \le \sigma_{n_{\inf}(t)+1}^2 \sum_{i=n_{\inf}(t)+1}^\infty |x_i^\dagger|^2 \le \sigma_{n_{\inf}(t)}^2\,T(x^\dagger, n_{\inf}(t))^2.$

    Inserting Assumption 4.1 (c) immediately yields (4.10).

    In order to derive convergence rates based on Lemma 4.1, two terms in (4.3) need to be estimated for $x = x_\alpha^\delta$: the residual $\|Ax_\alpha^\delta - Ax^\dagger\|$ and the tail $\|(I - P_n)x_\alpha^\delta\|_{\ell^2}$.

    Lemma 4.3. Let $A$ be as in Lemma 4.1, and let Assumption 4.1 (c) hold. Then, with the a-priori choice

    $\alpha(\delta) = c_\alpha\,\frac{\delta^2}{\sqrt{n_{\inf}(\delta)}\,\varphi(\delta)},$ (4.11)

    with constant $0 < c_\alpha < \infty$, the minimizers $x_\alpha^\delta$ of (1.3) satisfy

    $\|Ax_\alpha^\delta - y^\delta\| \le \left((\tilde{C}+1)^2 + 4c_\alpha\right)\delta.$ (4.12)

    In particular, it is

    $\|Ax_\alpha^\delta - Ax^\dagger\| \le c\,\delta$ (4.13)

    with the positive constant $c = (\tilde{C}+1)^2 + 4c_\alpha + 1$.

    Proof. From the Tikhonov functional (1.3) we have, for all $n \in \mathbb{N}$,

    $\frac{1}{2}\|Ax_\alpha^\delta - y^\delta\|^2 + \alpha\|x_\alpha^\delta\|_{\ell^1} \le \frac{1}{2}\|AP_n x^\dagger - y^\delta\|^2 + \alpha\|P_n x^\dagger\|_{\ell^1},$

    which with

    $\|x_\alpha^\delta\|_{\ell^1} = \|P_n x_\alpha^\delta\|_{\ell^1} + \|(I - P_n)x_\alpha^\delta\|_{\ell^1}$

    and

    $\|P_n x^\dagger\|_{\ell^1} \le \|P_n(x^\dagger - x_\alpha^\delta)\|_{\ell^1} + \|P_n x_\alpha^\delta\|_{\ell^1}$

    yields

    $\frac{1}{2}\|Ax_\alpha^\delta - y^\delta\|^2 + \alpha\|(I - P_n)x_\alpha^\delta\|_{\ell^1} \le \frac{1}{2}\|AP_n x^\dagger - y^\delta\|^2 + \alpha\|P_n(x^\dagger - x_\alpha^\delta)\|_{\ell^1} \le \frac{1}{2}\|AP_n x^\dagger - y^\delta\|^2 + \alpha\sqrt{n}\,\|P_n(x^\dagger - x_\alpha^\delta)\|_{\ell^2}.$ (4.14)

    Now fix $n = n_{\inf}(\delta)$. Using (4.6) and (4.9) on the right-hand side, we have

    $\|P_{n_{\inf}(\delta)}(x^\dagger - x_\alpha^\delta)\|_{\ell^2} \le \sigma_{n_{\inf}(\delta)}^{-1}\|Ax_\alpha^\delta - Ax^\dagger\| \le \sigma_{n_{\inf}(\delta)}^{-1}\left(\|Ax_\alpha^\delta - y^\delta\| + \|Ax^\dagger - y^\delta\|\right) \le 2\sigma_{n_{\inf}(\delta)}^{-1}\max\{\|Ax_\alpha^\delta - y^\delta\|, \delta\} \le 2\varphi(\|Ax_\alpha^\delta - y^\delta\|),$ (4.15)

    where in the last estimate we have assumed $\|Ax_\alpha^\delta - y^\delta\| > \delta$, as otherwise the assertion of the lemma is trivially fulfilled. We combine this estimate with (4.14), again with $n = n_{\inf}(\delta)$. This yields

    $\frac{1}{2}\|Ax_\alpha^\delta - y^\delta\|^2 \le \frac{1}{2}\|AP_{n_{\inf}(\delta)}x^\dagger - y^\delta\|^2 + 2\alpha\sqrt{n_{\inf}(\delta)}\,\varphi(\|Ax_\alpha^\delta - y^\delta\|).$ (4.16)

    Note that by (4.10) we have $\|AP_{n_{\inf}(\delta)}x^\dagger - y^\delta\| \le (\tilde{C}+1)\delta$. Inserting the parameter choice (4.11) into (4.16), we continue analogously to [5, Corollary 1]. Namely, by the concavity of $\varphi$ it is

    $\frac{1}{2}\|Ax_\alpha^\delta - y^\delta\|^2 \le \frac{(\tilde{C}+1)^2}{2}\delta^2 + 2c_\alpha\frac{\delta^2}{\sqrt{n_{\inf}(\delta)}\,\varphi(\delta)}\sqrt{n_{\inf}(\delta)}\,\varphi(\|Ax_\alpha^\delta - y^\delta\|) = \frac{(\tilde{C}+1)^2}{2}\delta^2 + 2c_\alpha\,\delta^2\,\frac{\varphi(\|Ax_\alpha^\delta - y^\delta\|)}{\varphi(\delta)} \le \delta^2\left(\frac{(\tilde{C}+1)^2}{2} + 2c_\alpha\right)\frac{\varphi(\|Ax_\alpha^\delta - y^\delta\|)}{\varphi(\delta)} \le \delta^2\left(\frac{(\tilde{C}+1)^2}{2} + 2c_\alpha\right)\frac{\|Ax_\alpha^\delta - y^\delta\|\,\varphi(\delta)}{\delta\,\varphi(\delta)} = \delta\left(\frac{(\tilde{C}+1)^2}{2} + 2c_\alpha\right)\|Ax_\alpha^\delta - y^\delta\|.$

    This yields $\|Ax_\alpha^\delta - y^\delta\| \le \left((\tilde{C}+1)^2 + 4c_\alpha\right)\delta$ for $\alpha$ from (4.11).

    The second assertion follows from this, the noise assumption (1.2), and the triangle inequality.

    Lemma 4.4. Let $A$ be as in Lemma 4.1, and let Assumption 4.1 (a)–(d) hold. Let $\alpha$ be chosen according to (4.11) with $c_\alpha \ge C(\tilde{C}+1)$. Then the minimizers $x_\alpha^\delta$ of (1.3) satisfy

    $\|(I - P_{n_{\inf}(\delta)})x_\alpha^\delta\|_{\ell^2} \le C_\sigma\,\varphi(\delta).$ (4.17)

    Proof. The diagonal structure of the operator allows us to calculate the minimizer of (1.3) explicitly, since (1.3) reads

    $\frac{1}{2}\|Ax - y^\delta\|^2 + \alpha\|x\|_{\ell^1} = \frac{1}{2}\sum_{i \in \mathbb{N}}(\sigma_i x_i - y_i^\delta)^2 + \alpha\sum_{i \in \mathbb{N}}|x_i|.$ (4.18)

    Since the components are decoupled, the first-order optimality condition for the above functional is

    $0 \in \partial_{x_i}\left(\frac{1}{2}\sum_{j \in \mathbb{N}}(\sigma_j x_j - y_j^\delta)^2 + \alpha\sum_{j \in \mathbb{N}}|x_j|\right) \qquad \forall i \in \mathbb{N},$

    i.e., for each $i \in \mathbb{N}$,

    $0 \in \sigma_i(\sigma_i x_i - y_i^\delta) + \alpha\,\operatorname{sgn}(x_i),$

    where

    $\operatorname{sgn}(x) := \begin{cases} 1 & x > 0, \\ -1 & x < 0, \\ [-1,1] & x = 0. \end{cases}$

    Consider the case $[x_\alpha^\delta]_i > 0$. Then

    $[x_\alpha^\delta]_i = \frac{y_i^\delta}{\sigma_i} - \frac{\alpha}{\sigma_i^2} > 0.$

    On the other hand, for $[x_\alpha^\delta]_i < 0$ we have

    $[x_\alpha^\delta]_i = \frac{y_i^\delta}{\sigma_i} + \frac{\alpha}{\sigma_i^2} < 0,$

    and consequently

    $[x_\alpha^\delta]_i = 0 \quad\Longleftrightarrow\quad \left|\frac{y_i^\delta}{\sigma_i}\right| \le \frac{\alpha}{\sigma_i^2}.$ (4.19)

    We will only consider the case $[x_\alpha^\delta]_i > 0$ further; the results for $[x_\alpha^\delta]_i < 0$ are analogous with inverted sign. First let $y$ be exact, noise-free data, i.e., $y_i = \sigma_i x_i^\dagger$ for all $i \in \mathbb{N}$, and denote by $x_\alpha$ the minimizer of (4.18) (with $y^\delta$ temporarily replaced by $y$). Then (4.19) yields

    $[x_\alpha]_i = 0 \quad\Longleftrightarrow\quad |x_i^\dagger| \le \frac{\alpha}{\sigma_i^2}.$ (4.20)

    Inserting the parameter choice (4.11) and considering $i = n_{\inf}(\delta) + 1$, we find with Assumption 4.1 (a) and (d), and with (4.9), that

    $\frac{\alpha}{\sigma_{n_{\inf}(\delta)+1}^2} = \frac{\alpha}{\sigma_{n_{\inf}(\delta)}^2}\cdot\frac{\sigma_{n_{\inf}(\delta)}^2}{\sigma_{n_{\inf}(\delta)+1}^2} \ge c_\alpha\,\frac{\big(\sigma_{n_{\inf}(\delta)}^{-1}\delta\big)^2}{\sqrt{n_{\inf}(\delta)}\,\varphi(\delta)} \ge \frac{c_\alpha}{\tilde{C}+1}\,\frac{\sigma_{n_{\inf}(\delta)}^{-1}\delta}{\sqrt{n_{\inf}(\delta)}} \ge \frac{c_\alpha}{C(\tilde{C}+1)}\,\big|x^\dagger_{n_{\inf}(\delta)+1}\big|.$

    Now, as long as $c_\alpha \ge C(\tilde{C}+1)$, this implies $[x_\alpha]_{n_{\inf}(\delta)+1} = 0$ for $\delta$ sufficiently small. Because $\alpha/\sigma_i^2$ is increasing in $i$ while $|x_i^\dagger|$ decreases, under Assumption 4.1 (b) all entries $[x_\alpha]_m$, $m > n_{\inf}(\delta)$, must then also be zero. Hence, in the noise-free case, we have $\|(I - P_{n_{\inf}(\delta)})x_\alpha\|_{\ell^2} = 0$. Therefore, for noisy data and under the same parameter choice, any contribution to $\|(I - P_{n_{\inf}(\delta)})x_\alpha^\delta\|_{\ell^2}$ must be due to the noise in the data. Consider the noise model $y^\delta = y + \delta\zeta$ with some $\zeta \in \ell^2$, $\|\zeta\|_{\ell^2} = 1$. Then $\|y - y^\delta\| = \delta$ and

    $[x_\alpha^\delta]_i > 0 \quad\Longleftrightarrow\quad x_i^\dagger + \frac{\delta\zeta_i}{\sigma_i} - \frac{\alpha}{\sigma_i^2} > 0.$ (4.21)

    The components $x_i^\dagger$ are fixed and decreasing. The regularization parameter $\alpha$ is fixed for all $i$, hence $\alpha/\sigma_i^2$ grows with increasing index. Therefore, the smaller $i$, the (potentially) larger the components $[x_\alpha^\delta]_i$ can become. To estimate the tail $\|(I - P_{n_{\inf}(\delta)})x_\alpha^\delta\|_{\ell^2}$ we are only interested in the higher indices $i > n_{\inf}(\delta)$. Due to the asymptotics of the individual terms, $\|(I - P_{n_{\inf}(\delta)})x_\alpha^\delta\|_{\ell^2}$ therefore becomes largest when all the noise mass is concentrated in the lowest possible component $n_{\inf}(\delta) + 1$, i.e., $\zeta_{n_{\inf}(\delta)+1} = 1$ and $(I - P_{n_{\inf}(\delta)})x_\alpha^\delta = ([x_\alpha^\delta]_{n_{\inf}(\delta)+1}, 0, \dots)$. From the noise-free considerations above we already obtained $x_i^\dagger \le \frac{\alpha}{\sigma_i^2}$ for these indices. Hence, any positive contribution in (4.21) comes from the noise. With an analogous argument for the case $[x_\alpha^\delta]_i < 0$ we therefore get

    $\big|[x_\alpha^\delta]_{n_{\inf}(\delta)+1}\big| \le \frac{\delta}{\sigma_{n_{\inf}(\delta)+1}}.$

    Using Assumption 4.1 (a) and (4.9), we obtain

    $\frac{\delta}{\sigma_{n_{\inf}(\delta)+1}} = \frac{\delta}{\sigma_{n_{\inf}(\delta)}}\cdot\frac{\sigma_{n_{\inf}(\delta)}}{\sigma_{n_{\inf}(\delta)+1}} \le C_\sigma\,\sigma_{n_{\inf}(\delta)}^{-1}\delta \le C_\sigma\,\varphi(\delta).$
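    The componentwise optimality conditions derived in this proof amount to soft thresholding of the backprojected data $y_i^\delta/\sigma_i$ with threshold $\alpha/\sigma_i^2$. The following sketch (with hypothetical model data; the helper name is ours) computes the minimizer of (4.18) this way and confirms that it is sparse:

```python
import numpy as np

def l1_minimizer_diagonal(y_delta, sigma, alpha):
    """Componentwise minimizer of 0.5*||Ax - y^delta||^2 + alpha*||x||_1
    for a diagonal operator [Ax]_i = sigma_i * x_i (soft thresholding).

    Component i is set to zero iff |y_i^delta / sigma_i| <= alpha / sigma_i^2,
    matching (4.19); otherwise the threshold is subtracted, matching the
    explicit formulas for [x_alpha^delta]_i > 0 and < 0."""
    z = y_delta / sigma
    thresh = alpha / sigma**2
    return np.sign(z) * np.maximum(np.abs(z) - thresh, 0.0)

i = np.arange(1, 201, dtype=float)
sigma = i**-1.0
x_dag = i**-0.8                       # not summable: oversmoothing case
y_delta = sigma * x_dag + 1e-4        # constant perturbation, for illustration

x_reg = l1_minimizer_diagonal(y_delta, sigma, alpha=1e-4)
assert (x_reg != 0).sum() > 0                 # not identically zero
assert (x_reg != 0).sum() < len(x_reg)        # only finitely many nonzeros
```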

    Putting all the pieces together yields the main theorem.

    Theorem 4.1. Let $A: \ell^2 \to Y$ be a compact linear operator with diagonal structure, possessing the singular values

    $\|A\| = \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_n \ge \sigma_{n+1} \ge \dots \to 0 \quad\text{as } n \to \infty.$

    Let $x^\dagger \in \ell^2$ denote the uniquely determined solution to the operator equation (1.1), and let Assumption 4.1 hold. Then the $\ell^1$-regularized solutions $x_\alpha^\delta$ from (1.3) with noisy data $y^\delta$ obeying the noise model $\|y - y^\delta\| \le \delta$ satisfy, for sufficiently small $\delta > 0$, the estimate

    $\|x_\alpha^\delta - x^\dagger\|_{\ell^2} \le (C_\sigma + c)\,\varphi(\delta)$

    with $c$ as in (4.13) and with the concave index function

    $\varphi(t) = \inf_{n \in \mathbb{N}}\left\{ t\sigma_n^{-1} + \|(I - P_n)x^\dagger\|_{\ell^2} \right\},$

    provided the regularization parameter is chosen a priori as

    $\alpha(\delta) = c_\alpha\,\frac{\delta^2}{\sqrt{n_{\inf}(\delta)}\,\varphi(\delta)},$

    with constant $c_\alpha \ge C(\tilde{C}+1)$. The integers

    $n_{\inf}(\delta) := \operatorname{argmin}_{n \in \mathbb{N}}\left\{ \delta\sigma_n^{-1} + \|(I - P_n)x^\dagger\|_{\ell^2} \right\}$

    can be found for all $\delta > 0$.

    Proof. Lemma 4.1 and the discussion thereafter provide us with the prototype convergence rate $\varphi(t)$ from (4.7) and with $n_{\inf}(t)$ from (4.8). Consider (4.3) with $x$ replaced by $x_\alpha^\delta$, i.e.,

    $\|x_\alpha^\delta - x^\dagger\|_{\ell^2} \le \|(I - P_n)x_\alpha^\delta\|_{\ell^2} + \sigma_n^{-1}\|Ax_\alpha^\delta - Ax^\dagger\| + T(x^\dagger, n).$ (4.22)

    Fix $n = n_{\inf}(\delta)$. From Lemma 4.3 we then have $\|Ax_\alpha^\delta - Ax^\dagger\| \le c\delta$ with $c > 1$, hence

    $\|x_\alpha^\delta - x^\dagger\|_{\ell^2} \le \|(I - P_{n_{\inf}(\delta)})x_\alpha^\delta\|_{\ell^2} + c\left(\sigma_{n_{\inf}(\delta)}^{-1}\delta + T(x^\dagger, n_{\inf}(\delta))\right).$

    The term in brackets, by definition, attains the value $\varphi(\delta)$. The remaining term $\|(I - P_{n_{\inf}(\delta)})x_\alpha^\delta\|_{\ell^2}$ was estimated in Lemma 4.4. Therefore

    $\|x_\alpha^\delta - x^\dagger\|_{\ell^2} \le (C_\sigma + c)\,\varphi(\delta).$

    The a-priori choice requires, in principle, knowledge of the exact solution and is thus infeasible in practice. In the following we comment on the discrepancy principle as an a-posteriori parameter choice. We begin with a helpful lemma.

    Lemma 4.5. Let Assumption 4.1 hold and let $\alpha = \alpha(\delta)$ be any choice of the regularization parameter in (1.3) such that

    $\tau_1\delta \le \|Ax_\alpha^\delta - y^\delta\| \le \tau_2\delta$

    for $(1 + \tilde{C}) < \tau_1 \le \tau_2 < \infty$. Then $\alpha^* \le \bar{c}\,\alpha$, where $\alpha^*$ denotes the a-priori choice of $\alpha$ from (4.11) and $\bar{c} = \frac{2c_\alpha(\tau_2 + 1)}{\tau_1^2 - (\tilde{C}+1)^2}$.

    Proof. From the minimizing property of the Tikhonov functional we have

    $\frac{1}{2}\|Ax_\alpha^\delta - y^\delta\|^2 + \alpha\|x_\alpha^\delta\|_{\ell^1} \le \frac{1}{2}\|AP_{n_{\inf}(\delta)}x^\dagger - y^\delta\|^2 + \alpha\|P_{n_{\inf}(\delta)}x^\dagger\|_{\ell^1}.$

    Similarly to the proof of Lemma 4.3, this implies

    $\frac{1}{2}\|Ax_\alpha^\delta - y^\delta\|^2 \le \frac{1}{2}\|AP_{n_{\inf}(\delta)}x^\dagger - y^\delta\|^2 + \alpha\sqrt{n_{\inf}(\delta)}\,\varphi(\|Ax_\alpha^\delta - Ax^\dagger\|).$ (4.23)

    Note that for this result we have replaced (4.15) by

    $\|P_{n_{\inf}(\delta)}(x^\dagger - x_\alpha^\delta)\|_{\ell^2} \le \sigma_{n_{\inf}(\delta)}^{-1}\|Ax_\alpha^\delta - Ax^\dagger\| \le \varphi(\|Ax_\alpha^\delta - Ax^\dagger\|).$

    Using $\|AP_{n_{\inf}(\delta)}x^\dagger - y^\delta\| \le (\tilde{C}+1)\delta$ (cf. Lemma 4.2) and $\tau_1\delta \le \|Ax_\alpha^\delta - y^\delta\|$ in (4.23), we have

    $\frac{\tau_1^2\delta^2}{2} \le \frac{(\tilde{C}+1)^2}{2}\delta^2 + \alpha\sqrt{n_{\inf}(\delta)}\,\varphi(\|Ax_\alpha^\delta - Ax^\dagger\|),$

    and since $\tau_1 > (1 + \tilde{C})$,

    $\delta^2 \le \frac{2}{\tau_1^2 - (\tilde{C}+1)^2}\,\alpha\sqrt{n_{\inf}(\delta)}\,\varphi(\|Ax_\alpha^\delta - Ax^\dagger\|).$

    Using the upper bound $\|Ax_\alpha^\delta - y^\delta\| \le \tau_2\delta$ and the concavity of $\varphi$, it is

    $\varphi(\|Ax_\alpha^\delta - Ax^\dagger\|) \le \varphi(\|Ax_\alpha^\delta - y^\delta\| + \|y^\delta - Ax^\dagger\|) \le (\tau_2 + 1)\,\varphi(\delta)$

    and we obtain, with $\alpha^*$ from (4.11),

    $\alpha^* = c_\alpha\,\frac{\delta^2}{\sqrt{n_{\inf}(\delta)}\,\varphi(\delta)} \le c_\alpha\,\frac{2(\tau_2 + 1)}{\tau_1^2 - (\tilde{C}+1)^2}\,\alpha.$

    Theorem 4.2. Let $A: \ell^2 \to Y$ be a compact linear operator with diagonal structure, possessing the singular values

    $\|A\| = \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_n \ge \sigma_{n+1} \ge \dots \to 0 \quad\text{as } n \to \infty.$

    Let $x^\dagger \in \ell^2$ denote the uniquely determined solution to the operator equation (1.1), and let Assumption 4.1 hold. Then the $\ell^1$-regularized solutions $x_\alpha^\delta$ from (1.3) with noisy data $y^\delta$ obeying the noise model $\|y - y^\delta\| \le \delta$ satisfy, for sufficiently small $\delta > 0$, the estimate

    $\|x_\alpha^\delta - x^\dagger\|_{\ell^2} \le (\tau_2 + C_\sigma + 1)\,\varphi(\delta)$

    with the concave index function

    $\varphi(t) = \inf_{n \in \mathbb{N}}\left\{ t\sigma_n^{-1} + \|(I - P_n)x^\dagger\|_{\ell^2} \right\},$

    provided the regularization parameter is chosen a posteriori such that

    $\tau_1\delta \le \|Ax_\alpha^\delta - y^\delta\| \le \tau_2\delta,$ (4.24)

    with parameters $1 + \tilde{C} < \tau_1 \le \tau_2 < \infty$ such that $\frac{2C(\tilde{C}+1)(\tau_2+1)}{\tau_1^2 - (\tilde{C}+1)^2} \le 1$.

    Proof. We start as in the proof of Theorem 4.1 and consider (4.22). Due to the parameter choice (4.24) and the triangle inequality we have

    $\|Ax_\alpha^\delta - Ax^\dagger\| \le (\tau_2 + 1)\delta,$

    hence it is, with $n = n_{\inf}(\delta)$,

    $\|x_\alpha^\delta - x^\dagger\|_{\ell^2} \le \|(I - P_{n_{\inf}(\delta)})x_\alpha^\delta\|_{\ell^2} + (\tau_2 + 1)\left(\sigma_{n_{\inf}(\delta)}^{-1}\delta + T(x^\dagger, n_{\inf}(\delta))\right) = \|(I - P_{n_{\inf}(\delta)})x_\alpha^\delta\|_{\ell^2} + (\tau_2 + 1)\,\varphi(\delta).$

    According to Lemma 4.5, the regularization parameter $\alpha$ obtained from the discrepancy principle is larger than the a-priori parameter $\alpha^*$ from (4.11), $\alpha \ge \alpha^*$, if $\bar{c} \le 1$. The minimal admissible constant $c_\alpha$ in Lemma 4.4 is $c_\alpha = C(\tilde{C}+1)$. Inserting this into $\bar{c}$ yields $\alpha \ge \alpha^*$ if $\frac{2C(\tilde{C}+1)(\tau_2+1)}{\tau_1^2 - (\tilde{C}+1)^2} \le 1$. It now follows analogously to the proof of Lemma 4.4 that for noise-free data $\|(I - P_{n_{\inf}(\delta)})x_\alpha\|_{\ell^2} = 0$, and in the worst-case noise scenario $\|(I - P_{n_{\inf}(\delta)})x_\alpha^\delta\|_{\ell^2} \le C_\sigma\,\varphi(\delta)$. Hence

    $\|x_\alpha^\delta - x^\dagger\|_{\ell^2} \le (\tau_2 + C_\sigma + 1)\,\varphi(\delta).$

    In order to exemplify and illustrate the general theory in more detail, we now consider some simple model scenarios. As before, we assume that $A: \ell^2 \to \ell^2$ is diagonal, $[Ax]_i = \sigma_i x_i$, $i = 1, 2, \dots$. For simplicity, we will denote all constants by a generic constant $0 < c < \infty$.

    Let σi=iβ and xi=iη for positive values β and η and all iN. In particular, for 12<η1 this yields a case of oversmoothing regularization x1=, whereas for η>1 we have the classical model with x1<.

Theorem 4.3. Let $A$ be diagonal with singular values $\sigma_i=i^{-\beta}$, $\beta>0$. Let $x^\dagger\in\ell^2$ be such that $[x^\dagger]_i=i^{-\eta}$ for $\eta>\frac12$. Then the $\ell^1$-regularized solutions $x_\alpha^\delta$ from (1.3) to (1.1), with noisy data $y^\delta$ obeying $\|y-y^\delta\|\le\delta$, satisfy

$$\|x_\alpha^\delta-x^\dagger\|_2\le c\,\delta^{\frac{2\eta-1}{2\eta+2\beta-1}},\qquad(4.25)$$

provided the regularization parameter is chosen a priori as

$$\alpha(\delta)=c\,\delta^{\frac{4\beta+2\eta}{2\eta+2\beta-1}}\qquad(4.26)$$

    or according to the discrepancy principle (4.24).

    Proof. It remains to calculate the quantities occurring in the proof of Theorem 4.1 explicitly.

The tail of the exact solution satisfies

$$\|(I-P_n)x^\dagger\|_2^2=\sum_{i=n+1}^{\infty}|i^{-\eta}|^2\le\frac{1}{2\eta-1}(n+1)^{1-2\eta},\quad\text{hence}\quad\|(I-P_n)x^\dagger\|_2\le(2\eta-1)^{-\frac12}\,n^{\frac12-\eta}.\qquad(4.27)$$

Inserting the structure of $A$, the rate prototype (4.7) becomes

$$\varphi(t)=\inf_{n\in\mathbb{N}}\left\{n^{\beta}t+c_\eta\,n^{\frac12-\eta}\right\}.\qquad(4.28)$$

It is simple calculus to show

$$\varphi(t)=c\,t^{\frac{2\eta-1}{2\eta+2\beta-1}}$$

and

$$n_{\inf}(t)=\left\lceil c\,t^{-\frac{2}{2\eta+2\beta-1}}\right\rfloor,\qquad(4.29)$$

where $\lceil\cdot\rfloor$ denotes the closest integer. Inserting the previous results into the parameter choice (4.11) yields

$$\alpha_*(\delta)=\frac{\delta^2}{\sqrt{n_{\inf}(\delta)}\,\varphi(\delta)}=c\,\delta^{\frac{4\beta+2\eta}{2\eta+2\beta-1}}.\qquad(4.30)$$

We mention that this model problem fulfills Assumption 4.1. Namely, we have $x^\dagger_{n+1}<x^\dagger_n=n^{-\eta}$ for all $n\in\mathbb{N}$, and from (4.27) we obtain

$$\frac{\|(I-P_n)x^\dagger\|_2}{\sqrt{n}}\le c\,n^{-\eta}.$$

Also Assumption 4.1 (c) is fulfilled as

$$1\le\frac{\sigma_n}{\sigma_{n+1}}=\frac{n^{-\beta}}{(n+1)^{-\beta}}=\left(\frac{n+1}{n}\right)^{\beta}\le 2^{\beta}=:C_\sigma.$$

Finally, we observe that

$$\frac{\delta^2}{\alpha}=\frac{\delta^2}{c\,\delta^{\frac{4\beta+2\eta}{2\eta+2\beta-1}}}=c\,\delta^{\frac{2\eta-2}{2\eta+2\beta-1}}$$

goes to infinity when $\frac12<\eta<1$, stays constant for $\eta=1$, and goes to zero for $\eta>1$. That means there is a seamless transition between the regime of oversmoothing regularization and that of classical regularization. Numerical experiments are presented in Section 6.
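The exponent in (4.25) can be checked by brute force. The sketch below (with the generic constant $c_\eta$ set to $1$ and sample values $\beta=1$, $\eta=0.7$) minimizes the rate prototype (4.28) numerically and compares the resulting log-log slope with $\frac{2\eta-1}{2\eta+2\beta-1}$:

```python
import math

beta, eta = 1.0, 0.7   # sample parameters with 1/2 < eta <= 1 (oversmoothing)
N = 200000             # truncation of the infimum over n

def phi(t):
    # rate prototype (4.28) with the generic constant c_eta set to 1
    return min(n ** beta * t + n ** (0.5 - eta) for n in range(1, N))

# log-log slope of phi between two noise levels vs. the theoretical exponent
t1, t2 = 1e-4, 1e-6
slope = (math.log(phi(t2)) - math.log(phi(t1))) / (math.log(t2) - math.log(t1))
theory = (2 * eta - 1) / (2 * eta + 2 * beta - 1)   # = 1/6 for these parameters
print(slope, theory)
```

The measured slope agrees with the predicted exponent to well within one percent for this parameter pair; other admissible $(\beta,\eta)$ behave analogously.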

We now consider for simplicity the case $\sigma_i=i^{-1}$ and $[x^\dagger]_i=e^{-i}$ for all $i\in\mathbb{N}$. This is no oversmoothing regularization since $x^\dagger\in\ell^1$. Using MATHEMATICA® we calculate the tail in (4.7), $T(x^\dagger,n)\le c\,e^{-n}$. It is then simple calculus to show

$$n_{\inf}(t)=c\,\ln\frac{1}{t}.$$

This yields

$$\varphi(t)=\frac{t}{\sigma_{n_{\inf}(t)}}+T(x^\dagger,n_{\inf}(t))=c\left(t\ln\frac{1}{t}+t\right).$$

In contrast to the previous example, the two terms $\sigma_{n_{\inf}(\delta)}^{-1}\delta$ and $T(x^\dagger,n_{\inf}(\delta))$ are no longer of the same order, but, for sufficiently small $\delta$, $\sigma_{n_{\inf}(\delta)}^{-1}\delta$ is the dominating part, i.e., Assumption 4.1 (c) is fulfilled. Also part (d) of Assumption 4.1 holds, since $x^\dagger_{n_{\inf}(\delta)+1}<x^\dagger_{n_{\inf}(\delta)}=c\,\delta$, whereas $\sigma_{n_{\inf}(\delta)}^{-1}\delta/\sqrt{n_{\inf}(\delta)}=c\,\delta\sqrt{\ln(1/\delta)}$. The predicted rate $\|x_\alpha^\delta-x^\dagger\|_2\le c\,\varphi(\delta)=c\,\delta\ln\frac{1}{\delta}$ can be verified numerically, see the left plot of Figure 5.
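A minimal numerical check of this example (generic constants set to $1$): the discrete minimizer of $nt+e^{-n}$ indeed tracks $\ln(1/t)$, and the minimal value tracks $t\,(\ln(1/t)+1)$:

```python
import math

# Model of this subsection with generic constants set to 1:
# phi(t) = inf_n { n * t + exp(-n) }, expected minimizer n ~ ln(1/t)
def argmin_phi(t, N=200):
    return min(range(1, N), key=lambda n: n * t + math.exp(-n))

for t in (1e-3, 1e-6, 1e-9):
    n = argmin_phi(t)
    val = n * t + math.exp(-n)
    # ratios close to 1 confirm n_inf ~ ln(1/t) and phi ~ t*(ln(1/t)+1)
    print(t, n, n / math.log(1 / t), val / (t * (math.log(1 / t) + 1)))
```

Both ratios stay within a few percent of $1$ over the sampled range, matching the calculus above.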

Figure 5.  Numerically observed convergence rate for model case 4.3.2 (left) and the sparsity case 4.3.5 (right) where Assumption 4.1 holds. In both cases we have a good fit between measured and theoretical convergence rate. Note that in the sparsity case, only the first 4 components of $x^\dagger$ were non-zero. The regularized approximation $x_\alpha^\delta$ had at most 5 non-zero entries.

Let now $\sigma_i=e^{-i}$ and, for simplicity, $[x^\dagger]_i=i^{-1}$ for all $i\in\mathbb{N}$. We are again in the oversmoothing regime. To formulate the convergence rate, we recall the Lambert-W-function, defined via $z=W(ze^{z})$. With this, the minimizing argument of

$$\varphi(t)=\inf_{n\in\mathbb{N}}\left\{e^{n}t+c\,n^{-\frac12}\right\}$$

can be found, using MATHEMATICA®, to be

$$n_{\inf}(t)=\frac{3}{2}\,W\!\left(\frac{2}{3}\,t^{-\frac23}\right),$$

such that

$$\varphi(t)=t\,e^{n_{\inf}(t)}+c\,n_{\inf}(t)^{-\frac12}.$$

The first term decays faster than the second one, hence the rate is dominated by the tail of the exact solution, and Assumption 4.1 (c) is violated, see Figure 1. Consequently, Theorem 4.1 is not applicable. Indeed, in a numerical experiment, shown in the left part of Figure 6, the measured convergence rate differs from the one predicted here.
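The violation of Assumption 4.1 (c) is easy to observe numerically: at the discrete minimizer of $e^{n}t+n^{-1/2}$ (generic constant set to $1$), the ratio of the noise-propagation term to the tail term shrinks as $t\to 0$:

```python
import math

# phi(t) = inf_n { e^n * t + n^(-1/2) }, generic constant set to 1
def terms_at_argmin(t, N=60):
    n = min(range(1, N), key=lambda k: math.exp(k) * t + k ** -0.5)
    return math.exp(n) * t, n ** -0.5   # (noise term, tail term)

a6, b6 = terms_at_argmin(1e-6)
a12, b12 = terms_at_argmin(1e-12)
print(a6 / b6, a12 / b12)   # the ratio shrinks: the tail term dominates
```

Since the two terms are not of the same order at the minimizer, the rate is governed by the tail alone, which is exactly the failure of Assumption 4.1 (c) visualized in Figure 1.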

Figure 1.  $\delta\,e^{n_{\inf}(\delta)}$ (red, dashed), $n_{\inf}(\delta)^{-1/2}$ (black, solid), and $\varphi(\delta)$ (blue, dash-dotted), for the scenario of Section 4.3.3, demonstrating the violation of Assumption 4.1 (c).
    Figure 6.  Numerically observed convergence rate for the model cases 4.3.3 (left) and 4.3.4 (right) where Assumption 4.1 is violated. In the left-hand plot, predicted and measured convergence rate show a clear mismatch, indicating that Assumption 4.1 is indeed needed for our analysis. On the right-hand side we see a good match between theoretical and measured rates. We attribute this to the phenomenon discussed in Section 4.3.4, namely that Assumption 4.1 (d) does not hold for all δ>0, but for all numerically reasonable δ>δ0>0.

For this example, let $\sigma_i=e^{-i}$ and $[x^\dagger]_i=e^{-i}$ for all $i\in\mathbb{N}$. Then

$$\varphi(t)=\inf_{n\in\mathbb{N}}\left\{e^{n}t+c\,e^{-n}\right\}$$

and

$$n_{\inf}(t)=\frac12\ln\left(\frac{1}{t}\right),$$

hence

$$\varphi(t)=c\left(e^{\frac12\ln(\frac1t)}\,t+e^{-\frac12\ln(\frac1t)}\right)=c\,\sqrt{t}.$$

Because $x^\dagger_{n_{\inf}(\delta)+1}<x^\dagger_{n_{\inf}(\delta)}=c\,\sqrt{\delta}$, but $\sigma_{n_{\inf}(\delta)}^{-1}\delta/\sqrt{n_{\inf}(\delta)}=c\,\sqrt{\delta}/\sqrt{\ln(1/\delta)}$, Assumption 4.1 (d) is (formally) violated. Numerically, however, even for values of $\delta$ significantly smaller than the machine epsilon, a constant can be found such that Assumption 4.1 (d) holds, i.e., $\sqrt{\delta}\le C(\delta_0)\,\sqrt{\delta}/\sqrt{\ln(1/\delta)}$ for $\delta>\delta_0$. A plot exemplifying this can be found in Figure 2. Consequently, our numerically retrieved convergence rate is close to the predicted $\sqrt{\delta}$-rate, see the right part of Figure 6.

Figure 2.  Scenario of Section 4.3.4: In general, this example violates Assumption 4.1 (d), but for $\delta>10^{-30}$, the assumption is valid with $C=10$.

Our last example goes back to the setting that sparked the developments around $\ell^1$-regularization, namely the case of sparse solutions. Let $x^\dagger=(x_1,x_2,\dots,x_{n_0},0,0,\dots)$. Let $A$ be as in Theorem 4.1 with no further restrictions. Then

$$\varphi(t)=\inf_{n\in\mathbb{N}}\left\{\frac{t}{\sigma_n}+\sum_{i=n+1}^{\infty}|x_i|\right\}=\min\left\{\inf_{n\in\mathbb{N},\,n\le n_0}\left\{\frac{t}{\sigma_n}+\sum_{i=n+1}^{n_0}|x_i|\right\},\ \frac{t}{\sigma_{n_0+1}}\right\},$$

i.e., (only) for sufficiently small values of $\delta$ can a linear convergence rate $\varphi(\delta)=c\,\delta$ be reached. Note that in the literature the linear convergence rate was derived in the $\ell^1$-norm, $\|x_\alpha^\delta-x^\dagger\|_1\le c\,\delta$, whereas we obtain the same rate in the $\ell^2$-norm, $\|x_\alpha^\delta-x^\dagger\|_2\le c\,\delta$. Since for sufficiently small $\delta>0$ we have $n_{\inf}(\delta)=n_0+1=\text{const}$, we also recover from (4.11) the well-known parameter choice $\alpha=c\,\delta$.
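For a diagonal operator the minimizer of (1.3) decouples componentwise and is given by soft thresholding, a standard fact for diagonal and orthogonal settings. The sketch below (function names are ours, the sparse example data hypothetical) implements this closed form:

```python
def soft(z, thr):
    """Soft-thresholding S_thr(z) = sign(z) * max(|z| - thr, 0)."""
    if abs(z) <= thr:
        return 0.0
    return z - thr if z > 0 else z + thr

def l1_tikhonov_diag(y, sigma, alpha):
    """Componentwise minimizer of sum_i (sigma_i x_i - y_i)^2 + alpha |x_i|
    for a diagonal operator [Ax]_i = sigma_i x_i."""
    return [soft(yi / si, alpha / (2 * si * si)) for yi, si in zip(y, sigma)]

# hypothetical sparse solution x = (1, 0.5, 0, 0, ...), sigma_i = 1/i
sigma = [1.0 / i for i in range(1, 11)]
x_true = [1.0, 0.5] + [0.0] * 8
y = [s * xi for s, xi in zip(sigma, x_true)]
x_rec = l1_tikhonov_diag(y, sigma, alpha=0.01)
print(x_rec[:3])   # -> [0.995, 0.48, 0.0]
```

With exact data the reconstruction matches the sparse solution up to the thresholding bias of order $\alpha$, and all zero components stay exactly zero, illustrating why the sparsity case yields the linear rate.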

In order to get a feeling for the derived convergence rate we compare the result from the previous section with classical, non-oversmoothing, $\ell^1$-regularization and $\ell^2$-regularization. In order to be able to use explicit expressions for the convergence rates and parameter choice rules, we will only consider the specific model problem of Section 4.3.1. Let us start with classical $\ell^1$-regularization, i.e., the approximate solution to (1.1) is obtained via (1.3) but under the assumption that $\|x^\dagger\|_1<\infty$. In [19] a convergence rate

$$\|x_\alpha^\delta-x^\dagger\|_1\le c\,\delta^{\frac{\eta-1}{\eta+\beta-\frac12}}=:\varphi_1(\delta)$$

was derived using the parameter choice

$$\alpha(\delta)\sim\frac{\delta^2}{\varphi_1(\delta)}=c\,\delta^{\frac{4\beta+2\eta}{2\eta+2\beta-1}}.$$

Now let us move to $\ell^2$-regularization. This corresponds to classic Tikhonov regularization, i.e., the approximate solution to (1.1) is given by (1.4). It is then well known, see e.g. [1,2], that under the assumption

$$x^\dagger\in\operatorname{range}\left((A^*A)^{\frac{\nu}{2}}\right)\qquad(5.1)$$

for some $0\le\nu\le 2$ the best possible convergence rate

$$\|x_\alpha^\delta-x^\dagger\|_2\le c\,\delta^{\frac{\nu}{\nu+1}}\qquad(5.2)$$

can be shown under the a-priori parameter choice

$$\alpha(\delta)\sim\delta^{\frac{2}{\nu+1}}.$$

In the diagonal setting of Section 4.3 the source condition (5.1) can easily be related to the parameters $\eta$ and $\beta$. Namely, (5.1) holds for all $\nu$ with $\frac{2\eta-1}{2\beta}>\nu$ [21]. Since we are interested in the largest $\nu$ we set $\nu=\frac{2\eta-1}{2\beta}$ for simplicity, acknowledging that actually we should write $\nu=\frac{2\eta-1}{2\beta}-\varepsilon$ for arbitrary but small $\varepsilon>0$. With this, the convergence rate becomes

$$\|x_\alpha^\delta-x^\dagger\|_2\le c\,\delta^{\frac{\nu}{\nu+1}}=c\,\delta^{\frac{2\eta-1}{2\eta+2\beta-1}}=\varphi(\delta)$$

with $\varphi(\delta)$ from (4.25), and the parameter choice is

$$\alpha(\delta)\sim\frac{\delta^2}{\varphi(\delta)^2}=\frac{\delta^2}{\delta^{\frac{4\eta-2}{2\eta+2\beta-1}}}=\delta^{\frac{4\beta}{2\eta+2\beta-1}}.$$

We summarize the convergence rates and parameter choices in Table 1. One sees that the $\ell^1$-$\ell^2$-regularization, corresponding to $\ell^1$-regularization with the $\ell^2$-norm as error measure, which includes the oversmoothing regularization, inherits the parameter choice from the classical $\ell^1$-regularization and the convergence rate from the classical $\ell^2$-regularization. The parameter choice influences the residual $\|Ax_\alpha^\delta-y^\delta\|$ and the penalty value $\|x_\alpha^\delta\|_1$. Since it is most important to keep the residual on a level of about $\delta$, the $\ell^1$-parameter choice is used. This is in line with [18], where it has been observed that for $\ell^1$-regularization the regularization parameters obtained via the discrepancy principle and the a-priori parameter choice always coincide up to constant factors. It is, however, somewhat surprising that this property appears to remain even when $x^\dagger\notin\ell^1$. The less smooth the solution is, the smaller $\alpha$ has to be chosen. On the other hand, the optimal convergence rate in $\ell^2$ is well known to be given by (5.2). Therefore $\ell^1$-regularization yields the optimal ($\ell^2$-)convergence rate, even when $x^\dagger\notin\ell^1$. Even more, it does not saturate like classical $\ell^2$-regularization, which possesses the qualification $\nu=2$. Take any $\frac12<\eta\le1$. Then $x^\dagger\in\ell^2\setminus\ell^1$. If $\beta<\frac{\eta}{2}-\frac14$, then $\nu=\frac{2\eta-1}{2\beta}>2$ and $\ell^2$-regularization (1.4) has saturated with the limiting convergence rate $\varphi(\delta)\sim\delta^{2/3}$. Oversmoothing $\ell^1$-regularization, on the other hand, would yield a higher rate of convergence since it does not saturate.

Table 1.  Comparison of (classical) $\ell^1$, (classical) $\ell^2$ and (oversmoothing) $\ell^1$-$\ell^2$ regularization for the diagonal setting of Section 4.1. The parameter choice depends on the penalty functional, whereas the convergence rate depends on the norm the regularization error is measured in.

| Type | $\ell^1$ | $\ell^1$-$\ell^2$ | $\ell^2$ |
|---|---|---|---|
| rate | $\varphi_1(\delta)=\delta^{\frac{2\eta-2}{2\eta+2\beta-1}}$ | $\varphi_2(\delta)=c\,\delta^{\frac{2\eta-1}{2\eta+2\beta-1}}$ | $\varphi_2(\delta)=c\,\delta^{\frac{2\eta-1}{2\eta+2\beta-1}}$ |
| $\alpha$ | $\delta^{\frac{4\beta+2\eta}{2\eta+2\beta-1}}$ | $\delta^{\frac{4\beta+2\eta}{2\eta+2\beta-1}}$ | $\delta^{\frac{4\beta}{2\eta+2\beta-1}}$ |
| $\alpha$ recipe | $\alpha=\frac{\delta^2}{\varphi_1(\delta)}$ | $\alpha=\frac{\delta^2}{\varphi_2(\delta)}\,\delta^{\frac{1}{2\eta+2\beta-1}}$ | $\alpha=\frac{\delta^2}{\varphi_2(\delta)^2}$ |

In this section we consider an operator specifically tailored to the setting of Section 4.1. We start with the Volterra operator

$$[\tilde A x](s)=\int_0^s x(t)\,\mathrm{d}t$$

and discretize $\tilde A$ with the rectangular rule at $N=400$ points. In order to ensure our desired properties of the model cases of Section 4.3 we compute the SVD of the resulting matrix and manually set its singular values $\sigma_i$ accordingly. This means that the actual operator $A$ in (1.1) possesses the same singular vectors $\{u_i\}$ and $\{v_i\}$ as $\tilde A$, but different singular values $\{\sigma_i\}$. Using the SVD, we construct our solution such that $x_i=\langle x^\dagger,v_i\rangle$ holds according to the scenario. We add random noise to the data $y=Ax^\dagger$ such that $\|y-y^\delta\|=\delta$. The range of $\delta$ is such that the relative error is between 25% and 0.2%. The solutions are computed via

$$x_\alpha^\delta=\operatorname*{argmin}_x\left\{\|Ax-y^\delta\|^2+\alpha\|x\|_1\right\},$$

where the $\ell^1$-norm is taken using the coefficients with respect to the basis originating from the SVD. We compute the reconstruction error in the $\ell^2$-norm as well as the residuals. For larger values of $\eta$ we can observe the convergence rate directly. For smaller values of $\eta$, we have to compensate for the error introduced by the discretization level. Namely, since we use a discretization level $N=400$, numerically we actually measure

$$\|P_{400}(x_\alpha^\delta-x^\dagger)\|_2$$

with the projectors $P_n$ as before being the cut-off after $N=400$ elements. In the plots of the convergence rates we show

$$\|P_{400}(x_\alpha^\delta-x^\dagger)+(I-P_{400})x^\dagger\|_2.\qquad(6.1)$$

The second term can be calculated analytically and corrects for the fact that we cannot measure the regularization error for coefficients corresponding to larger $n$, i.e., we add the tail of $x^\dagger$ that cannot be observed. Because $x_\alpha^\delta$ has only finitely many non-zero components and we use a sufficiently large discretization, we do not make any error with respect to the tail of the regularized solutions.

Our main focus is on the fully diagonal case of Section 4.3.1, i.e., $\sigma_i=i^{-\beta}$ for $\beta>0$ and $[x^\dagger]_i=i^{-\eta}$ for $\eta>\frac12$. Here, the regularization parameter is chosen a-priori according to (4.26) with $c_\alpha=1$.

We show selected plots of the convergence rates in Figure 3 for $\beta=1$ and Figure 4 for $\beta=2$. The plots are given with logarithmic scales for both axes. In each plot of the convergence rates we added the regression line for the model $\|x_\alpha^\delta-x^\dagger\|_2=c\,\delta^e$; the value of $e$ is given in the legend. The regularization parameter is shown in the title of the figures in the form $\alpha=\delta^a$, with $a$ given. The results of our simulations for a larger number of parameters $\eta$ are shown in Tables 2 and 3 for $\beta=1$ and $\beta=2$, respectively. We see that for all values of $\eta$ the predicted and measured convergence rates coincide nicely. Additionally, the residual remains stable around $\|Ax_\alpha^\delta-y^\delta\|\approx\delta$. For small values of $\eta$ and $\beta=1$ the residual is a bit smaller than expected. We suppose this is due to the cut-off of $x^\dagger$ by the discretization; for fully correct results we would have to include a tail of the residual similar to (6.1). If $\eta$ is very large, i.e., the components of the solution decay rapidly, the observed convergence rate is essentially linear. We suppose this is due to numerical effects, as those solutions are numerically de facto sparse.
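This experiment can be sketched compactly, since for diagonal $A$ the minimizer is given componentwise by soft thresholding. The run below is our own minimal reimplementation under the stated setting ($\beta=1$, $\eta=0.7$, $c_\alpha=1$, seeded noise for reproducibility, and the tail correction (6.1) added analytically); it estimates the rate exponent $e$ by log-log regression:

```python
import math
import random

beta, eta, N = 1.0, 0.7, 400
sigma = [i ** -beta for i in range(1, N + 1)]
xdag = [i ** -eta for i in range(1, N + 1)]
y = [s * xi for s, xi in zip(sigma, xdag)]
tail2 = N ** (1 - 2 * eta) / (2 * eta - 1)   # ~ ||(I - P_N) x||_2^2, cf. (4.27)

def soft(z, thr):
    return math.copysign(abs(z) - thr, z) if abs(z) > thr else 0.0

rng = random.Random(0)
a = (4 * beta + 2 * eta) / (2 * eta + 2 * beta - 1)   # a-priori exponent from (4.26)
deltas, errors = [], []
for k in range(8):
    delta = 10 ** (-1 - 0.25 * k)
    eps = [rng.gauss(0, 1) for _ in range(N)]
    nrm = math.sqrt(sum(v * v for v in eps))
    ydelta = [yi + delta * v / nrm for yi, v in zip(y, eps)]   # ||y - y^delta|| = delta
    alpha = delta ** a
    # componentwise soft thresholding solves the diagonal l1-Tikhonov problem
    xrec = [soft(yi / si, alpha / (2 * si * si)) for yi, si in zip(ydelta, sigma)]
    err2 = sum((xr - xi) ** 2 for xr, xi in zip(xrec, xdag)) + tail2   # cf. (6.1)
    deltas.append(delta)
    errors.append(math.sqrt(err2))

# regression slope e in ||x_alpha^delta - x||_2 = c * delta^e
lx = [math.log(d) for d in deltas]
ly = [math.log(e) for e in errors]
mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
e = sum((u - mx) * (v - my) for u, v in zip(lx, ly)) / sum((u - mx) ** 2 for u in lx)
print(e)   # theory (4.25) predicts (2*eta - 1)/(2*eta + 2*beta - 1) = 1/6 here
```

With the coarse discretization $N=400$ and only eight noise levels, the measured exponent is close to, but not exactly, the theoretical $1/6$; larger $N$ and a wider range of $\delta$ tighten the match, consistent with Table 2.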

Figure 3.  Numerically observed convergence rates for $\beta=1$ and $\eta\in\{0.55,0.7\}$. From the measured reconstruction error (solid line) we calculated the regression for the model $\|x_\alpha^\delta-x^\dagger\|_2=c\,\delta^e$, shown as the broken line. The theoretical convergence rate (4.25) is matched almost perfectly.
Figure 4.  Numerically observed convergence rates for $\beta=2$ and $\eta\in\{0.55,1.05\}$. From the measured reconstruction error (solid line) we calculated the regression for the model $\|x_\alpha^\delta-x^\dagger\|_2=c\,\delta^e$, shown as the broken line. The theoretical convergence rate (4.25) is matched almost perfectly.
Table 2.  Convergence rates for $\beta=1$ and various values of $\eta$. $\alpha$ in the form $\alpha=\delta^a$. Measured and predicted regularization error in the form $\|x_\alpha^\delta-x^\dagger\|_2=c\,\delta^e$. Residual given in the form $\|Ax_\alpha^\delta-y^\delta\|=c\,\delta^d$.

| $\eta$ | $\alpha$, $a$ | measured rate, $e$ | predicted rate, $e$ | residual, $d$ |
|---|---|---|---|---|
| 0.55 | 2.42 | 0.047 | 0.048 | 1.1 |
| 0.6 | 2.36 | 0.09 | 0.091 | 1.08 |
| 0.7 | 2.25 | 0.163 | 0.166 | 1.06 |
| 0.8 | 2.15 | 0.229 | 0.23 | 1.028 |
| 0.9 | 2.07 | 0.284 | 0.286 | 1.007 |
| 1 | 2 | 0.33 | 0.333 | 1.01 |
| 1.05 | 1.97 | 0.359 | 0.355 | 1.006 |
| 1.1 | 1.94 | 0.372 | 0.375 | 1.006 |
| 1.3 | 1.83 | 0.458 | 0.444 | 1.01 |
| 1.5 | 1.75 | 0.515 | 0.5 | 1.01 |
| 2 | 1.6 | 0.595 | 0.6 | 1.01 |
| 2.5 | 1.5 | 0.659 | 0.667 | 0.996 |
| 3 | 1.42 | 0.698 | 0.714 | 0.996 |
| 6 | 1.23 | 0.81 | 0.85 | 0.997 |
Table 3.  Convergence rates for $\beta=2$ and various values of $\eta$. $\alpha$ in the form $\alpha=\delta^a$. Measured and predicted regularization error in the form $\|x_\alpha^\delta-x^\dagger\|_2=c\,\delta^e$. Residual given in the form $\|Ax_\alpha^\delta-y^\delta\|=c\,\delta^d$.

| $\eta$ | $\alpha=\delta^a$, $a$ | measured rate, $e$ | predicted rate, $e$ | residual, $d$ |
|---|---|---|---|---|
| 0.55 | 2.22 | 0.024 | 0.024 | 1.0005 |
| 0.6 | 2.19 | 0.0473 | 0.0476 | 1.002 |
| 0.7 | 2.13 | 0.089 | 0.091 | 1.01 |
| 0.8 | 2.09 | 0.128 | 0.13 | 1.006 |
| 0.9 | 2.04 | 0.166 | 0.167 | 1.006 |
| 1.01 | 1.996 | 0.209 | 0.203 | 0.999 |
| 1.1 | 1.96 | 0.236 | 0.23 | 0.994 |
| 1.3 | 1.89 | 0.284 | 0.286 | 1.002 |
| 1.5 | 1.83 | 0.329 | 0.333 | 0.999 |
| 1.75 | 1.76 | 0.384 | 0.385 | 0.996 |
| 2 | 1.71 | 0.421 | 0.428 | 0.999 |

Finally, we compare the theoretically predicted convergence rates from the remaining model cases of Section 4.3 to measured rates from numerical experiments. In all experiments, the regularization parameter was chosen according to the discrepancy principle (4.24) with $\tau_1=1.1$ and $\tau_2=1.3$. We obtained a good match between the theoretical and observed convergence rates in the cases of Sections 4.3.2, 4.3.4, and 4.3.5. Only case 4.3.3 showed a significant difference. This is no surprise, since Assumption 4.1 does not hold in this case.

We have shown that oversmoothing regularization, i.e., Tikhonov regularization with a penalty that takes the value infinity at the exact solution, yields existence and stability of regularized solutions. We have discussed why it is difficult to show convergence of regularized solutions to the true solution in the limit of vanishing data noise. For the specific case of Tikhonov regularization with $\ell^1$-penalty term under a diagonal operator we have derived convergence rates of regularized solutions to the true solution even if the latter no longer belongs to $\ell^1$. The theoretical convergence rates have been verified numerically for a model problem.

    D. Gerth was funded by Deutsche Forschungsgemeinschaft (DFG), project HO1454/10-1.

The authors declare there are no conflicts of interest in this paper.



    [1] H. W. Engl, M. Hanke, and A. Neubauer, Regularization of Inverse Problems, Dordrecht: Kluwer Academic Publishers, 1996.
[2] A. K. Louis, Inverse und schlecht gestellte Probleme, Stuttgart: Teubner, 1989.
[3] D. Gerth and B. Hofmann, On $\ell^1$-regularization under continuity of the forward operator in weaker topologies. In: New Trends in Parameter Identification for Mathematical Models (Ed.: B. Hofmann, A. Leitao, J. P. Zubelli), Cham: Birkhäuser, (2018), 67-88.
    [4] B. Hofmann, B. Kaltenbacher, C. Pöschl, et al. A convergence rates result for Tikhonov regularization in Banach spaces with non-smooth operators, Inverse Probl., 23 (2007), 987-1010. doi: 10.1088/0266-5611/23/3/009
    [5] B. Hofmann and P. Mathé, Parameter choice in Banach space regularization under variational inequalities, Inverse Probl., 28 (2012), 104006.
    [6] F. Natterer, Error bounds for Tikhonov regularization in Hilbert scales, Appl. Anal., 18 (1984), 29-37. doi: 10.1080/00036818408839508
    [7] B. Hofmann and P. Mathé, Tikhonov regularization with oversmoothing penalty for non-linear ill-posed problems in Hilbert scales, Inverse Probl., 34 (2018), 015007.
    [8] B. Hofmann and P. Mathé, A priori parameter choice in Tikhonov regularization with oversmoothing penalty for non-linear ill-posed problems, 2019. Available from: https://arxiv.org/abs/1904.02014.
    [9] T. Schuster, B. Kaltenbacher, B. Hofmann, et al. Regularization Methods in Banach spaces, Berlin/Boston: De Gruyter, 2012.
    [10] B. Hofmann, On smoothness concepts in regularization for nonlinear inverse problems in Banach spaces. In: Mathematical and Computational Modeling: With Applications in Natural and Social Sciences, Engineering, and the Arts (Ed.: R. Melnik), New Jersey: John Wiley, (2015), 192-221.
[11] M. Burger, J. Flemming and B. Hofmann, Convergence rates in $\ell^1$-regularization if the sparsity assumption fails, Inverse Probl., 29 (2013), 025013.
[12] J. Flemming, B. Hofmann and I. Veselic, A unified approach to convergence rates for $\ell^1$-regularization and lacking sparsity, J. Inverse Ill-posed Probl., 24 (2016), 139-148.
    [13] I. Daubechies, M. Defrise and C. De Mol, An iterative thresholding algorithm for linear inverse problems with a sparsity constraint, Commun. Pur. Appl. Math., 57 (2004), 1413-1457. doi: 10.1002/cpa.20042
[14] M. Grasmair, M. Haltmeier and O. Scherzer, Sparse regularization with $\ell^q$ penalty term, Inverse Probl., 24 (2008), 055020.
    [15] D. Lorenz, Convergence rates and source conditions for Tikhonov regularization with sparsity constraints, J. Inverse ill-posed Probl, 16 (2008), 463-478.
    [16] R. Ramlau, Regularization properties of Tikhonov regularization with sparsity constraints, Electron. T. Numer. Anal., 30 (2008), 54-74.
    [17] J. Flemming, Convergence rates for l1-regularization without injectivity-type assumptions, Inverse Probl., 32 (2016), 095001.
    [18] D. Gerth, J. Flemming, Injectivity and weak*-to-weak continuity suffice for convergence rates in l1-regularization, J. Inverse Ill-posed Probl., 26 (2018), 85-94. doi: 10.1515/jiip-2017-0008
[19] D. Gerth, Convergence rates for $\ell^1$-regularization without the help of a variational inequality, Electron. T. Numer. Anal., 46 (2017), 233-244.
    [20] B. Hofmann, P. Mathé and M. Schieck, Modulus of continuity for conditionally stable ill-posed problems in Hilbert space, J. Inverse Ill-posed Probl., 16 (2008), 567-585.
    [21] D. Gerth, Using Landweber iteration to quantify source conditions - a numerical study, J. Inverse Ill-posed Probl., 27 (2019), 367-383. doi: 10.1515/jiip-2018-0071
  • This article has been cited by:

    1. Daniel Gerth, Bernd Hofmann, Christopher Hofmann, 2020, Chapter 9, 978-981-15-1591-0, 177, 10.1007/978-981-15-1592-7_9
    2. Philip Miller, Thorsten Hohage, Maximal spaces for approximation rates in $\ell^1$-regularization, 2021, 149, 0029-599X, 341, 10.1007/s00211-021-01225-4
    3. De-Han Chen, Bernd Hofmann, Irwin Yousept, Oversmoothing Tikhonov regularization in Banach spaces, 2021, 37, 0266-5611, 085007, 10.1088/1361-6420/abcea0
  • © 2019 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)