Research article

Convergence of distributed approximate subgradient method for minimizing convex function with convex functional constraints

  • Received: 12 April 2024 Revised: 20 May 2024 Accepted: 27 May 2024 Published: 07 June 2024
  • MSC : 65K05, 65K10, 90C25

  • In this paper, we investigate the distributed approximate subgradient-type method for minimizing a sum of differentiable and non-differentiable convex functions subject to nondifferentiable convex functional constraints in a Euclidean space. We establish the convergence of the sequence generated by our method to an optimal solution of the problem under consideration. Moreover, we derive a convergence rate of order O(1/N^{1−a}) for the objective function values, where a ∈ (0.5, 1). Finally, we provide a numerical example illustrating the effectiveness of the proposed method.

    Citation: Jedsadapong Pioon, Narin Petrot, Nimit Nimana. Convergence of distributed approximate subgradient method for minimizing convex function with convex functional constraints[J]. AIMS Mathematics, 2024, 9(7): 19154-19175. doi: 10.3934/math.2024934




    Let ℝᵏ be a Euclidean space with inner product ⟨·,·⟩ and induced norm ‖·‖. Let m be a fixed natural number. In this work, we focus on the convex optimization problem of the following form:

    minimize φ(x) := f(x) + h(x),  subject to x ∈ X := X0 ∩ ⋂_{i=1}^{m} Lev(g_i, 0), (1.1)

    where f : ℝᵏ → ℝ is a real-valued differentiable convex function, h : ℝᵏ → ℝ is a real-valued (possibly) non-differentiable convex function, and the constraint set X is the intersection of a simple closed convex set X0 ⊆ ℝᵏ and a finite number of level sets Lev(g_i, 0) := {x ∈ ℝᵏ : g_i(x) ≤ 0} of real-valued convex functions g_i : ℝᵏ → ℝ, i = 1, …, m. Throughout this work, we denote by X* and φ* the set of all minimizers and the optimal value of problem (1.1), respectively. Problems of the form (1.1) arise in practical situations such as image processing [1,2,3], signal recovery [4,5], and statistics [6,7,8], to name but a few.

    As the function f is a differentiable convex function and the function h is a convex function, the objective function f + h in problem (1.1) is, of course, a non-differentiable convex function. Therefore, one might attempt to solve problem (1.1) by using existing methods for non-differentiable convex optimization, for instance, subgradient methods or proximal methods. It has been observed that the proximal algorithm is generally preferable to the subgradient algorithm, since it converges without additional assumptions on the step-size sequence and achieves a convergence rate of order O(1/N) for the objective function values. Nevertheless, computing the proximal operator of a sum of functions can be challenging. In this situation, methods for problems with an additive objective, like problem (1.1), often exploit the specific structure of each function f and h when constructing the solution method; see [9] for more information. Among iterative methods for objectives given as the sum of two convex functions, the best known is the so-called proximal gradient method, which constructs a sequence {x_n}_{n=0}^∞ as follows:

    x_n = argmin_{x ∈ X} { h(x) + (1/(2α_{n−1}))‖x − x_{n−1}‖² + ⟨∇f(x_{n−1}), x − x_{n−1}⟩ },  n ≥ 1, (1.2)

    where α_n is a positive step size and ∇f(x_n) is the gradient of f at x_n.
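    To make the update (1.2) concrete, the following sketch (our own illustration, not code from the paper) takes the simplest setting X = ℝᵏ, f(x) = 0.5‖Ax − b‖² and h(x) = λ‖x‖₁; under these assumptions the update has the closed form of the classical ISTA step via soft thresholding. The data A, b and the parameter lam below are hypothetical.

```python
import numpy as np

def soft_threshold(y, tau):
    # Closed-form proximal operator of tau * ||.||_1.
    return np.sign(y) * np.maximum(np.abs(y) - tau, 0.0)

def proximal_gradient(A, b, lam, alpha, n_iter=500):
    """Iterate (1.2) for f(x) = 0.5||Ax - b||^2, h(x) = lam*||x||_1, X = R^k."""
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)          # gradient of f at the current iterate
        x = soft_threshold(x - alpha * grad, lam * alpha)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
lam = 0.1
alpha = 1.0 / np.linalg.norm(A, 2) ** 2   # constant step below 1/L, L = ||A||_2^2
x_hat = proximal_gradient(A, b, lam, alpha)
obj = lambda x: 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))
```

    At a solution the update map is a fixed point, so the displacement of one further step from x_hat should be negligible.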

    Let us note that the proximal gradient method (1.2) may not be appropriate for problem (1.1), since the constraint set X here is the intersection of a finite number of closed convex sets: in updating the iterate x_{n+1} at every iteration n ≥ 0, one is required to solve a constrained optimization subproblem over the intersection of finitely many closed convex sets. To tackle this, Nedić and Necoara [10] proposed a subgradient-type method [10, Methods (2a)–(2c)] for solving problem (1.1) in the case when the objective consists only of a function h with a strong convexity assumption and the constraint set is described by an infinite number of constraint functions. The strategy is to separate the iteration into two parts: a step that minimizes the objective function φ over the simple set X0, and a parallel computation for the feasible intersection ⋂_{i=1}^{m} Lev(g_i, 0) via the classical subgradient scheme applied to each constraint function g_i, i = 1, …, m. They analyzed its convergence and showed that the method has a sublinear convergence rate. Note that this strategy reduces the difficulty of dealing with the whole constraint set by minimizing the function over a simple set and then reducing feasibility violations through parallel computation on each component of the functional constraints.

    Since the calculation of a subgradient of each function g_i is needed in the feasibility update, the method may become time-consuming when the functions g_i, i = 1, …, m, have complicated structures. To overcome this drawback, the concept of an approximate subgradient of the functions g_i has been utilized. Apart from this issue, the concept of an approximate subgradient also arises in duality theory [11] and network optimization [12]. Moreover, the notion of an approximate subgradient has been widely studied in various settings for solving optimization problems, such as ε-subgradient methods [13,14,15], projection ε-subgradient methods [16,17], and their variants [18,19,20,21]. Although the main contribution of approximate subgradient-type methods is to reduce the cost of the subgradient computation, it can be noted that, within some acceptable error tolerance ε, the method with an approximate subgradient can outperform its exact (zero-tolerance) counterpart (see Table 1 below).

    Table 1.  Comparison of algorithm runtimes (in seconds) for different choices of error tolerances ε_{n,i} = ε/(n+1)² for all i = 1, …, m.
    k m ϵ=0 ϵ=0.1 ϵ=0.3 ϵ=0.5 ϵ=0.7 ϵ=0.9
    1 50 0.1149 0.1047 0.1050 0.1049 0.1055 0.1049
    100 0.2075 0.2181 0.2078 0.2102 0.2123 0.2076
    500 1.0343 1.0512 1.0258 1.0838 1.0251 1.0256
    2 50 0.1918 0.1913 0.1912 0.1912 0.1913 0.1911
    100 0.5473 0.5369 0.5303 0.5298 0.5299 0.5290
    500 4.1264 4.1254 4.1258 4.1381 4.1371 4.1382
    3 50 0.2924 0.2920 0.2919 0.2913 0.2916 0.2912
    100 0.9176 0.9173 0.9181 0.9155 0.9144 0.9146
    500 6.2060 6.2058 6.2070 6.2039 6.2050 6.2054
    4 50 0.3440 0.3440 0.3439 0.3444 0.3439 0.3439
    100 0.8419 0.8418 0.8422 0.8428 0.8419 0.8417
    500 5.9674 5.9638 5.9637 5.9644 5.9637 5.9645
    5 50 0.3977 0.3973 0.3974 0.3978 0.3977 0.3976
    100 0.8236 0.8245 0.8241 0.8246 0.8244 0.8256
    500 9.7686 9.7669 9.7706 9.7702 9.7689 9.7716
    10 50 0.5751 0.5698 0.5478 0.5472 0.5310 0.5144
    100 1.4231 1.4067 1.3932 1.3783 1.3651 1.3489
    500 10.3818 10.2863 10.2342 10.1825 10.1492 10.1352
    20 50 2.2897 2.2592 2.2045 2.1485 2.1349 2.0676
    100 5.6454 5.5774 5.4425 5.3605 5.3188 5.2937
    500 14.6842 14.6564 14.6162 14.5711 14.5680 14.5123
    30 50 4.1547 4.1089 4.0093 3.9392 3.8959 3.8518
    100 6.4662 6.4111 6.3702 6.3347 6.2562 6.2411
    500 16.6648 16.6611 16.6482 16.6586 16.6437 16.6342
    40 50 7.0435 7.0567 6.9973 6.8546 6.9748 6.7436
    100 9.9004 9.8689 9.8010 9.6928 9.6127 9.5034
    500 24.9762 24.8886 24.8768 24.9037 24.8722 24.8628


    Motivated by the above discussions, we present in this work a distributed approximate subgradient method based on the ideas of the proximal gradient method and the approximate subgradient method. The main difference between the proposed method and the method of Nedić and Necoara [10] is that we use the proximal gradient method to deal with the objective function and the approximate subgradient method for the feasibility constraints. The remainder of this work is organized as follows: In Section 2, we recall the notations and auxiliary results that are needed for our convergence analysis. In Section 3, we present an approximate subgradient method for solving the considered problem. Subsequently, after building all the needed tools, we investigate the convergence results and the convergence rate in Section 4. In Section 5, we present numerical experiments. Finally, in Section 6, we give a conclusion.

    In this section, we recall some basic definitions and useful facts that will be used in the following sections; readers may consult the books [22,23,24].

    Let f : ℝᵏ → ℝ be a real-valued function. We call f a convex function if

    f((1−α)x + αy) ≤ (1−α)f(x) + αf(y),

    for all x, y ∈ ℝᵏ and α ∈ (0, 1). For each α ∈ ℝ, the α-level set (in short, level set) of f at the level α is defined by

    Lev(f, α) := {x ∈ ℝᵏ : f(x) ≤ α}.

    Note that, if f is continuous, the α-level set Lev(f, α) is a closed set for all α ∈ ℝ. Moreover, if f is a convex function, then its α-level set Lev(f, α) is a convex set for all α ∈ ℝ.

    For a given x ∈ ℝᵏ and ε ≥ 0, we call a vector s_f(x) ∈ ℝᵏ an ε-subgradient of f at x if

    f(y) ≥ f(x) + ⟨s_f(x), y − x⟩ − ε,

    holds for all y ∈ ℝᵏ. The set of all ε-subgradients of f at x is denoted by ∂_ε f(x) and is called the ε-subdifferential of f at x. In the case ε = 0, we obtain the (usual) subgradient of f at x and denote the subdifferential by ∂f(x) := ∂_0 f(x). For a convex function f : ℝᵏ → ℝ and x ∈ ℝᵏ, we note that

    ∂f(x) ⊆ ∂_{ε1} f(x) ⊆ ∂_{ε2} f(x),

    for all ε1, ε2 ≥ 0 with ε1 < ε2.

    We note that convexity is a sufficient condition for approximate subdifferentiability. Namely, for a convex function f : ℝᵏ → ℝ, the ε-subdifferential ∂_ε f(x) is a nonempty set for all x ∈ ℝᵏ and ε ≥ 0; see [24, Theorem 2.4.9] for more details. Additionally, if X0 ⊆ ℝᵏ is a nonempty bounded set, then the set ⋃_{x ∈ X0} ∂_ε f(x) is a bounded set for all ε ≥ 0; see [24, Theorem 2.4.13].
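    The defining inequality of an ε-subgradient can be probed numerically. The sketch below (our own illustration, not from the paper) checks the inequality on a finite grid of test points for f(x) = |x|: at x = 1 with ε = 0.5, any s ∈ [1 − ε, 1] = [0.5, 1] passes the check, while s = 0.4 fails at y = 0, since 0 ≥ 1 + 0.4·(0 − 1) − 0.5 = 0.1 is false.

```python
import numpy as np

def is_eps_subgradient(f, x, s, eps, ys):
    # s is an eps-subgradient of f at x iff f(y) >= f(x) + s*(y - x) - eps for
    # all y; here we only check the inequality on the finite grid ys.
    return all(f(y) >= f(x) + s * (y - x) - eps - 1e-12 for y in ys)

f = abs                                   # f(x) = |x|
ys = np.linspace(-5.0, 5.0, 2001)
ok = is_eps_subgradient(f, 1.0, 0.5, 0.5, ys)    # s = 0.5 is a 0.5-subgradient at x = 1
bad = is_eps_subgradient(f, 1.0, 0.4, 0.5, ys)   # s = 0.4 is not
```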

    Let X0 ⊆ ℝᵏ be a nonempty closed convex set and x ∈ ℝᵏ. The normal cone to X0 at x is given by

    N_{X0}(x) := {y ∈ ℝᵏ : ⟨y, z − x⟩ ≤ 0 for all z ∈ X0}.

    The indicator function of X0, δ_{X0} : ℝᵏ → (−∞, ∞], is the function defined by

    δ_{X0}(x) := 0 if x ∈ X0, and δ_{X0}(x) := ∞ if x ∉ X0.

    If the function f : ℝᵏ → ℝ is convex and the set X0 ⊆ ℝᵏ is nonempty closed convex, then for every x ∈ X0, we have

    ∂(f + δ_{X0})(x) = ∂f(x) + N_{X0}(x).

    Let f : ℝᵏ → ℝ be a function and X0 ⊆ ℝᵏ a nonempty closed convex set. The set of all minimizers of f over X0 is denoted by

    argmin_{x ∈ X0} f(x) := {z ∈ X0 : f(z) ≤ f(x) for all x ∈ X0}.

    If the function f : ℝᵏ → ℝ is convex and the set X0 ⊆ ℝᵏ is nonempty closed convex, then x* ∈ argmin_{x ∈ X0} f(x) if and only if

    0 ∈ ∂f(x*) + N_{X0}(x*),

    that is, there exists s_f(x*) ∈ ∂f(x*) for which

    ⟨s_f(x*), x − x*⟩ ≥ 0,

    for any x ∈ X0.

    Let X0 ⊆ ℝᵏ be a nonempty closed convex set and x ∈ ℝᵏ. We call a point y ∈ X0 the projection of x onto X0 if

    ‖y − x‖ ≤ ‖z − x‖,

    for all z ∈ X0, and denote it by y =: P_{X0}(x). It is well known that the projection onto X0 is uniquely determined. Further, we denote the distance from x to X0 by dist(x, X0) := ‖P_{X0}(x) − x‖.
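    For a simple box X0 = [lo, hi]^k, the projection and the distance above have closed forms: the projection clips each coordinate. A minimal sketch (our own illustration):

```python
import numpy as np

def project_box(x, lo, hi):
    # Euclidean projection onto the box X0 = [lo, hi]^k: clip each coordinate.
    return np.clip(x, lo, hi)

def dist_to_box(x, lo, hi):
    # dist(x, X0) = ||P_X0(x) - x||
    return np.linalg.norm(project_box(x, lo, hi) - x)

x = np.array([2.0, -1.0, 0.5])
p = project_box(x, 0.0, 1.5)     # first coordinate clips to 1.5, second to 0.0
```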

    In this section, we start our investigation by proposing a distributed approximate subgradient method for solving the considered constrained convex optimization problem (1.1). We subsequently discuss the important assumptions for analyzing the convergence behaviors of the proposed method.

    Some important comments relating to Algorithm 1 are in order.

    Algorithm 1: Distributed approximate subgradient method.
    Initialization: Given step sizes {α_n}_{n=0}^∞ ⊆ (0, ∞), error tolerances {ε_{n,i}}_{n=1}^∞ ⊆ [0, ∞) for all i = 1, 2, …, m, and a vector d_0 ∈ ℝᵏ \ {0}. Let the initial point x_0 ∈ X0 be arbitrary.
    Iterative Step: For an iterate x_{n−1} ∈ X0 (n = 1, 2, 3, …), compute
    v_n = argmin_{u ∈ X0} { h(u) + (1/(2α_{n−1}))‖u − x_{n−1}‖² + ⟨∇f(x_{n−1}), u − x_{n−1}⟩ }.
    For i = 1, 2, …, m, choose
    d_{n,i} ∈ ∂_{ε_{n,i}} g_i⁺(v_n) \ {0} if g_i⁺(v_n) > 0, and d_{n,i} = d_0 if g_i⁺(v_n) = 0,
    where g_i⁺(v_n) = max{g_i(v_n), 0}, and compute
    z_{n,i} = v_n − (g_i⁺(v_n) / (max{‖d_{n,i}‖, 1})²) d_{n,i},  i = 1, 2, …, m.
    Compute
    z̄_n = (1/m) Σ_{i=1}^{m} z_{n,i},
    and
    x_n = P_{X0}(z̄_n).
    Update n := n + 1.

    Remark 3.1. (i) Since the function h(·) + (1/(2α_{n−1}))‖· − x_{n−1}‖² + ⟨∇f(x_{n−1}), · − x_{n−1}⟩ is strongly convex and the constraint set X0 ⊆ ℝᵏ is nonempty closed convex, we can ensure the existence and uniqueness of its minimizer, namely the iterate v_n, for all n ≥ 1. This means that the iterate v_n is well defined for all n ≥ 1.

    (ii) Since the function g_i is a real-valued convex function, the function g_i⁺ = max{g_i, 0} is also convex. This implies that the ε_{n,i}-subdifferential ∂_{ε_{n,i}} g_i⁺(v_n) is nonempty for all n ≥ 1. Moreover, if g_i⁺(v_n) > 0, then 0 ∉ ∂g_i⁺(v_n). Indeed, since Lev(g_i, 0) ≠ ∅ and Lev(g_i, 0) = {x ∈ ℝᵏ : g_i⁺(x) ≤ 0}, there exists a point x ∈ ℝᵏ such that g_i⁺(x) ≤ 0, and hence min_{x ∈ ℝᵏ} g_i⁺(x) ≤ 0 < g_i⁺(v_n), which implies that v_n is not a minimizer of the function g_i⁺, so that 0 ∉ ∂g_i⁺(v_n). Also, it follows from the properties of the ε_{n,i}-subdifferential that ∂g_i⁺(v_n) ⊆ ∂_{ε_{n,i}} g_i⁺(v_n), which guarantees that a nonzero vector d_{n,i} can be chosen, so the choice of d_{n,i} is well defined.
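    To make Algorithm 1 concrete, here is a minimal sketch (our own illustration, not the authors' code) for the special case f(x) = 0.5‖x‖², h ≡ 0, g_i(x) = ‖x − c_i‖ − 1 and X0 a box, the setting later used in Section 5, with exact subgradients (ε_{n,i} = 0). With h ≡ 0, the v-update reduces to a projected gradient step. The data c, the box [0, 1.5]², the step-size exponent and the iteration count below are all hypothetical choices for the demo.

```python
import numpy as np

def algorithm1(c, lo, hi, n_iter=5000, a=0.6):
    """Sketch of Algorithm 1 for f(x) = 0.5||x||^2, h = 0,
    g_i(x) = ||x - c_i|| - 1, X0 = [lo, hi]^k, and eps_{n,i} = 0."""
    m, k = c.shape
    x = np.full(k, lo + 0.1)                     # initial point in X0
    for n in range(1, n_iter + 1):
        alpha = 1.0 / n ** a                     # step size alpha_{n-1} = 1/n^a
        # With h = 0 the v-update is a projected gradient step; grad f(x) = x.
        v = np.clip(x - alpha * x, lo, hi)
        z = np.empty_like(c)
        for i in range(m):
            gplus = max(np.linalg.norm(v - c[i]) - 1.0, 0.0)
            if gplus > 0.0:
                d = (v - c[i]) / np.linalg.norm(v - c[i])  # subgradient of g_i^+ at v
            else:
                d = np.ones(k)                   # arbitrary nonzero d_0
            z[i] = v - gplus / max(np.linalg.norm(d), 1.0) ** 2 * d
        x = np.clip(z.mean(axis=0), lo, hi)      # x_n = P_X0(z-bar_n)
    return x

c = np.array([[1.0, 1.0], [1.2, 1.0]])           # two unit-ball constraints
# Analytic minimum-norm point of this particular instance: the closest point
# of the ball around c[1] to the origin, which is feasible for the other ball.
x_star = c[1] * (1.0 - 1.0 / np.linalg.norm(c[1]))
x_hat = algorithm1(c, 0.0, 1.5)
```

    Since the feasibility step here uses unit-norm subgradients, z_{n,i} is exactly the projection of v_n onto the i-th ball whenever v_n lies outside it, so the iteration combines averaged projections with a vanishing gradient pull.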

    The following assumption will play an important role throughout the convergence results of this work:

    Assumption 3.2. The constrained set X0 is bounded.

    As a consequence of Assumption 3.2, we state here the boundedness properties of some related sequences and subdifferential sets as the following proposition.

    Proposition 3.3. The following statements hold true:

    (i) There exists a positive constant M such that, for all i = 1, …, m, we have

    g_i⁺(x) ≤ M,

    for all x ∈ X0.

    (ii) There exists a positive constant B such that

    max{‖∇f(x)‖, ‖s_h(x)‖, ‖s_φ(x)‖} ≤ B,

    for all x ∈ X0.

    (iii) There exists a positive constant D such that, for all n ≥ 1 and all i = 1, …, m, we have

    0 < ‖d_{n,i}‖ ≤ D.

    (iv) The sequences {x_n}_{n=0}^∞, {v_n}_{n=1}^∞, {z̄_n}_{n=1}^∞, {d_{n,i}}_{n=1}^∞ and {z_{n,i}}_{n=1}^∞, i = 1, …, m, are bounded.

    Proof. (ⅰ) For each i = 1, 2, …, m, since the function g_i⁺ is continuous and the set X0 is compact, the image of X0 under g_i⁺ is bounded. Hence, such an M > 0 exists.

    (ⅱ) Since the ε-subdifferential is bounded on the bounded set X0 for each ε ≥ 0, the vectors ∇f(x), s_h(x), and s_φ(x) are bounded for all x ∈ X0, which implies (ⅱ).

    (ⅲ) It follows from the same reasoning as in (ⅱ) and the definitions of v_n and d_{n,i} that d_{n,i} is bounded for all n ≥ 1 and all i = 1, 2, …, m.

    (ⅳ) The boundedness of X0 implies that the sequences {x_n}_{n=0}^∞ and {v_n}_{n=1}^∞ are bounded, while the boundedness of d_{n,i} and v_n together with the definitions of z̄_n and z_{n,i} implies that {z̄_n}_{n=1}^∞ and {z_{n,i}}_{n=1}^∞, i = 1, 2, …, m, are bounded.

    The following conditions on parameters are needed to guarantee the convergence result of Algorithm 1.

    Assumption 3.4. The sequences {α_n}_{n=0}^∞ and {ε_{n,i}}_{n=1}^∞, i = 1, …, m, satisfy the following properties:

    (i) Σ_{n=0}^∞ α_n = ∞ and Σ_{n=0}^∞ α_n² < ∞.

    (ii) Σ_{n=1}^∞ ε_{n,i} < ∞ for all i = 1, 2, …, m.

    Remark 3.5. As particular examples of sequences {α_n}_{n=0}^∞ and {ε_{n,i}}_{n=1}^∞, i = 1, 2, …, m, satisfying Assumption 3.4, one may choose α_n := α/(n+1)^a and ε_{n,i} := ε/(n+1)^b, where a, b, α, ε > 0 with a ∈ (1/2, 1] and b > 1.
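    The choices in Remark 3.5 can be checked numerically on partial sums. In the sketch below (our own illustration) we take the hypothetical values a = 0.75, b = 2 and α = ε = 1: the partial sums of α_n grow without bound while those of α_n² and ε_{n,i} stay bounded.

```python
import numpy as np

a, b = 0.75, 2.0
n = np.arange(0, 100000)
alpha_n = 1.0 / (n + 1) ** a      # sum diverges, sum of squares converges
eps_n = 1.0 / (n + 1) ** b        # summable error tolerances

sum_alpha = alpha_n.sum()         # grows roughly like N^{1-a}/(1-a)
sum_alpha_sq = (alpha_n ** 2).sum()   # bounded by 1/(2a-1) + 1
sum_eps = eps_n.sum()             # bounded by pi^2/6
```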

    The following assumption forms a key tool in proving the convergence result.

    Assumption 3.6. There exists a real number c > 0 such that

    dist²(x, X) ≤ (c/m) Σ_{i=1}^{m} (g_i⁺(x))²  for all x ∈ X0.

    Remark 3.7. Assumption 3.6 can be seen as a deterministic version of the assumptions proposed in [25, Assumption 2] and [10, Assumption 2]. The condition given in Assumption 3.6 is also related to the notion of linear regularity of a finite collection of sets; see [25, pages 231–232] for further details. A simple example of a constraint set X that satisfies Assumption 3.6 is, for instance, X := X0 ∩ Y1 ∩ Y2, where X0 := [0, 1] × [0, 1], Y1 := {x := (x1, x2) ∈ ℝ² : g1(x) := x1 − x2 ≤ 0} and Y2 := {x := (x1, x2) ∈ ℝ² : g2(x) := −2x1 + x2 ≤ 0}. It can be seen that, for all c ≥ 1, we have dist²(x, X) ≤ (c/2)((g1⁺(x))² + (g2⁺(x))²) for all x ∈ X0.
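    A bound of this type can be verified numerically when the projection onto X is available in closed form. The sketch below is our own one-constraint toy example (not the example from Remark 3.7): take X0 = [0, 1]², m = 1 and g1(x) = x1 − x2, so X = X0 ∩ {x : x1 ≤ x2}. For x ∈ X0 with x1 > x2, the projection onto X replaces both coordinates by their average, hence dist²(x, X) = (g1⁺(x))²/2, and Assumption 3.6 holds with c = 1.

```python
import numpy as np

rng = np.random.default_rng(1)
xs = rng.uniform(0.0, 1.0, size=(1000, 2))   # random samples from X0 = [0,1]^2

# Closed-form projection onto X = X0 ∩ {x1 <= x2}: average the coordinates
# whenever the constraint x1 <= x2 is violated.
proj = xs.copy()
mask = xs[:, 0] > xs[:, 1]
avg = 0.5 * (xs[:, 0] + xs[:, 1])
proj[mask, 0] = avg[mask]
proj[mask, 1] = avg[mask]

gplus = np.maximum(xs[:, 0] - xs[:, 1], 0.0)
dist_sq = np.sum((xs - proj) ** 2, axis=1)
holds = bool(np.all(dist_sq <= 1.0 * gplus ** 2 + 1e-12))   # c = 1, m = 1
```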

    In this section, we will consider the convergence properties of the generated sequences. We divide this section into three parts. Namely, we start with the first subsection by providing some useful technical relations for the generated sequences. We subsequently prove the convergence of the generated sequences to an optimal solution in the second subsection. We close this section by deriving the rate of convergence of the function values of iterate to the optimal value of the considered problem.

    The following lemma provides an essential relation between the iterates v_{n+1} and x_n, which is used to derive subsequent relations for the generated iterates.

    Lemma 4.1. Let {x_n}_{n=0}^∞ and {v_n}_{n=1}^∞ be the sequences generated by Algorithm 1. Then, for every n ≥ 0, η > 0 and x ∈ X0, we have

    ‖v_{n+1} − x‖² ≤ ‖x_n − x‖² − (1/(η+1))‖x_n − v_{n+1}‖² − 2α_n(φ(x_n) − φ(x)) + (4(1+η)/η) α_n² B².

    Proof. Let n ≥ 0, η > 0, and x ∈ X0 be given. We first note that

    ‖v_{n+1} − x‖² = ‖x_n − x‖² − ‖x_n − v_{n+1}‖² + 2⟨x_n − v_{n+1}, x − v_{n+1}⟩. (4.1)

    Now, it follows from the definition of v_{n+1} and the optimality condition for constrained optimization that

    0 ∈ ∂( h(·) + (1/(2α_n))‖· − x_n‖² + ⟨∇f(x_n), · − x_n⟩ )(v_{n+1}) + N_{X0}(v_{n+1}),

    which is the same as

    (1/α_n)(x_n − v_{n+1}) − ∇f(x_n) ∈ ∂h(v_{n+1}) + N_{X0}(v_{n+1}) = ∂h(v_{n+1}) + ∂δ_{X0}(v_{n+1}) = ∂(h + δ_{X0})(v_{n+1}).

    This, along with the facts that x ∈ X0 and v_{n+1} ∈ X0, yields

    ⟨(1/α_n)(x_n − v_{n+1}) − ∇f(x_n), x − v_{n+1}⟩ ≤ (h + δ_{X0})(x) − (h + δ_{X0})(v_{n+1}) = h(x) + δ_{X0}(x) − h(v_{n+1}) − δ_{X0}(v_{n+1}) = h(x) − h(v_{n+1}),

    or, equivalently,

    ⟨x_n − v_{n+1}, x − v_{n+1}⟩ ≤ α_n(h(x) − h(v_{n+1})) + α_n⟨∇f(x_n), x − v_{n+1}⟩. (4.2)

    Thus, we employ the relation (4.2) in (4.1) and obtain the following:

    ‖v_{n+1} − x‖² ≤ ‖x_n − x‖² − ‖x_n − v_{n+1}‖² + 2α_n(h(x) − h(v_{n+1})) + 2α_n⟨∇f(x_n), x − v_{n+1}⟩ = ‖x_n − x‖² − ‖x_n − v_{n+1}‖² + 2α_n(h(x) − h(x_n)) + 2α_n(h(x_n) − h(v_{n+1})) + 2α_n⟨∇f(x_n), x − x_n⟩ + 2α_n⟨∇f(x_n), x_n − v_{n+1}⟩. (4.3)

    We note from the first-order characterization of convexity that

    2α_n⟨∇f(x_n), x − x_n⟩ ≤ 2α_n(f(x) − f(x_n)). (4.4)

    Now, for the upper bound of the term 2α_n⟨∇f(x_n), x_n − v_{n+1}⟩, we note from the well-known Young's inequality that

    2α_n⟨∇f(x_n), x_n − v_{n+1}⟩ ≤ (2(1+η)/η) α_n² ‖∇f(x_n)‖² + (η/(2(1+η))) ‖x_n − v_{n+1}‖² ≤ (2(1+η)/η) α_n² B² + (η/(2(1+η))) ‖x_n − v_{n+1}‖². (4.5)

    Moreover, for a given subgradient s_h(x_n) ∈ ∂h(x_n), we note that

    2α_n(h(x_n) − h(v_{n+1})) ≤ 2α_n⟨s_h(x_n), x_n − v_{n+1}⟩ ≤ (2(1+η)/η) α_n² ‖s_h(x_n)‖² + (η/(2(1+η))) ‖x_n − v_{n+1}‖² ≤ (2(1+η)/η) α_n² B² + (η/(2(1+η))) ‖x_n − v_{n+1}‖². (4.6)

    By using the obtained relations (4.4)–(4.6) in the inequality (4.3), we derive that

    ‖v_{n+1} − x‖² ≤ ‖x_n − x‖² − (1/(η+1))‖x_n − v_{n+1}‖² − 2α_n(φ(x_n) − φ(x)) + (4(1+η)/η) α_n² B²,

    which is nothing else than the required inequality.

    Lemma 4.2. Let {x_n}_{n=0}^∞ and {v_n}_{n=1}^∞ be the sequences generated by Algorithm 1. Then, for every n ≥ 0, η > 0 and x* ∈ X*, we have

    ‖v_{n+1} − x*‖² + α_n(φ(P_X(x_n)) − φ(x*)) ≤ ‖x_n − x*‖² − (1/(η+1))‖x_n − v_{n+1}‖² + 2α_n B ‖P_X(x_n) − x_n‖ + (4(1+η)/η) α_n² B².

    Proof. Let n ≥ 0, η > 0, and x* ∈ X* be given. For a given s_φ(x*) ∈ ∂φ(x*), we note from the definition of the subgradient that

    φ(x_n) − φ* ≥ ⟨s_φ(x*), x_n − x*⟩ = ⟨s_φ(x*), P_X(x_n) − x*⟩ + ⟨s_φ(x*), x_n − P_X(x_n)⟩ ≥ −B ‖P_X(x_n) − x_n‖, (4.7)

    where the second inequality holds true by the necessary and sufficient optimality conditions for convex constrained optimization. Similarly, for a given s_φ(P_X(x_n)) ∈ ∂φ(P_X(x_n)), we have

    φ(x_n) − φ* = φ(x_n) − φ(P_X(x_n)) + φ(P_X(x_n)) − φ(x*) ≥ ⟨s_φ(P_X(x_n)), x_n − P_X(x_n)⟩ + φ(P_X(x_n)) − φ(x*) ≥ −B ‖P_X(x_n) − x_n‖ + φ(P_X(x_n)) − φ(x*). (4.8)

    By adding the two obtained relations (4.7) and (4.8) and subsequently multiplying by 1/2, we obtain

    φ(x_n) − φ* ≥ (1/2)(φ(P_X(x_n)) − φ(x*)) − B ‖P_X(x_n) − x_n‖.

    Applying this together with the inequality in Lemma 4.1 (with x = x*), we obtain

    ‖v_{n+1} − x*‖² + α_n(φ(P_X(x_n)) − φ*) ≤ ‖x_n − x*‖² − (1/(η+1))‖x_n − v_{n+1}‖² + 2α_n B ‖P_X(x_n) − x_n‖ + (4(1+η)/η) α_n² B²,

    as desired.

    We now derive a relation between the iterates v_n and z_{n,i}.

    Lemma 4.3. Let {v_n}_{n=1}^∞ and {z_{n,i}}_{n=1}^∞, i = 1, …, m, be the sequences generated by Algorithm 1. Then, for every n ≥ 1, i = 1, 2, …, m and x ∈ X, we have

    ‖z_{n,i} − x‖² ≤ ‖v_n − x‖² − (g_i⁺(v_n))²/(max{‖d_{n,i}‖, 1})² + 2 g_i⁺(v_n) ε_{n,i}.

    Proof. Let n ≥ 1, i = 1, 2, …, m and x ∈ X be given. We note from the definition of z_{n,i} that

    ‖z_{n,i} − x‖² = ‖v_n − (g_i⁺(v_n)/(max{‖d_{n,i}‖, 1})²) d_{n,i} − x‖² = ‖v_n − x‖² − (2 g_i⁺(v_n)/(max{‖d_{n,i}‖, 1})²) ⟨v_n − x, d_{n,i}⟩ + ((g_i⁺(v_n))²/(max{‖d_{n,i}‖, 1})⁴) ‖d_{n,i}‖² = ‖v_n − x‖² + (2 g_i⁺(v_n)/(max{‖d_{n,i}‖, 1})²) ⟨x − v_n, d_{n,i}⟩ + ((g_i⁺(v_n))²/(max{‖d_{n,i}‖, 1})⁴) ‖d_{n,i}‖². (4.9)

    If g_i⁺(v_n) = 0, then clearly

    ‖z_{n,i} − x‖² = ‖v_n − x‖².

    We now consider the case g_i⁺(v_n) > 0. Since d_{n,i} ∈ ∂_{ε_{n,i}} g_i⁺(v_n) \ {0} and g_i⁺(x) = 0, we note that

    0 = g_i⁺(x) ≥ g_i⁺(v_n) + ⟨x − v_n, d_{n,i}⟩ − ε_{n,i},

    which gives

    ⟨x − v_n, d_{n,i}⟩ ≤ −g_i⁺(v_n) + ε_{n,i}.

    We also note that

    (‖d_{n,i}‖ / max{‖d_{n,i}‖, 1})² ≤ 1,

    and

    1/(max{‖d_{n,i}‖, 1})² ≤ 1.

    Applying these obtained relations in (4.9), we get

    ‖z_{n,i} − x‖² ≤ ‖v_n − x‖² − (g_i⁺(v_n))²/(max{‖d_{n,i}‖, 1})² + 2 g_i⁺(v_n) ε_{n,i},

    as desired.

    In order to derive the relation between dist²(x_n, X) and dist²(v_n, X), we need the following fact:

    Proposition 4.4. Let z_1, z_2, …, z_m ∈ ℝᵏ and z̄ = (1/m) Σ_{i=1}^{m} z_i be given. Then, for every x ∈ ℝᵏ,

    ‖z̄ − x‖² = (1/m) Σ_{i=1}^{m} ‖z_i − x‖² − (1/m) Σ_{i=1}^{m} ‖z_i − z̄‖².

    Proof. It is straightforward from the properties of the inner product and norm.
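    This variance-type identity is easy to check numerically; the sketch below (our own illustration) evaluates both sides on random data.

```python
import numpy as np

# Numerical check of the identity in Proposition 4.4 on random vectors.
rng = np.random.default_rng(2)
m, k = 7, 3
z = rng.standard_normal((m, k))
x = rng.standard_normal(k)
z_bar = z.mean(axis=0)

lhs = np.linalg.norm(z_bar - x) ** 2
rhs = (np.mean(np.linalg.norm(z - x, axis=1) ** 2)
       - np.mean(np.linalg.norm(z - z_bar, axis=1) ** 2))
```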

    We now consider the relation between dist²(x_n, X) and dist²(v_n, X) in the following lemma.

    Lemma 4.5. Let {x_n}_{n=0}^∞ and {v_n}_{n=1}^∞ be the sequences generated by Algorithm 1. Then, for every n ≥ 1, we have

    dist²(x_n, X) ≤ dist²(v_n, X) − (1/(m D̄²)) Σ_{i=1}^{m} (g_i⁺(v_n))² + (2M/m) Σ_{i=1}^{m} ε_{n,i},

    where D̄ = max{D, 1}.

    Proof. Let n ≥ 1 be given. Since x_n = P_{X0}(z̄_n), we note, for all x ∈ X ⊆ X0, that

    ‖x_n − x‖² ≤ ‖z̄_n − x‖² − ‖x_n − z̄_n‖². (4.10)

    Since z̄_n = (1/m) Σ_{i=1}^{m} z_{n,i}, we note from Proposition 4.4 that, for all x ∈ X,

    ‖z̄_n − x‖² = (1/m) Σ_{i=1}^{m} ‖z_{n,i} − x‖² − (1/m) Σ_{i=1}^{m} ‖z_{n,i} − z̄_n‖²,

    so that the inequality (4.10) becomes

    ‖x_n − x‖² ≤ (1/m) Σ_{i=1}^{m} ‖z_{n,i} − x‖² − (1/m) Σ_{i=1}^{m} ‖z_{n,i} − z̄_n‖² − ‖x_n − z̄_n‖², (4.11)

    for all x ∈ X. By summing the inequality obtained in Lemma 4.3 over i = 1 to m and subsequently dividing both sides by m, we have

    (1/m) Σ_{i=1}^{m} ‖z_{n,i} − x‖² ≤ ‖v_n − x‖² − (1/m) Σ_{i=1}^{m} (g_i⁺(v_n))²/(max{‖d_{n,i}‖, 1})² + (2/m) Σ_{i=1}^{m} g_i⁺(v_n) ε_{n,i}.

    This, together with inequality (4.11), means that, for all x ∈ X,

    ‖x_n − x‖² ≤ ‖v_n − x‖² − (1/m) Σ_{i=1}^{m} (g_i⁺(v_n))²/(max{‖d_{n,i}‖, 1})² + (2/m) Σ_{i=1}^{m} g_i⁺(v_n) ε_{n,i} − (1/m) Σ_{i=1}^{m} ‖z_{n,i} − z̄_n‖² − ‖x_n − z̄_n‖².

    Invoking the bounds given in Proposition 3.3, we have, for every x ∈ X,

    ‖x_n − x‖² ≤ ‖v_n − x‖² − (1/(m D̄²)) Σ_{i=1}^{m} (g_i⁺(v_n))² + (2M/m) Σ_{i=1}^{m} ε_{n,i}. (4.12)

    Putting x = P_X(v_n) ∈ X in the inequality (4.12), we obtain

    ‖x_n − P_X(v_n)‖² ≤ ‖v_n − P_X(v_n)‖² − (1/(m D̄²)) Σ_{i=1}^{m} (g_i⁺(v_n))² + (2M/m) Σ_{i=1}^{m} ε_{n,i},

    and hence

    dist²(x_n, X) ≤ dist²(v_n, X) − (1/(m D̄²)) Σ_{i=1}^{m} (g_i⁺(v_n))² + (2M/m) Σ_{i=1}^{m} ε_{n,i}.

    As a consequence of the preceding lemmas, we obtain the following relation, which is used to prove the convergence of the sequence {‖v_n − x*‖²}_{n=1}^∞ for all x* ∈ X*.

    Lemma 4.6. Let {x_n}_{n=0}^∞ and {v_n}_{n=1}^∞ be the sequences generated by Algorithm 1. Then, for every n ≥ 1, η > 0 and x* ∈ X*, we have

    ‖v_{n+1} − x*‖² + α_n(φ(P_X(x_n)) − φ*) ≤ ‖v_n − x*‖² − (1/(η+1))‖x_n − v_{n+1}‖² − (1/(m D̄²)) Σ_{i=1}^{m} (g_i⁺(v_n))² + η dist²(x_n, X) + (2M/m) Σ_{i=1}^{m} ε_{n,i} + ((5+4η)/η) α_n² B²,

    where D̄ = max{D, 1}.

    Proof. Let n ≥ 1, η > 0, and x* ∈ X* be given. By using the inequality (4.12) with x = x*, which also belongs to X, we note that

    ‖x_n − x*‖² ≤ ‖v_n − x*‖² − (1/(m D̄²)) Σ_{i=1}^{m} (g_i⁺(v_n))² + (2M/m) Σ_{i=1}^{m} ε_{n,i}.

    Furthermore, by applying Young's inequality, we note that

    2α_n B ‖P_X(x_n) − x_n‖ ≤ (1/η) α_n² B² + η ‖P_X(x_n) − x_n‖².

    Invoking these two relations in the inequality obtained in Lemma 4.2, we get

    ‖v_{n+1} − x*‖² + α_n(φ(P_X(x_n)) − φ*) ≤ ‖v_n − x*‖² − (1/(η+1))‖x_n − v_{n+1}‖² − (1/(m D̄²)) Σ_{i=1}^{m} (g_i⁺(v_n))² + (2M/m) Σ_{i=1}^{m} ε_{n,i} + (1/η) α_n² B² + η ‖P_X(x_n) − x_n‖² + (4(1+η)/η) α_n² B²,

    which is the required inequality, since ‖P_X(x_n) − x_n‖² = dist²(x_n, X) and 1/η + 4(1+η)/η = (5+4η)/η.

    In order to obtain the existence of the limit of the sequence {‖v_n − x*‖}_{n=1}^∞, we need the following proposition:

    Proposition 4.7. [26] Let {a_n}_{n=1}^∞, {b_n}_{n=1}^∞ and {c_n}_{n=1}^∞ be sequences of nonnegative real numbers. If a_{n+1} ≤ a_n + b_n − c_n for all n ≥ 1 and Σ_{n=1}^∞ b_n < ∞, then lim_{n→∞} a_n exists and Σ_{n=1}^∞ c_n < ∞.

    Now, we are in a position to prove the main convergence theorem. The theorem guarantees that both generated sequences {v_n}_{n=1}^∞ and {x_n}_{n=0}^∞ converge to a point in the solution set X*.

    Theorem 4.8. The sequences {v_n}_{n=1}^∞ and {x_n}_{n=0}^∞ generated by Algorithm 1 converge to an optimal solution in X*.

    Proof. Let n ≥ 1 be given. Since v_n ∈ X0, it follows from Assumption 3.6 that there exists a constant c > 0 such that

    dist²(v_n, X) ≤ (c/m) Σ_{i=1}^{m} (g_i⁺(v_n))². (4.13)

    Now, choosing c̄ > 0 such that

    c̄ > max{c, 1/D̄²}, (4.14)

    we have

    0 < q := 1/(c̄ D̄²) < 1. (4.15)

    This, together with (4.13) and Lemma 4.5, implies that

    (1/c̄) dist²(x_n, X) ≤ (1/c̄) dist²(v_n, X) − (1/(c̄ m D̄²)) Σ_{i=1}^{m} (g_i⁺(v_n))² + (2M/(c̄ m)) Σ_{i=1}^{m} ε_{n,i} ≤ (1 − q)(1/m) Σ_{i=1}^{m} (g_i⁺(v_n))² + (2M/(c̄ m)) Σ_{i=1}^{m} ε_{n,i},

    and then

    (1/m) Σ_{i=1}^{m} (g_i⁺(v_n))² ≥ (1/(c̄ (1−q))) dist²(x_n, X) − (2M/(c̄ m (1−q))) Σ_{i=1}^{m} ε_{n,i}.

    By applying this obtained relation together with the inequality in Lemma 4.6, we have, for all η > 0 and x* ∈ X*,

    ‖v_{n+1} − x*‖² + α_n(φ(P_X(x_n)) − φ(x*)) ≤ ‖v_n − x*‖² − (1/(η+1))‖x_n − v_{n+1}‖² + (2M/m) Σ_{i=1}^{m} ε_{n,i} − (q/(1−q)) dist²(x_n, X) + (2Mq/(m(1−q))) Σ_{i=1}^{m} ε_{n,i} + η dist²(x_n, X) + ((5+4η)/η) α_n² B² = ‖v_n − x*‖² − (1/(η+1))‖x_n − v_{n+1}‖² − (q/(1−q) − η) dist²(x_n, X) + ((5+4η)/η) α_n² B² + (2M/(m(1−q))) Σ_{i=1}^{m} ε_{n,i},

    where we used q = 1/(c̄ D̄²). Now, by putting η := q/(1−q) > 0 in the above inequality, the term (q/(1−q) − η) dist²(x_n, X) vanishes, so the above inequality can be written as

    ‖v_{n+1} − x*‖² + α_n(φ(P_X(x_n)) − φ(x*)) ≤ ‖v_n − x*‖² − (1/(η+1))‖x_n − v_{n+1}‖² + ((5+4η)/η) α_n² B² + (2M/(m(1−q))) Σ_{i=1}^{m} ε_{n,i}, (4.16)

    which implies that

    ‖v_{n+1} − x*‖² ≤ ‖v_n − x*‖² − (1/(η+1))‖x_n − v_{n+1}‖² + ((5+4η)/η) α_n² B² + (2M/(m(1−q))) Σ_{i=1}^{m} ε_{n,i}. (4.17)

    Invoking Assumption 3.4 together with Proposition 4.7 in (4.17), we conclude that the limit

    lim_{n→∞} ‖v_n − x*‖ exists,

    as well as

    lim_{n→∞} ‖x_n − v_{n+1}‖ = 0. (4.18)

    On the other hand, by applying the inequality (4.12) in the relation obtained in Lemma 4.1 with x = x*, we note that

    2α_n(φ(x_n) − φ*) ≤ ‖v_n − x*‖² − ‖v_{n+1} − x*‖² − (1/(m D̄²)) Σ_{i=1}^{m} (g_i⁺(v_n))² + (4(1+η)/η) α_n² B² + (2M/m) Σ_{i=1}^{m} ε_{n,i}, (4.19)

    and then

    2α_n(φ(x_n) − φ*) ≤ ‖v_n − x*‖² − ‖v_{n+1} − x*‖² + (4(1+η)/η) α_n² B² + (2M/m) Σ_{i=1}^{m} ε_{n,i}.

    Now, let us fix a positive integer N ≥ 1. Summing the above relation over n = 1 to N, we have

    2 Σ_{n=1}^{N} α_n(φ(x_n) − φ*) ≤ ‖v_1 − x*‖² + (4(1+η)/η) B² Σ_{n=1}^{N} α_n² + (2M/m) Σ_{i=1}^{m} Σ_{n=1}^{N} ε_{n,i}. (4.20)

    Letting N → ∞, we obtain that

    Σ_{n=1}^{∞} α_n(φ(x_n) − φ*) < ∞.

    Next, we show that lim inf_{n→∞}(φ(x_n) − φ*) ≤ 0. Suppose, to the contrary, that there exist N ∈ ℕ and β > 0 such that φ(x_n) − φ* ≥ β for all n ≥ N. Since

    ∞ = β Σ_{n=N}^{∞} α_n ≤ Σ_{n=N}^{∞} α_n(φ(x_n) − φ*) < ∞,

    we arrive at a contradiction. Hence, lim inf_{n→∞}(φ(x_n) − φ*) ≤ 0, that is, lim inf_{n→∞} φ(x_n) ≤ φ*.

    Now, since the sequence {x_n}_{n=1}^∞ is bounded, there exists a subsequence {x_{n_l}}_{l=1}^∞ of {x_n}_{n=1}^∞ such that lim_{l→∞} φ(x_{n_l}) = lim inf_{n→∞} φ(x_n) ≤ φ*. Moreover, since {x_{n_l}}_{l=1}^∞ is also a bounded sequence, there exist a subsequence {x_{n_{l_j}}}_{j=1}^∞ of {x_{n_l}}_{l=1}^∞ and a point x̂ ∈ ℝᵏ such that x_{n_{l_j}} → x̂. Thus, by using (4.18), we have v_{n_{l_j}} → x̂.

    On the other hand, by utilizing the relation (4.19), we note that, for all j ≥ 1,

    (1/(m D̄²)) Σ_{i=1}^{m} (g_i⁺(v_{n_{l_j}}))² ≤ ‖v_{n_{l_j}} − x*‖² − ‖v_{n_{l_j}+1} − x*‖² + (4(1+η)/η) α²_{n_{l_j}} B² + (2M/m) Σ_{i=1}^{m} ε_{n_{l_j},i} + 2α_{n_{l_j}} |φ(x_{n_{l_j}}) − φ*|.

    Now, by using the boundedness of the sequence {|φ(x_n) − φ*|}_{n=1}^∞, the continuity of each g_i⁺, i = 1, 2, …, m, and letting j → ∞, we obtain

    g_i⁺(x̂) = 0  for all i = 1, 2, …, m,

    which implies that x̂ ∈ ⋂_{i=1}^{m} Lev(g_i, 0). Since the sequence {x_{n_{l_j}}}_{j=1}^∞ ⊆ X0, the closedness of X0 yields x̂ ∈ X0, and hence x̂ ∈ X := X0 ∩ ⋂_{i=1}^{m} Lev(g_i, 0).

    Finally, by using the continuity of φ, we obtain that

    φ(x̂) = lim_{j→∞} φ(x_{n_{l_j}}) = lim_{l→∞} φ(x_{n_l}) ≤ φ*,

    which implies that x̂ ∈ X*. Therefore, we conclude that the sequence {v_n}_{n=1}^∞ converges to the point x̂ ∈ X*. This yields that the sequence {x_n}_{n=1}^∞ also converges to the point x̂ ∈ X*. This completes the proof.

    In this subsection, we consider the rate of convergence of the objective values {φ(x_n)}_{n=0}^∞ to the optimal value φ*.

    Theorem 4.9. Let {x_n}_{n=0}^∞ and {v_n}_{n=1}^∞ be the sequences generated by Algorithm 1. Then, for every positive integer N ≥ 1, it holds that

    min_{1≤n≤N} φ(x_n) − φ* ≤ [dist²(v_1, X*) + 4 c̄ D̄² B² Σ_{n=1}^{N} α_n² + (2M/m) Σ_{i=1}^{m} Σ_{n=1}^{N} ε_{n,i}] / (2 Σ_{n=1}^{N} α_n),

    where c̄ > max{c, 1/D̄²} and D̄ = max{D, 1}.

    Proof. We note from the inequality (4.20) that, for every positive integer N ≥ 1,

    2 Σ_{n=1}^{N} α_n(φ(x_n) − φ*) ≤ ‖v_1 − x*‖² + (4(1+η)/η) B² Σ_{n=1}^{N} α_n² + (2M/m) Σ_{i=1}^{m} Σ_{n=1}^{N} ε_{n,i},

    which implies that

    min_{1≤n≤N} φ(x_n) − φ* ≤ [‖v_1 − x*‖² + (4(1+η)/η) B² Σ_{n=1}^{N} α_n² + (2M/m) Σ_{i=1}^{m} Σ_{n=1}^{N} ε_{n,i}] / (2 Σ_{n=1}^{N} α_n).

    By using the property of c̄ given in (4.14) that c̄ > max{c, 1/D̄²}, the definition of q in (4.15) that 0 < q := 1/(c̄ D̄²) < 1, and the definition of η as η := q/(1−q) > 0, we note that

    (1+η)/η = 1/q = c̄ D̄².

    This yields that

    min_{1≤n≤N} φ(x_n) − φ* ≤ [‖v_1 − x*‖² + 4 c̄ D̄² B² Σ_{n=1}^{N} α_n² + (2M/m) Σ_{i=1}^{m} Σ_{n=1}^{N} ε_{n,i}] / (2 Σ_{n=1}^{N} α_n),

    and hence, by choosing x* = P_{X*}(v_1),

    min_{1≤n≤N} φ(x_n) − φ* ≤ [dist²(v_1, X*) + 4 c̄ D̄² B² Σ_{n=1}^{N} α_n² + (2M/m) Σ_{i=1}^{m} Σ_{n=1}^{N} ε_{n,i}] / (2 Σ_{n=1}^{N} α_n),

    as required.

    To obtain the convergence rate of the objective function, we need the following proposition:

    Proposition 4.10. [22, Lemma 8.26] Let f : [a−1, b+1] → ℝ be a continuous, nonincreasing real-valued function over [a−1, b+1], where a and b are integers such that a ≤ b. Then

    ∫_a^{b+1} f(t) dt ≤ f(a) + f(a+1) + ⋯ + f(b) ≤ ∫_{a−1}^{b} f(t) dt.
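    The integral-test bounds above are straightforward to sanity-check. The sketch below (our own illustration) does so for the hypothetical choice f(t) = 1/(t+1)², which is continuous and nonincreasing, with a = 1 and b = 50.

```python
# Check of the two integral-test bounds in Proposition 4.10.
a_int, b_int = 1, 50
f = lambda t: 1.0 / (t + 1.0) ** 2
s = sum(f(n) for n in range(a_int, b_int + 1))   # f(a) + f(a+1) + ... + f(b)
F = lambda t: -1.0 / (t + 1.0)                   # antiderivative of f
lower = F(b_int + 1) - F(a_int)                  # integral of f from a to b+1
upper = F(b_int) - F(a_int - 1)                  # integral of f from a-1 to b
```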

    We close this subsection by considering a particular step-size sequence {α_n}_{n=0}^∞ in Theorem 4.9 to obtain the O(1/N^{1−a}) rate of convergence of the function values of the iterates to the optimal value of the considered problem, where a ∈ (0.5, 1).

    Corollary 4.11. Let {x_n}_{n=0}^∞ and {v_n}_{n=1}^∞ be the sequences generated by Algorithm 1. If the sequence {α_n}_{n=0}^∞ is given by

    α_n := 1/(n+1)^a,

    for all n ≥ 0, where a ∈ (0.5, 1), then for a positive integer N ≥ 1, it holds that

    min_{1≤n≤N} φ(x_n) − φ* ≤ O(1/N^{1−a}).

    Proof. Let us note from Proposition 4.10 that

    Σ_{n=1}^{N} 1/(n+1)^{2a} ≤ ∫_0^{N} 1/(t+1)^{2a} dt ≤ 1/(2a−1),

    and

    Σ_{n=1}^{N} 1/(n+1)^{a} ≥ ∫_1^{N+1} 1/(t+1)^{a} dt = ((N+2)^{1−a} − 2^{1−a})/(1−a),

    which implies that

    (Σ_{n=1}^{N} 1/(n+1)^{a})^{−1} ≤ (1−a)/((N+2)^{1−a} − 2^{1−a}) = [(1−a)/(((N+2)/(N+3))^{1−a} − (2/(N+3))^{1−a})] (N+3)^{a−1}.

    Furthermore, since ((N+2)/(N+3))^{1−a} ≥ (3/4)^{1−a} and (2/(N+3))^{1−a} ≤ (1/2)^{1−a} for all N ≥ 1, we have

    (Σ_{n=1}^{N} 1/(n+1)^{a})^{−1} ≤ [(1−a)/((3/4)^{1−a} − (1/2)^{1−a})] (N+3)^{a−1}.

    Hence, by putting M1 := (2M/m) Σ_{i=1}^{m} Σ_{n=1}^{∞} ε_{n,i} and applying the inequality derived in Theorem 4.9, we obtain that

    min_{1≤n≤N} φ(x_n) − φ* ≤ [dist²(v_1, X*) + M1 + 4 c̄ D̄² B² Σ_{n=1}^{N} α_n²] / (2 Σ_{n=1}^{N} α_n) ≤ [(1−a)/(2((3/4)^{1−a} − (1/2)^{1−a}))] [dist²(v_1, X*) + M1 + 4 c̄ D̄² B²/(2a−1)] (N+3)^{a−1} = O(1/N^{1−a}),

    and the proof is completed.
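    The two partial-sum bounds used in this proof can be verified numerically; the sketch below (our own illustration) checks them for the hypothetical exponent a = 0.75 and several values of N.

```python
import numpy as np

a = 0.75
ok_upper = True   # sum of 1/(n+1)^{2a} stays below 1/(2a-1)
ok_lower = True   # sum of 1/(n+1)^{a} dominates the integral lower bound
for N in (10, 100, 1000):
    n = np.arange(1, N + 1)
    s2 = np.sum(1.0 / (n + 1) ** (2 * a))
    s1 = np.sum(1.0 / (n + 1) ** a)
    ok_upper &= bool(s2 <= 1.0 / (2 * a - 1))
    ok_lower &= bool(s1 >= ((N + 2) ** (1 - a) - 2 ** (1 - a)) / (1 - a))
```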

    In this section, we consider the numerical behavior of the proposed method (Algorithm 1) for finding the minimum-norm solution over the intersection of a finite number of closed balls and a box constraint, in the following form. Let c_i ∈ ℝᵏ, i = 1, …, m, be given vectors, and a, b ∈ ℝ. The problem is to find a vector x ∈ ℝᵏ that solves the problem

    minimize 0.5‖x‖²,  subject to ‖x − c_i‖ ≤ 1, i = 1, 2, …, m,  x ∈ [a, b]^k.

This problem can be written in the equivalent form

    $$\text{minimize}\quad 0.5\|x\|^2,\qquad\text{subject to}\quad x\in[a,b]^k\cap\bigcap_{i=1}^{m}\operatorname{Lev}(\|x-c_i\|-1,0), \tag{5.1}$$

    which is clearly a particular instance of the problem (1.1) in the case when $f(x)=0.5\|x\|^2$, $h(x)=0$, $g_i(x)=\|x-c_i\|-1$ for all $i=1,\ldots,m$, and $X_0=[0,1.5]^k$.
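Since the full statement of Algorithm 1 appears earlier in the paper, the following is only a minimal single-machine sketch of a projected-subgradient scheme of the same flavor for problem (5.1), not the authors' exact method: the greedy most-violated-constraint step, the tolerance, and all parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

k, m = 5, 50                            # dimension and number of balls (arbitrary)
c = rng.uniform(1.0, 1.5, size=(m, k))  # ball centers drawn from (1, 1.5), as in Section 5
lo, hi = 0.0, 1.5                       # box constraint X0 = [0, 1.5]^k

x = np.full(k, 0.1)                     # initial point with all coordinates 0.1

for n in range(5000):
    alpha = 0.1 / (n + 1)               # diminishing stepsize alpha_n = 0.1/(n+1)
    # subgradient step on the objective f(x) = 0.5*||x||^2 (gradient is x)
    x = x - alpha * x
    # feasibility step: Polyak-type projection onto the most violated ball
    viol = np.linalg.norm(x - c, axis=1) - 1.0   # g_i(x) = ||x - c_i|| - 1
    j = int(np.argmax(viol))
    if viol[j] > 0:
        g = (x - c[j]) / np.linalg.norm(x - c[j])  # subgradient of g_j at x
        x = x - viol[j] * g             # lands exactly on the boundary of ball j
    x = np.clip(x, lo, hi)              # projection onto the box [lo, hi]^k

# The iterate should be (nearly) feasible for all ball constraints.
assert float(np.max(np.linalg.norm(x - c, axis=1) - 1.0)) <= 1e-2
```

Because every center lies in $(1,1.5)^k$, the point $(1.25,\ldots,1.25)$ is a common interior point of all the balls and the box, so the feasible set of this random instance is nonempty.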

In the first experiment, we examine the influence of the step size $\alpha_n:=\frac{\alpha}{n+1}$. We perform Algorithm 1 for solving the problem (5.1) with the number of target sets $m=1000$. We set $\epsilon_{n,i}=0$ for all $n\ge 0$ and $i=1,\ldots,m$. We choose each vector $c_i$ by drawing its coordinates randomly from the interval $(1,1.5)$, and the initial vector $x_0$ is the vector whose coordinates are all $0.1$. We consider the parameter $\alpha\in[0.1,0.9]$ in the step size $\alpha_n$ and the dimensions $k=2,5$, and $10$. We perform 10 independent random tests for each choice of $\alpha$ and $k$ and terminate Algorithm 1 when the relative error $\frac{\|v_{n+1}-v_n\|}{\|v_{n+1}\|}$ falls below the error tolerance of $10^{-5}$. The averaged numbers of iterations and computational runtimes in seconds for each choice of $\alpha$ and $k$ are plotted in Figure 1. As we can see from Figure 1, for each dimension $k$, the parameter $\alpha=0.1$ gives the smallest number of iterations and the shortest runtime.

Figure 1.  Comparison of the number of iterations and computational runtime for different choices of step sizes $\alpha_n:=\frac{\alpha}{n+1}$.

In the following experiment, we examine the influence of the error tolerances $\{\epsilon_{n,i}\}_{n=1}^{\infty}$ for all $i=1,2,\ldots,m$. We consider the parameter $\epsilon=0,0.1,0.3,0.5,0.7$, and $0.9$ in the error tolerance $\epsilon_{n,i}=\frac{\epsilon}{(n+1)^2}$ with various dimensions $k$ and numbers of target sets $m$. Note that for this test problem, Algorithm 1 with $\epsilon=0$ is a particular case of the method proposed by Nedić and Necoara [10], where the batch size is $m$. We set the step size $\alpha_n:=\frac{0.1}{n+1}$ and performed the above experiment. The averaged computational runtimes in seconds are given in Table 1.

One can see from the results presented in Table 1 that the averaged runtime increases as $k$ and $m$ increase. The proposed method with an error-tolerance parameter $\epsilon\neq 0$ requires less averaged runtime than the case $\epsilon=0$ for almost all numbers of target sets $m$. Even though we cannot single out a choice of error-tolerance parameter $\epsilon\neq 0$ that performs best for all $k$ and $m$, the results show that the averaged runtime can be improved by suitable choices of error tolerances. This underlines the benefit of the approximate subgradient-type method proposed in this work.

We concentrated on the convex minimization problem over the intersection of a finite number of convex level sets. Our approach centered on introducing a distributed approximate subgradient method tailored to this problem. To guarantee the convergence of the proposed method, we provided a rigorous proof that the sequences generated by the method converge to an optimal solution. Furthermore, we derived the $O\left(\frac{1}{N^{1-a}}\right)$ rate of convergence of the function values of the iterates to the optimal value of the considered problem, where $a\in(0.5,1)$. Additionally, we illustrated our findings through several numerical examples aimed at examining the impact of error tolerances. While identifying the optimal error tolerance remains a significant consideration, our experimental results indicate that the average runtime can be improved by selecting suitable nonzero error tolerances rather than omitting them altogether. This observation suggests an intriguing avenue for future research.

    Jedsadapong Pioon: Conceptualization, Methodology, Validation, Convergence analysis, Investigation, Writing-original draft preparation, Writing-review and editing; Narin Petrot: Conceptualization, Methodology, Software, Validation, Convergence analysis, Investigation, Writing-review and editing, Visualization; Nimit Nimana: Conceptualization, Methodology, Software, Validation, Convergence analysis, Investigation, Writing-original draft preparation, Writing-review and editing, Visualization, Supervision, Project administration, Funding acquisition. All authors have read and approved the final version of the manuscript for publication.

    The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

    The authors would like to thank three anonymous referees for their useful suggestions and comments.

    This work has received funding support from the NSRF via the Program Management Unit for Human Resources & Institutional Development, Research and Innovation [grant number B05F650018].

    The authors declare that there is no conflict of interest regarding the publication of this article.



    [1] M. A. T. Figueiredo, R. D. Nowak, An EM algorithm for wavelet-based image restoration, IEEE T. Image Process., 12 (2003), 906–916. http://dx.doi.org/10.1109/TIP.2003.814255 doi: 10.1109/TIP.2003.814255
    [2] M. A. T. Figueiredo, J. M. Bioucas-Dias, R. D. Nowak, Majorization-minimization algorithms for wavelet-based image restoration, IEEE T. Image Process., 16 (2007), 2980–2991. http://dx.doi.org/10.1109/TIP.2007.909318 doi: 10.1109/TIP.2007.909318
    [3] A. Beck, M. Teboulle, Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems, IEEE T. Image Process., 18 (2009), 2419–2434. http://dx.doi.org/10.1109/TIP.2009.2028250 doi: 10.1109/TIP.2009.2028250
    [4] A. Beck, M. Teboulle, Gradient-based algorithms with applications to signal-recovery problems, Cambridge: Cambridge University Press, 2009, 42–88. https://doi.org/10.1017/CBO9780511804458.003
    [5] P. L. Combettes, V. R. Wajs, Signal recovery by proximal forward-backward splitting, Multiscale Model. Sim., 4 (2005), 1168–1200. https://doi.org/10.1137/050626090 doi: 10.1137/050626090
    [6] I. Necoara, General convergence analysis of stochastic first-order methods for composite optimization, J. Optim. Theory Appl., 189 (2021), 66–95. https://doi.org/10.1007/s10957-021-01821-2 doi: 10.1007/s10957-021-01821-2
    [7] Y. Nesterov, Gradient methods for minimizing composite functions, Math. Program., 140 (2013), 125–161. https://doi.org/10.1007/s10107-012-0629-5 doi: 10.1007/s10107-012-0629-5
    [8] V. N. Vapnik, Statistical learning theory, New York: John Wiley & Sons, 1998.
    [9] D. P. Bertsekas, Convex optimization algorithms, Belmont: Athena Scientific, 2015.
    [10] A. Nedić, I. Necoara, Random minibatch subgradient algorithms for convex problems with functional constraints, Appl. Math. Optim., 80 (2019), 801–833. https://doi.org/10.1007/s00245-019-09609-7 doi: 10.1007/s00245-019-09609-7
    [11] D. P. Bertsekas, Nonlinear programming, J. Oper. Res. Soc., 48 (1997), 334. https://doi.org/10.1057/palgrave.jors.2600425
    [12] A. Nedić, D. P. Bertsekas, The effect of deterministic noise in subgradient methods, Math. Program., 125 (2010), 75–99. https://doi.org/10.1007/s10107-008-0262-5 doi: 10.1007/s10107-008-0262-5
    [13] S. Bonettini, A. Benfenati, V. Ruggiero, Scaling techniques for ϵ-Subgradient methods, SIAM J. Optimiz., 26 (2016), 1741–1772. https://doi.org/10.1137/14097642X doi: 10.1137/14097642X
    [14] E. S. Helou, L. E. A. Simões, ϵ-subgradient algorithms for bilevel convex optimization, Inverse Probl., 33 (2017), 055020. https://doi.org/10.1088/1361-6420/aa6136 doi: 10.1088/1361-6420/aa6136
    [15] R. D. Millán, M. P. Machado, Inexact proximal ϵ-subgradient methods for composite convex optimization problems, J. Global Optim., 75 (2019), 1029–1060. https://doi.org/10.1007/s10898-019-00808-8 doi: 10.1007/s10898-019-00808-8
    [16] Y. I. Alber, A. N. Iusem, M. V. Solodov, On the projected subgradient method for nonsmooth convex optimization in a Hilbert space, Math. Program., 81 (1998), 23–35. https://doi.org/10.1007/BF01584842 doi: 10.1007/BF01584842
    [17] K. C. Kiwiel, Convergence of approximate and incremental subgradient methods for convex optimization, SIAM J. Optimiz., 14 (2004), 807–840. https://doi.org/10.1137/S1052623400376366 doi: 10.1137/S1052623400376366
    [18] R. S. Burachik, J. E. Martínez-Legaz, M. Rezaie, M. Théra, An additive subfamily of enlargements of a maximally monotone operator, Set-Valued Var. Anal., 23 (2015), 643–665. https://doi.org/10.1007/s11228-015-0340-9 doi: 10.1007/s11228-015-0340-9
    [19] X. L. Guo, C. J. Zhao, Z. W. Li, On generalized ϵ-subdifferential and radial epiderivative of set-valued mappings, Optim. Lett., 8 (2014), 1707–1720. https://doi.org/10.1007/s11590-013-0691-9 doi: 10.1007/s11590-013-0691-9
    [20] A. Simonetto, H. Jamali-Rad, Primal recovery from consensus-based dual decomposition for distributed convex optimization, J. Optim. Theory Appl., 168 (2016), 172–197. https://doi.org/10.1007/s10957-015-0758-0 doi: 10.1007/s10957-015-0758-0
    [21] M. V. Solodov, B. F. Svaiter, A hybrid approximate extragradient-proximal point algorithm using the enlargement of a maximal monotone operator, Set-Valued Var. Anal., 7 (1999), 323–345. https://doi.org/10.1023/A:1008777829180 doi: 10.1023/A:1008777829180
    [22] A. Beck, First-order methods in optimization, Philadelphia: SIAM, 2017. https://doi.org/10.1137/1.9781611974997
    [23] H. H. Bauschke, P. L. Combettes, Convex analysis and monotone operator theory in Hilbert spaces, New York: Springer, 2017. https://doi.org/10.1007/978-3-319-48311-5
    [24] C. Zalinescu, Convex analysis in general vector spaces, Singapore: World Scientific, 2002. https://doi.org/10.1142/5021
    [25] A. Nedić, Random algorithms for convex minimization problems, Math. Program., 129 (2011), 225–273. https://doi.org/10.1007/s10107-011-0468-9 doi: 10.1007/s10107-011-0468-9
    [26] B. T. Polyak, Introduction to optimization, New York: Optimization Software, 1987.
© 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)