Research article

Forecasting net charge-off rates of banks: What model works best?

  • The purpose of this paper is to focus on the losses of two very big banks, Citigroup (Citi) and Wells Fargo & Company (Wells Fargo), and two very small banks, First Busey Corporation (Busey) and Capital City Bank Group (Capital), over the period 1991–2016. The federal government actually bailed out the two big banks, as measured by total assets, whereas neither of the two small banks required a bail out. Clearly, if one is able to use a variety of predictor variables to forecast accurately the losses of banks of various sizes, in different geographical locations, and operating a variety of business models, this may help identify potential causes of future banking problems and thereby lessen, if not eliminate, the need for future bailouts. This is important for both the banks and the bank regulatory authorities. In particular, those banks expected to suffer significant losses on loans may be in a position to increase their provisioning and thus loan loss allowances. If such banks are unable to take this type of action or other corrective action to address expected losses, regulatory action may become necessary in response to this situation. The motivation for our paper is this very issue: can one obtain accurate forecasts of losses, or the net charge-off rates, of banks? We provide an answer to this question by examining the four banks mentioned using several hundred predictor variables and several different forecast techniques.

    Citation: James R. Barth, Sumin Han, Sunghoon Joo, Kang Bok Lee, Stevan Maglic, Xuan Shen. Forecasting net charge-off rates of banks: What model works best?[J]. Quantitative Finance and Economics, 2018, 2(3): 554-589. doi: 10.3934/QFE.2018.3.554

    Related Papers:

    [1] Marco Bramanti, Sergio Polidoro . Fundamental solutions for Kolmogorov-Fokker-Planck operators with time-depending measurable coefficients. Mathematics in Engineering, 2020, 2(4): 734-771. doi: 10.3934/mine.2020035
    [2] Tommaso Barbieri . On Kolmogorov Fokker Planck operators with linear drift and time dependent measurable coefficients. Mathematics in Engineering, 2024, 6(2): 238-260. doi: 10.3934/mine.2024011
    [3] Youchan Kim, Seungjin Ryu, Pilsoo Shin . Approximation of elliptic and parabolic equations with Dirichlet boundary conditions. Mathematics in Engineering, 2023, 5(4): 1-43. doi: 10.3934/mine.2023079
    [4] Gabriel B. Apolinário, Laurent Chevillard . Space-time statistics of a linear dynamical energy cascade model. Mathematics in Engineering, 2023, 5(2): 1-23. doi: 10.3934/mine.2023025
    [5] Marco Sansottera, Veronica Danesi . Kolmogorov variation: KAM with knobs (à la Kolmogorov). Mathematics in Engineering, 2023, 5(5): 1-19. doi: 10.3934/mine.2023089
    [6] Masashi Misawa, Kenta Nakamura, Yoshihiko Yamaura . A volume constraint problem for the nonlocal doubly nonlinear parabolic equation. Mathematics in Engineering, 2023, 5(6): 1-26. doi: 10.3934/mine.2023098
    [7] Zaffar Mehdi Dar, M. Arrutselvi, Chandru Muthusamy, Sundararajan Natarajan, Gianmarco Manzini . Virtual element approximations of the time-fractional nonlinear convection-diffusion equation on polygonal meshes. Mathematics in Engineering, 2025, 7(2): 96-129. doi: 10.3934/mine.2025005
    [8] Edgard A. Pimentel, Miguel Walker . Potential estimates for fully nonlinear elliptic equations with bounded ingredients. Mathematics in Engineering, 2023, 5(3): 1-16. doi: 10.3934/mine.2023063
    [9] Giovanni Cupini, Paolo Marcellini, Elvira Mascolo . Local boundedness of weak solutions to elliptic equations with p,qgrowth. Mathematics in Engineering, 2023, 5(3): 1-28. doi: 10.3934/mine.2023065
    [10] Rita Mastroianni, Christos Efthymiopoulos . Kolmogorov algorithm for isochronous Hamiltonian systems. Mathematics in Engineering, 2023, 5(2): 1-35. doi: 10.3934/mine.2023035


    Several important evolution equations arising in kinetic theory, mathematical physics and probability can be written in the form

    $$(\partial_t+X\cdot\nabla_Y)f=Q(f,\nabla_X f,X,Y,t), \tag{1.1}$$

    where $(X,Y,t):=(x_1,\dots,x_m,y_1,\dots,y_m,t)\in\mathbb{R}^m\times\mathbb{R}^m\times\mathbb{R}=\mathbb{R}^{N+1}$, $N=2m$, $m\geq 1$, and the coordinates $X=(x_1,\dots,x_m)$ and $Y=(y_1,\dots,y_m)$ are, respectively, the velocity and the position of the system. In its simplest form,

    $$Q(f,\nabla_X f,X,Y,t)=\nabla_X\cdot\nabla_X f=\Delta_X f,$$

    the equation in (1.1) was introduced and studied by Kolmogorov in a famous note published in 1934 in Annals of Mathematics, see [25]. In this case Kolmogorov noted that the equation in (1.1) is an example of a degenerate parabolic operator having strong regularity properties and he proved that the equation has a fundamental solution which is smooth off its diagonal. In fact, in this case the equation in (1.1) is hypoelliptic, see [24].

    In kinetic theory, $f$ represents the evolution of a particle distribution

    $$f(X,Y,t):U_X\times U_Y\times\mathbb{R}_+\to\mathbb{R},\qquad U_X,\ U_Y\subset\mathbb{R}^m,$$

    subject to geometric restrictions and models for the interactions and collisions between particles. In this case the left-hand side in (1.1) describes the evolution of $f$ under the action of transport, with the free streaming operator. The right-hand side describes elastic collisions through the nonlinear Boltzmann collision operator. The Boltzmann equation is an integro-(partial)-differential equation with a nonlocal operator in the kinetic variable $X$. The Boltzmann equation is a fundamental equation in kinetic theory in the sense that it has been derived rigorously, at least in some settings, from microscopic first principles. In the case of so-called Coulomb interactions the Boltzmann collision operator is ill-defined, and Landau proposed an alternative operator for these interactions; this operator is now called the Landau or the Landau-Coulomb operator. This operator can be stated as in (1.1) with

    $$Q(f,\nabla_X f,X,Y,t)=\nabla_X\cdot\big(A(f)\nabla_X f+B(f)f\big), \tag{1.2}$$

    where again $A(f)=A(f)(X,Y,t)$ and $B(f)=B(f)(X,Y,t)$ are nonlocal operators in the variable $X$. In this case the equation in (1.1) is a nonlinear, or rather quasilinear, drift-diffusion equation with coefficients given by convolution-like averages of the unknown. As mentioned above, the Landau equation is considered fundamental because of its close link to the Boltzmann equation for Coulomb interactions.

    In the case of long-range interactions, the Boltzmann and Landau-Coulomb operators show local ellipticity provided the solution enjoys some pointwise bounds on the associated hydrodynamic fields and the local entropy. Indeed, assuming certain bounds on local mass, energy, and entropy that are uniform in $(Y,t)\in U_Y\times I$, see [30,33], one can prove that

    $$0<\Lambda^{-1}I\leq A(f)(X,Y,t)\leq\Lambda I,\qquad |B(f)(X,Y,t)|\leq\Lambda,$$

    for some constant $\Lambda\geq 1$ and for $(X,Y,t)\in U_X\times U_Y\times I$, i.e., under these assumptions the operator $Q$ in (1.2) and in the Landau equation becomes locally uniformly elliptic. As a consequence, and as global well posedness for the Boltzmann equation and the construction of solutions in the large is an outstanding open problem, the study of conditional regularity for the Boltzmann and Landau equations has become a way to make progress on the regularity issues for these equations. We refer to [11,13,14,15,28,33,36,37] for more on the connections between Kolmogorov-Fokker-Planck equations, the Boltzmann and Landau equation, statistical physics and conditional regularity.

    Based on the idea of conditional regularity one is led to study the local regularity of weak solutions to the equation in (1.1) with

    $$Q(f,\nabla_X f,X,Y,t)=\nabla_X\cdot\big(A(X,Y,t)\nabla_X f\big)+B(X,Y,t)\cdot\nabla_X f, \tag{1.3}$$

    assuming that A is measurable, bounded and uniformly elliptic, and that B is bounded. In [20], see also [21,22,23] for subsequent developments, the authors extended, for equations as in (1.1) assuming (1.3), the De Giorgi-Nash-Moser (DGNM) theory, which in its original form only considers elliptic or parabolic equations in divergence form, to hypoelliptic equations with rough coefficients including the one in (1.1) assuming (1.3). [20] has spurred considerable activity in the field, see below for a literature review, as the results proved give the correct scale- and translation-invariant estimates for local Hölder continuity and the Harnack inequality for weak solutions.

    In this paper we consider equations as in (1.1) with

    $$Q(f,\nabla_X f,X,Y,t)=\nabla_X\cdot\big(A(\nabla_X f,X,Y,t)\big), \tag{1.4}$$

    subject to conditions on $A$ which allow $A$ to be a nonlinear function of $\nabla_X f$. In this case we refer to the equations in (1.1) as nonlinear Kolmogorov-Fokker-Planck type equations with rough coefficients. Our contribution is twofold. First, we establish higher integrability (Theorem 1.1) and local boundedness (Theorem 1.2) of weak sub-solutions, weak Harnack and Harnack inequalities (Theorem 1.3), and Hölder continuity with quantitative estimates (Theorem 1.4), for the equation

    $$(\partial_t+X\cdot\nabla_Y)u=\nabla_X\cdot\big(A(\nabla_X u,X,Y,t)\big). \tag{1.5}$$

    Second, we establish existence and uniqueness, in certain bounded $X$, $Y$ and $t$ dependent domains, for a Dirichlet problem involving the equation in (1.5), also allowing for boundary data and a right-hand side (Theorem 1.5). In the linear case, if $A(X,Y,t)$ is a uniformly elliptic positive definite matrix with bounded measurable coefficients, then $A(\xi,X,Y,t)=A(X,Y,t)\xi$ satisfies the hypothesis we impose on the symbol $A$, and in this case the equation in (1.5) reduces to the equation

    $$(\partial_t+X\cdot\nabla_Y)u=\nabla_X\cdot\big(A(X,Y,t)\nabla_X u\big). \tag{1.6}$$

    Concerning regularity, our results therefore generalize [20,22,23] to nonlinear Kolmogorov-Fokker-Planck type equations with rough coefficients.

    To the best of our knowledge, nonlinear equations of the form in (1.5) have so far not been investigated in the literature, and the purpose of this paper is to contribute to the regularity and existence theory for these equations. We believe that generalizations of the De Giorgi-Nash-Moser (DGNM) theory to nonlinear Kolmogorov-Fokker-Planck type equations with rough coefficients are relevant and interesting. We also believe that our treatment of the Dirichlet problem is new and enlightening.

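    We consider equations as in (1.5) subject to conditions on $A$. Concerning the symbol $A$ our baseline assumption is that $A$ belongs to the class $M(\Lambda)$, where $\Lambda\in[1,\infty)$ is a constant. In our treatment of the Dirichlet problem we will need to impose stronger conditions on $A$ and we will assume that $A$ belongs to the class $R(\Lambda)$. In the following $\cdot$ denotes the standard Euclidean scalar product in $\mathbb{R}^m$.

    Definition 1. Let $\Lambda\in[1,\infty)$. Then $A$ is said to belong to the class $M(\Lambda)$ if $A=A(\xi,X,Y,t):\mathbb{R}^m\times\mathbb{R}^m\times\mathbb{R}^m\times\mathbb{R}\to\mathbb{R}^m$ is continuous with respect to $\xi$, measurable with respect to $X$, $Y$ and $t$, and

    $$\begin{aligned}
    &\text{(i)}\quad |A(\xi,X,Y,t)|\leq\Lambda|\xi|,\\
    &\text{(ii)}\quad A(\xi,X,Y,t)\cdot\xi\geq\Lambda^{-1}|\xi|^2,\\
    &\text{(iii)}\quad A(\lambda\xi,X,Y,t)=\lambda A(\xi,X,Y,t)\quad\forall\lambda\in\mathbb{R}\setminus\{0\},
    \end{aligned} \tag{1.7}$$

    for almost every $(X,Y,t)\in\mathbb{R}^{N+1}$ and for all $\xi\in\mathbb{R}^m$.

    Definition 2. Let $\Lambda\in[1,\infty)$. Then $A$ is said to belong to the class $R(\Lambda)$ if $A=A(\xi,X,Y,t):\mathbb{R}^m\times\mathbb{R}^m\times\mathbb{R}^m\times\mathbb{R}\to\mathbb{R}^m$ is continuous with respect to $\xi$, measurable with respect to $X$, $Y$ and $t$, and

    $$\begin{aligned}
    &\text{(i)}\quad |A(\xi_1,X,Y,t)-A(\xi_2,X,Y,t)|\leq\Lambda|\xi_1-\xi_2|,\\
    &\text{(ii)}\quad \big(A(\xi_1,X,Y,t)-A(\xi_2,X,Y,t)\big)\cdot(\xi_1-\xi_2)\geq\Lambda^{-1}|\xi_1-\xi_2|^2,\\
    &\text{(iii)}\quad A(\lambda\xi,X,Y,t)=\lambda A(\xi,X,Y,t)\quad\forall\lambda\in\mathbb{R}\setminus\{0\},
    \end{aligned} \tag{1.8}$$

    for almost every $(X,Y,t)\in\mathbb{R}^{N+1}$ and for all $\xi_1,\xi_2,\xi\in\mathbb{R}^m$.

    Remark 1.1. Note that (1.8)-(iii) implies that $A(0,X,Y,t)=0$ for a.e. $(X,Y,t)\in\mathbb{R}^{N+1}$. Hence we deduce from (1.8)-(i), (ii) and (iii) that $R(\Lambda)\subset M(\Lambda)$.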
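    As a sanity check of the linear case, the following minimal sketch tests conditions (1.7)(i)-(iii) numerically for a linear symbol $A(\xi)=A\xi$ frozen at a single point $(X,Y,t)$. The matrix `A`, the sample vectors and the helper names (`symbol`, `in_class_M`, `eigenvalues_sym2`) are hypothetical choices made only for illustration, and a finite sample check is of course a necessary test rather than a proof of membership in $M(\Lambda)$.

```python
import math

# Hypothetical symmetric positive definite 2x2 matrix (illustration only).
A = ((2.0, 0.5), (0.5, 1.0))


def symbol(xi):
    """Linear symbol A(xi) = A xi, frozen at one point (X, Y, t)."""
    return tuple(sum(A[i][j] * xi[j] for j in range(2)) for i in range(2))


def dot(a, b):
    """Euclidean scalar product."""
    return sum(x * y for x, y in zip(a, b))


def eigenvalues_sym2(M):
    """Closed-form eigenvalues (min, max) of a symmetric 2x2 matrix."""
    half_tr = (M[0][0] + M[1][1]) / 2
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = math.sqrt(half_tr * half_tr - det)
    return half_tr - disc, half_tr + disc


def in_class_M(sym, lam, vectors, scalars=(-3.0, 0.5, 2.0), tol=1e-12):
    """Check (1.7)(i)-(iii) on finitely many sample vectors."""
    for xi in vectors:
        a = sym(xi)
        n2 = dot(xi, xi)
        if math.sqrt(dot(a, a)) > lam * math.sqrt(n2) + tol:   # growth (i)
            return False
        if dot(a, xi) < n2 / lam - tol:                        # ellipticity (ii)
            return False
        for s in scalars:                                      # homogeneity (iii)
            scaled = sym(tuple(s * c for c in xi))
            if any(abs(u - s * v) > tol for u, v in zip(scaled, a)):
                return False
    return True


lam_min, lam_max = eigenvalues_sym2(A)
# Smallest admissible constant for this matrix: Lambda >= lam_max and 1/Lambda <= lam_min.
LAMBDA = max(lam_max, 1.0 / lam_min, 1.0)
SAMPLES = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -1.0), (-2.0, 0.3)]
ok = in_class_M(symbol, LAMBDA, SAMPLES)
```

    For a linear symbol the Lipschitz and monotonicity conditions (1.8)(i)-(ii) follow from the same two inequalities applied to $\xi_1-\xi_2$, so no separate check is needed in this sketch.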

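    We will often use the notation $(Z,t)=(X,Y,t)\in\mathbb{R}^{N+1}$ to denote points. The natural family of dilations for our operators and equations, $(\delta_r)_{r>0}$, on $\mathbb{R}^{N+1}$, is defined by

    $$\delta_r(X,Y,t)=(rX,r^3Y,r^2t), \tag{1.9}$$

    for $(X,Y,t)\in\mathbb{R}^{N+1}$, $r>0$. Our classes of operators are closed under the group law

    $$(\tilde Z,\tilde t)\circ(Z,t)=(\tilde X,\tilde Y,\tilde t)\circ(X,Y,t)=(\tilde X+X,\tilde Y+Y+t\tilde X,\tilde t+t), \tag{1.10}$$

    where $(Z,t)$, $(\tilde Z,\tilde t)\in\mathbb{R}^{N+1}$. Note that

    $$(Z,t)^{-1}=(X,Y,t)^{-1}=(-X,-Y+tX,-t), \tag{1.11}$$

    and hence

    $$(\tilde Z,\tilde t)^{-1}\circ(Z,t)=(\tilde X,\tilde Y,\tilde t)^{-1}\circ(X,Y,t)=(X-\tilde X,\,Y-\tilde Y-(t-\tilde t)\tilde X,\,t-\tilde t), \tag{1.12}$$

    whenever $(Z,t)$, $(\tilde Z,\tilde t)\in\mathbb{R}^{N+1}$. Given $(Z,t)=(X,Y,t)\in\mathbb{R}^{N+1}$ we let

    $$\|(Z,t)\|=\|(X,Y,t)\|:=|(X,Y)|+|t|^{1/2},\qquad |(X,Y)|=|X|+|Y|^{1/3}. \tag{1.13}$$

    Given $r>0$ and $(\tilde Z,\tilde t)=(\tilde X,\tilde Y,\tilde t)\in\mathbb{R}^{N+1}$, we let

    $$Q_r:=\{(X,Y,t):|X|<r,\ |Y|<r^3,\ -r^2<t<0\},\qquad Q_r(\tilde Z,\tilde t):=(\tilde Z,\tilde t)\circ Q_r. \tag{1.14}$$

    We refer to $Q_r(\tilde Z,\tilde t)$ as a cylinder centered at $(\tilde Z,\tilde t)$ and of radius $r$.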
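    The algebra behind (1.9)-(1.13) can be checked numerically. The sketch below is only an illustration: the function names (`compose`, `inverse`, `dilate`, `norm`) and the sample points are assumptions introduced here, with points stored as triples `(X, Y, t)` where `X` and `Y` are tuples in $\mathbb{R}^m$.

```python
import math


def compose(p, q):
    """Group law (1.10): p o q = (Xp + X, Yp + Y + t*Xp, tp + t)."""
    (Xp, Yp, tp), (X, Y, t) = p, q
    Xn = tuple(a + b for a, b in zip(Xp, X))
    Yn = tuple(a + b + t * c for a, b, c in zip(Yp, Y, Xp))
    return (Xn, Yn, tp + t)


def inverse(p):
    """Inverse (1.11): (X, Y, t)^{-1} = (-X, -Y + t*X, -t)."""
    X, Y, t = p
    return (tuple(-x for x in X),
            tuple(-y + t * x for x, y in zip(X, Y)),
            -t)


def dilate(r, p):
    """Dilation (1.9): delta_r(X, Y, t) = (r X, r^3 Y, r^2 t)."""
    X, Y, t = p
    return (tuple(r * x for x in X), tuple(r ** 3 * y for y in Y), r ** 2 * t)


def norm(p):
    """Homogeneous norm (1.13): |X| + |Y|^(1/3) + |t|^(1/2)."""
    X, Y, t = p
    euclid = lambda v: math.sqrt(sum(c * c for c in v))
    return euclid(X) + euclid(Y) ** (1.0 / 3.0) + abs(t) ** 0.5


def flatten(p):
    """All coordinates of a point as one flat list."""
    return list(p[0]) + list(p[1]) + [p[2]]


# Hypothetical sample points with m = 2.
p = ((1.0, -2.0), (0.5, 3.0), 0.25)
q = ((0.3, 0.7), (-1.0, 2.0), -0.5)

identity_defect = max(abs(c) for c in flatten(compose(p, inverse(p))))
scaling_ratio = norm(dilate(2.0, p)) / norm(p)  # should be close to 2
```

    The exponents $1$, $3$, $2$ in (1.9) are exactly those that make the norm in (1.13) homogeneous of degree one and make $\delta_r$ compatible with the group law, which is what the two computed quantities probe.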

    We here state the regularity part of our results, Theorem 1.1–Theorem 1.4. These theorems are derived under the assumption that the symbol $A$ belongs to the class $M(\Lambda)$ introduced in Definition 1. For the notions of weak sub-solutions, super-solutions and solutions, we refer to Definition 3 below. For the definitions of function spaces used we refer to the bulk of the paper.

    Theorem 1.1 (Higher integrability). Let $(Z_0,t_0)=(X_0,Y_0,t_0)\in\mathbb{R}^{N+1}$, $0<r_1<r_0\leq 1$, and let $u$ be a non-negative weak sub-solution to (1.5) in an open set of $\mathbb{R}^{N+1}$ containing $Q_{r_0}(Z_0,t_0)$ in the sense of Definition 3 below. Then for any $q\in[2,2+1/m)$ and $s\in[0,1/3)$, we have*

    *$W^{s,1}_Y$ denotes the fractional Sobolev space.

    $$\|u\|_{L^q(Q_{r_1}(Z_0,t_0))}\leq c_1\Big(2+\frac{1}{m}-q\Big)^{-1}\|u\|_{L^2(Q_{r_0}(Z_0,t_0))}, \tag{1.15}$$
    $$\|u\|_{L^1_{t,X}W^{s,1}_Y(Q_{r_1}(Z_0,t_0))}\leq c_2\Big(\frac{1}{3}-s\Big)^{-1}\|u\|_{L^2(Q_{r_0}(Z_0,t_0))}. \tag{1.16}$$

    Here

    $$c_1=\Big(1+\frac{1}{r_0-r_1}\Big)c,\qquad c_2=r_0^{1+2m}\Big(1+\frac{1}{r_0-r_1}\Big)c,$$

    where

    $$c=c(m,\Lambda)\Big(1+\frac{1}{(r_0-r_1)^2}+\frac{|X_0|+r_0}{(r_0-r_1)r_1^2}+\frac{1}{(r_0-r_1)r_1}\Big),$$

    for some constant $c(m,\Lambda)\geq 1$.

    Theorem 1.2 (Local boundedness). Let $(Z_0,t_0)=(X_0,Y_0,t_0)\in\mathbb{R}^{N+1}$, $0<r<r_0\leq 1$, and let $u$ be a non-negative weak sub-solution to (1.5) in an open set of $\mathbb{R}^{N+1}$ containing $Q_{r_0}(Z_0,t_0)$ in the sense of Definition 3 below. Then for any $p>0$, there exist constants $c=c(m,\Lambda)\geq 1$ and $\theta=\theta(m)>1$ such that

    $$\sup_{Q_r(Z_0,t_0)}u\leq c\Big(\frac{1+|X_0|}{r^2(r_0-r)^3}\Big)^{\theta/p}\|u\|_{L^p(Q_{r_0}(Z_0,t_0))}. \tag{1.17}$$

    Theorem 1.3 (Harnack inequalities). Let $u$ be a non-negative weak super-solution to (1.5) in an open set of $\mathbb{R}^{N+1}$ containing $Q_1$ in the sense of Definition 3 below. Then there exist $\zeta>0$ and $c\geq 1$, both depending only on $m$ and $\Lambda$, such that

    $$\Big(\iiint_{\tilde Q_{r_0/2}}u^{\zeta}(X,Y,t)\,dX\,dY\,dt\Big)^{1/\zeta}\leq c\inf_{Q_{r_0/2}}u, \tag{1.18}$$

    where $r_0=1/20$ and $\tilde Q_{r_0/2}:=Q_{r_0/2}(0,0,-19r_0^2/8)$. Furthermore, if $u$ is a non-negative weak solution to (1.5) in an open set of $\mathbb{R}^{N+1}$ containing $Q_1$, then

    $$\sup_{\tilde Q_{r_0/4}}u\leq c\inf_{Q_{r_0/4}}u, \tag{1.19}$$

    where $\tilde Q_{r_0/4}:=Q_{r_0/4}(0,0,-19r_0^2/8)$.

    Theorem 1.4 (Hölder continuity). Let $u$ be a weak solution to (1.5) in an open set of $\mathbb{R}^{N+1}$ containing $Q_2$ in the sense of Definition 3 below. Then there exist $\alpha\in(0,1)$ and $c\geq 1$, both depending only on $m$ and $\Lambda$, such that

    $$\frac{|u(X_1,Y_1,t_1)-u(X_2,Y_2,t_2)|}{\|(X_2,Y_2,t_2)^{-1}\circ(X_1,Y_1,t_1)\|^{\alpha}}\leq c\|u\|_{L^2(Q_2)}, \tag{1.20}$$

    whenever $(X_1,Y_1,t_1),(X_2,Y_2,t_2)\in Q_1$, $(X_1,Y_1,t_1)\neq(X_2,Y_2,t_2)$.

    We here state the existence and uniqueness part of our results, Theorem 1.5. Throughout the paper we let $U_X\subset\mathbb{R}^m$ be a bounded Lipschitz domain and let $V_{Y,t}\subset\mathbb{R}^m\times\mathbb{R}$ be a bounded domain with boundary which is $C^{1,1}$-smooth, i.e., $C^1$ with respect to $Y$ as well as $t$. Let $N_{Y,t}$ denote the outer unit normal to $V_{Y,t}$. We establish existence and uniqueness of weak solutions to a formulation of the Dirichlet problem

    $$\begin{cases}\nabla_X\cdot\big(A(\nabla_X u,X,Y,t)\big)-(\partial_t+X\cdot\nabla_Y)u=g^* & \text{in } U_X\times V_{Y,t},\\ u=g & \text{on } \partial_K(U_X\times V_{Y,t}).\end{cases} \tag{1.21}$$

    Here

    $$\partial_K(U_X\times V_{Y,t}):=(\partial U_X\times V_{Y,t})\cup\{(X,Y,t)\in\overline{U_X}\times\partial V_{Y,t}\mid (X,1)\cdot N_{Y,t}<0\}. \tag{1.22}$$

    $\partial_K(U_X\times V_{Y,t})$ will be referred to as the Kolmogorov boundary of $U_X\times V_{Y,t}$, and the Kolmogorov boundary serves, in our context, as the natural substitute for the parabolic boundary used in the context of the Cauchy-Dirichlet problem for uniformly elliptic parabolic equations. In particular, we study weak solutions in the sense of Definition 4. For the definition of the functional setting we refer to Section 2. We believe that the following result is of independent interest, in particular as we allow the symbol $A$ to depend nonlinearly on $\nabla_X u$.

    Theorem 1.5 (Existence and uniqueness). Let $(g,g^*)\in W(U_X\times V_{Y,t})\times L^2_{Y,t}(V_{Y,t},H^{-1}_X(U_X))$ and assume that $A$ belongs to the class $R(\Lambda)$ introduced in Definition 2. Then there exists a unique weak solution $u$ to the problem in (1.21) in the sense of Definition 4 below. Furthermore, there exists a constant $c$, depending only on $m$, $\Lambda$ and $U_X\times V_{Y,t}$, such that

    $$\|u\|_{W(U_X\times V_{Y,t})}\leq c\big(\|g\|_{W(U_X\times V_{Y,t})}+\|g^*\|_{L^2_{Y,t}(V_{Y,t},H^{-1}_X(U_X))}\big). \tag{1.23}$$

    As mentioned, the equation in (1.6), possibly also allowing for lower order terms, has attracted considerable attention in recent years. Anceschi-Cinti-Pascucci-Polidoro-Ragusa [2,12,34] proved local boundedness of weak sub-solutions of (1.6) and some versions thereof. Their approach is based on Moser's iteration technique, and the use of fundamental solutions and of a Sobolev type inequality is crucial. It is worth noting that while the results in these papers are stated assuming only bounded and measurable coefficients, an implicit regularity assumption on the coefficients is imposed, as the authors use a stronger notion of weak solutions assuming also $(\partial_t+X\cdot\nabla_Y)u\in L^2_{\mathrm{loc}}$. It is unclear under what assumptions on the coefficients such weak solutions can be constructed. Bramanti-Cerutti-Manfredini-Polidoro-Ragusa [8,32,35] proved $L^p$ estimates, interior Sobolev regularity and local Hölder continuity of weak solutions of (1.6), imposing additional assumptions on the coefficients beyond bounded, measurable and elliptic. In fact it was only recently that Golse-Imbert-Mouhot-Vasseur [20] proved local boundedness, the Harnack inequality and local Hölder continuity of (true) weak solutions of (1.6) based on De Giorgi and Moser's iteration technique. Still, it seems unclear to us how the authors actually resolve questions concerning the existence of weak solutions unless smooth coefficients are assumed qualitatively. However, subsequent developments have appeared in [22,23]. A weak Harnack inequality for weak super-solutions of (1.6) has been obtained by Guerand-Imbert [22] and this has been generalized by Anceschi-Rebucci [3]. In [23], Guerand-Mouhot revisited the theory for the linear equation in (1.6), also allowing for lower order terms, and gave lucid, novel and short proofs of the De Giorgi intermediate-value Lemma, weak Harnack and Harnack inequalities, and the Hölder continuity with quantitative estimates. [23] is an essentially self-contained account of the linear theory. Local Hölder continuity results are also proved in Wang-Zhang [38,39,40] for various linear analogues of (1.6). We emphasize that all results mentioned concern linear equations. Zhu [41] proved local boundedness and local Hölder continuity of weak solutions of (1.6) when the drift term $\partial_t+X\cdot\nabla_Y$ is replaced by $\partial_t+b(X)\cdot\nabla_Y$ for some nonlinear function $b$.

    Boundary value problems for equations as in (1.6) but in non-divergence form were studied by Manfredini [31], who proved existence of strong solutions for the Dirichlet problem assuming Hölder continuous coefficients. Lanconelli-Lascialfari-Morbidelli [26,27] considered a quasilinear case, still in non-divergence form, allowing the coefficients to depend not only on $(X,Y,t)$ but also on the solution $u$; as a function of $(X,Y,t)$ the coefficients are assumed to be Hölder continuous. In fact, functional analytic approaches to weak solutions to Kramers equation and Kolmogorov-Fokker-Planck equations have only recently been developed. Albritton-Armstrong-Mourrat-Novack [1] have developed a functional analytic approach to study well-posedness of Kramers equation, and its parabolic analogue

    $$\partial_t u-\Delta_X u+X\cdot\nabla_X u+X\cdot\nabla_Y u+b\cdot\nabla_X u=g, \tag{1.24}$$

    for suitable g. Equation (1.24) is often referred to as the kinetic Fokker-Planck equation. Litsgård-Nyström [29] studied existence and uniqueness results for the (linear) Dirichlet problem associated with (1.6), with rough coefficients A. In particular, in [29] Theorem 1.5 is proved in the case when A(ξ,X,Y,t)=A(X,Y,t)ξ. However, existence and uniqueness for (1.5) do not seem to have been studied in the literature so far. It is important to note that Theorem 1.5 states, similar to [29], the existence of a unique weak solution u to the problem in (1.21) in the sense of Definition 4 below. The latter is, as it assumes no knowledge of underlying traces, trace spaces and extension operators in the functional setting considered, a weaker formulation of the Dirichlet problem compared to what one usually aims for. Indeed, this is one way to formulate a weak form of the Dirichlet problem which circumvents a largely open problem in the context of kinetic Fokker-Planck equations, linear as well as non-linear, and that is the problem of a well defined trace operator and trace inequality. We refer to Section 6 for more.

    The regularity part of our results is modelled on the approach of Golse-Imbert-Mouhot-Vasseur [20] and the work of Guerand-Mouhot [23]. In fact, as can be seen from the very formulations of our regularity results, this part of our work is strongly influenced by [23], and armed with Theorem 1.1 and Theorem 1.2 we can to a large extent refer to the corresponding arguments in [23] for the proofs of Theorem 1.3 and Theorem 1.4. The new difficulties in our case stem from the nonlinearity of $A$ in $\nabla_X u$. However, as we learn from the regularity theory for quasi-linear parabolic PDEs, see [16] for example, a careful development of the De Giorgi-Nash-Moser theory tends to be robust enough to handle the type of non-linearities considered in this paper. The higher integrability result in Theorem 1.1 is proved by combining the energy estimate in Lemma 3.1 with a Sobolev regularity estimate, and here it is important that $A$ has linear growth in $\nabla_X u$. In particular, in the proof of Theorem 1.1 one is led, after preliminaries and the use of an appropriate cut-off function, to conduct estimates for a (global) weak sub-solution $u_1$ to the equation

    $$(\partial_t+X\cdot\nabla_Y)u_1\leq\nabla_X\cdot A(\nabla_X u_1,X,Y,t)+g,\qquad g:=(\nabla_X\cdot F_1+F_0)\ \text{in } \mathbb{R}^{N+1}, \tag{1.25}$$

    where $F_1,F_0$ are in $L^2(\mathbb{R}^{N+1})$ and $u_1,F_1,F_0$ are supported in $Q_{r_0}(0,0,0)$. To close the argument, as $u_1$ is only a weak sub-solution, it seems important to replace it by a function which actually solves an equation. In particular, to make this operational one needs to construct a weak solution $v$ to

    $$(\partial_t+X\cdot\nabla_Y)v=\nabla_X\cdot A(\nabla_X v,X,Y,t)+g, \tag{1.26}$$

    such that $v$ bounds $u_1$ from above. One approach to Sobolev regularity estimates is then to attempt to use an approach based on Bouchut [7], which implies a Sobolev embedding

    $$H^{1/3}_{X,Y,t}(\mathbb{R}^{N+1})\hookrightarrow L^q_{X,Y,t}(\mathbb{R}^{N+1}),\qquad q:=\frac{6(2m+1)}{6m+1}>2. \tag{1.27}$$
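    The exponent in (1.27) agrees with the standard fractional Sobolev relation $1/q=1/2-s/d$ on $\mathbb{R}^d$ when one takes $d=N+1=2m+1$ and $s=1/3$. The sketch below checks this with exact rational arithmetic; reading (1.27) through the isotropic relation is an assumption made here for illustration, and the helper names `sobolev_exponent` and `q_formula` are hypothetical.

```python
from fractions import Fraction


def sobolev_exponent(dim, s):
    """Exponent q with 1/q = 1/2 - s/dim, i.e. q = 2*dim / (dim - 2*s).

    Only meaningful for 0 <= 2*s < dim.
    """
    return Fraction(2 * dim) / (dim - 2 * s)


def q_formula(m):
    """Exponent q = 6(2m+1)/(6m+1) as written in (1.27)."""
    return Fraction(6 * (2 * m + 1), 6 * m + 1)


# Compare both expressions for the first few values of m (dimension 2m + 1).
table = {m: (q_formula(m), sobolev_exponent(2 * m + 1, Fraction(1, 3)))
         for m in range(1, 6)}
```

    Since $6(2m+1)/(6m+1)<(2m+1)/m$, the exponent also lies below $2+1/m$, the upper end of the range of $q$ allowed in Theorem 1.1.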

    To get hold of the $H^{1/3}_{X,Y,t}(\mathbb{R}^{N+1})$ norm of $v$ one uses a result of Bouchut [7] which gives control of $D_Y^{1/3}v$, $D_t^{1/3}v$ given energy estimates. To be able to bound $u_1$ from above by $v$ as in (1.26) one seems to need Theorem 1.5 and the comparison principle that we prove in Theorem 5.1 below. As the result of Bouchut [7] requires a solution which exists globally in time, one can make this approach operational using Theorem 1.5 to prove Theorem 1.1 with the cylinders in (1.14) replaced by centered cylinders. An alternative approach to Sobolev regularity estimates, which in the end gives Theorem 1.1 as stated, is to first observe that if $u_1$ satisfies (1.25), then the weak formulation of (1.25) induces a positive distribution. One is therefore led to prove estimates for $v$ satisfying

    $$(\partial_t+X\cdot\nabla_Y)v=\nabla_X\cdot A(\nabla_X v,X,Y,t)+g-\mu, \tag{1.28}$$

    where $\mu$ is now a positive measure. Due to the structure of $g$, Sobolev regularity estimates can then be deduced using a semi-classical approach via the fundamental solution associated to the linear equation $(\partial_t+X\cdot\nabla_Y)f=\Delta_X f$ originally studied by Kolmogorov, see Lemma 10 in [23]. In the end, we follow this approach and here it is again important that $A$ has linear growth in $\nabla_X u$. Armed with the Sobolev regularity estimates, the proofs of Theorem 1.1–Theorem 1.4 can be completed along the lines of the corresponding arguments in the linear case. Finally, to prove the existence and uniqueness result in Theorem 1.5 we use a variational approach and proceed along the lines of [1,4,29]. In particular, our argument is similar to the proof of Theorem 1.1 in [29].

    In Section 2 we introduce the functional setting and the notion of weak solutions. Section 3 is devoted to a number of preliminary technical results to be used in the proofs of Theorem 1.1–Theorem 1.4. Theorem 1.1–Theorem 1.4 are proved in Section 4, and in the proofs of Theorem 1.3 and Theorem 1.4 we, for brevity, mainly refer to the corresponding arguments in [23]. Theorem 1.5 is proved in Section 5. In Section 6 we mention a number of challenging problems for future research which we hope will inspire the community to look further into the topic of nonlinear Kolmogorov-Fokker-Planck type equations.

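    We denote by $H^1_X(U_X)$ the Sobolev space of functions $g\in L^2(U_X)$ whose distributional gradient in $U_X$ lies in $(L^2(U_X))^m$, i.e.,

    $$H^1_X(U_X):=\{g\in L^2_X(U_X)\mid\nabla_X g\in(L^2(U_X))^m\},$$

    and we set

    $$\|g\|_{H^1_X(U_X)}:=\big(\|g\|^2_{L^2(U_X)}+\||\nabla_X g|\|^2_{L^2(U_X)}\big)^{1/2},\qquad g\in H^1_X(U_X).$$

    We let $H^1_{X,0}(U_X)$ denote the closure of $C_0^\infty(U_X)$ in the norm of $H^1_X(U_X)$ and we recall, as $U_X$ is a bounded Lipschitz domain, that $C^\infty(\overline{U_X})$ is dense in $H^1_X(U_X)$. In particular, equivalently we could define $H^1_X(U_X)$ as the closure of $C^\infty(\overline{U_X})$ in the norm $\|\cdot\|_{H^1_X(U_X)}$. Note that as $H^1_{X,0}(U_X)$ is a Hilbert space it is reflexive, hence $(H^1_{X,0}(U_X))^*=H^{-1}_X(U_X)$ and $(H^{-1}_X(U_X))^*=H^1_{X,0}(U_X)$, where $(\cdot)^*$ denotes the dual. Based on this we let $H^{-1}_X(U_X)$ denote the dual to $H^1_{X,0}(U_X)$ acting on functions in $H^1_{X,0}(U_X)$ through the duality pairing $\langle\cdot,\cdot\rangle:=\langle\cdot,\cdot\rangle_{H^{-1}_X(U_X),H^1_{X,0}(U_X)}$. We let $L^2_{Y,t}(V_{Y,t},H^1_{X,0}(U_X))$ be the space of measurable functions $u:V_{Y,t}\to H^1_{X,0}(U_X)$ equipped with the norm

    $$\|u\|^2_{L^2_{Y,t}(V_{Y,t},H^1_X(U_X))}:=\iint_{V_{Y,t}}\|u(\cdot,Y,t)\|^2_{H^1_X(U_X)}\,dY\,dt.$$

    $L^2_{Y,t}(V_{Y,t},H^1_X(U_X))$ is defined analogously. In analogy with the definition of $H^1_X(U_X)$, we let $W(U_X\times V_{Y,t})$ be the closure of $C^\infty(\overline{U_X\times V_{Y,t}})$ in the norm

    $$\|u\|_{W(U_X\times V_{Y,t})}:=\big(\|u\|^2_{L^2_{Y,t}(V_{Y,t},H^1_X(U_X))}+\|(\partial_t+X\cdot\nabla_Y)u\|^2_{L^2_{Y,t}(V_{Y,t},H^{-1}_X(U_X))}\big)^{1/2}. \tag{2.1}$$

    In particular, $W(U_X\times V_{Y,t})$ is a Banach space and $u\in W(U_X\times V_{Y,t})$ if and only if

    $$u\in L^2_{Y,t}(V_{Y,t},H^1_X(U_X))\quad\text{and}\quad(\partial_t+X\cdot\nabla_Y)u\in L^2_{Y,t}(V_{Y,t},H^{-1}_X(U_X)). \tag{2.2}$$

    Note that the dual of $L^2_{Y,t}(V_{Y,t},H^1_{X,0}(U_X))$, denoted by $(L^2_{Y,t}(V_{Y,t},H^1_{X,0}(U_X)))^*$, satisfies

    $$(L^2_{Y,t}(V_{Y,t},H^1_{X,0}(U_X)))^*=L^2_{Y,t}(V_{Y,t},H^{-1}_X(U_X)),$$

    and, as mentioned above,

    $$(L^2_{Y,t}(V_{Y,t},H^{-1}_X(U_X)))^*=L^2_{Y,t}(V_{Y,t},H^1_{X,0}(U_X)).$$

    Finally, the spaces $L^2_{Y,t,\mathrm{loc}}(V_{Y,t},H^1_{X,\mathrm{loc}}(U_X))$, $L^2_{Y,t,\mathrm{loc}}(V_{Y,t},H^{-1}_{X,\mathrm{loc}}(U_X))$, and $W_{\mathrm{loc}}(U_X\times V_{Y,t})$ are defined in the natural way. The topological boundary of $U_X\times V_{Y,t}$ is denoted by $\partial(U_X\times V_{Y,t})$. Let $N_{Y,t}$ denote the outer unit normal to $V_{Y,t}$. We define a subset $\partial_K(U_X\times V_{Y,t})\subset\partial(U_X\times V_{Y,t})$, the Kolmogorov boundary of $U_X\times V_{Y,t}$, as in (1.22). We let $C^\infty_{K,0}(\overline{U_X\times V_{Y,t}})$ and $C^\infty_{X,0}(\overline{U_X\times V_{Y,t}})$ be the sets of functions in $C^\infty(\overline{U_X\times V_{Y,t}})$ which vanish on $\partial_K(U_X\times V_{Y,t})$ and $\{(X,Y,t)\in\partial U_X\times\overline{V_{Y,t}}\}$, respectively. We let $W_0(U_X\times V_{Y,t})$ and $W_{X,0}(U_X\times V_{Y,t})$ denote the closure in the norm of $W(U_X\times V_{Y,t})$ of $C^\infty_{K,0}(\overline{U_X\times V_{Y,t}})$ and $C^\infty_{X,0}(\overline{U_X\times V_{Y,t}})$, respectively.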

    We here introduce the notion of weak solutions.

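    Definition 3. Let $g^*\in L^2_{Y,t}(V_{Y,t},H^{-1}_X(U_X))$. A function $u\in W_{\mathrm{loc}}(U_X\times V_{Y,t})$ is said to be a weak sub-solution (or super-solution) to the equation

    $$(\partial_t+X\cdot\nabla_Y)u-\nabla_X\cdot\big(A(\nabla_X u,X,Y,t)\big)+g^*=0\quad\text{in } U_X\times V_{Y,t}, \tag{2.3}$$

    if for every $V_X\times V_Y\times J\Subset U_X\times V_{Y,t}$, and for all non-negative $\phi\in L^2_{Y,t}(V_Y\times J,H^1_{X,0}(V_X))$, we have

    $$\iiint_{V_X\times V_Y\times J}A(\nabla_X u,X,Y,t)\cdot\nabla_X\phi\,dX\,dY\,dt+\iint_{V_Y\times J}\langle g^*(\cdot,Y,t)+(\partial_t+X\cdot\nabla_Y)u(\cdot,Y,t),\phi(\cdot,Y,t)\rangle\,dY\,dt\leq 0\ (\text{or }\geq 0). \tag{2.4}$$

    We say that $u\in W_{\mathrm{loc}}(U_X\times V_{Y,t})$ is a weak solution to equation (2.3) if equality holds in (2.4) without a sign restriction on $\phi$.

    Note that if $u$ is a weak sub-solution (or super-solution) of (2.3) in the sense of Definition 3 above, with $g^*\equiv 0$, then

    $$\begin{aligned}&\iint_{V_X\times V_Y}u(X,Y,t_2)\phi(X,Y,t_2)\,dX\,dY-\iint_{V_X\times V_Y}u(X,Y,t_1)\phi(X,Y,t_1)\,dX\,dY\\&\quad-\int_{t_1}^{t_2}\!\!\iint_{V_X\times V_Y}u(\partial_t+X\cdot\nabla_Y)\phi\,dX\,dY\,dt+\int_{t_1}^{t_2}\!\!\iint_{V_X\times V_Y}A(\nabla_X u,X,Y,t)\cdot\nabla_X\phi\,dX\,dY\,dt\leq 0\ (\text{or }\geq 0),\end{aligned} \tag{2.5}$$

    whenever $\phi\in C^\infty((t_1,t_2),C_0^\infty(V_X\times V_Y))$ is a non-negative function. Furthermore, equality holds in (2.5) for every weak solution $u$ of (2.3) without a sign restriction on $\phi$.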

    Remark 2.1. Assume $g^*\equiv 0$. (i) From Definition 3, it is clear that, if $u$ is a weak sub-solution (resp. super-solution or solution) of (2.3) in $U_X\times V_{Y,t}$, then for any $k\in\mathbb{R}$, the function $v=(u-k)$ is also a weak sub-solution (resp. super-solution or solution) of (2.3) in $U_X\times V_{Y,t}$.

    (ii) Using the homogeneity property (iii) of $A$, it follows that (a) for any $c\geq 0$, $cu$ is a weak sub-solution (resp. super-solution or solution) of (2.3) in $U_X\times V_{Y,t}$, provided $u$ is a weak sub-solution (resp. super-solution or solution) of (2.3) in $U_X\times V_{Y,t}$, and (b) $u$ is a weak solution of (2.3) in $U_X\times V_{Y,t}$ if and only if $-u$ is a weak solution of (2.3) in $U_X\times V_{Y,t}$.

    Theorem 1.5 is a statement concerning existence and uniqueness of weak solutions to a formulation of the Dirichlet problem in (1.21). In particular, we study weak solutions in the following sense.

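    Definition 4. Consider $(g,g^*)\in W(U_X\times V_{Y,t})\times L^2_{Y,t}(V_{Y,t},H^{-1}_X(U_X))$. Given $(g,g^*)$, $u$ is said to be a weak solution to the problem in (1.21) if

    $$u\in W(U_X\times V_{Y,t}),\qquad(u-g)\in W_0(U_X\times V_{Y,t}), \tag{2.6}$$

    and if

    $$\iiint_{U_X\times V_{Y,t}}A(\nabla_X u,X,Y,t)\cdot\nabla_X\phi\,dX\,dY\,dt+\iint_{V_{Y,t}}\langle g^*(\cdot,Y,t)+(\partial_t+X\cdot\nabla_Y)u(\cdot,Y,t),\phi(\cdot,Y,t)\rangle\,dY\,dt=0, \tag{2.7}$$

    for all $\phi\in L^2_{Y,t}(V_{Y,t},H^1_{X,0}(U_X))$, where $\langle\cdot,\cdot\rangle=\langle\cdot,\cdot\rangle_{H^{-1}_X(U_X),H^1_{X,0}(U_X)}$ is the duality pairing in $H^{-1}_X(U_X)$. If in (2.7), $=$ is replaced by $\leq$ ($\geq$) whenever $\phi\geq 0$, then $u$ is said to be a weak sub- (super-) solution of (1.21), respectively.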

    In this section we prove a number of technical results to be used in the proofs of Theorem 1.1–Theorem 1.4. Throughout the rest of the paper, we use the notation $s_+:=\max\{s,0\}$ for $s\in\mathbb{R}$. Moreover, in Sections 3 and 4, we assume that the symbol $A$ belongs to the class $M(\Lambda)$ introduced in Definition 1.

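    Lemma 3.1. Let $(Z_0,t_0)=(X_0,Y_0,t_0)\in\mathbb{R}^{N+1}$, $0<r_1<r_0$, be such that $Q_{r_0}(Z_0,t_0)\subset U_X\times V_{Y,t}$. Let $u$ be a weak sub-solution of equation (1.5) in $U_X\times V_{Y,t}$ in the sense of Definition 3. Then

    $$\sup_{t_0-r_1^2<t<t_0}\iint_{Q^t(Z_0,r_1)}u^2(X,Y,t)\,dX\,dY+\Lambda^{-1}\iiint_{Q_{r_1}(Z_0,t_0)}|\nabla_X u|^2\,dX\,dY\,dt\leq c\,c_{0,1}\iiint_{Q_{r_0}(Z_0,t_0)}u(X,Y,t)^2\,dX\,dY\,dt, \tag{3.1}$$

    where $Q^t(Z_0,r):=\{(X,Y):(X,Y,t)\in Q_r(Z_0,t_0)\}$ for $r>0$, $c=c(m,\Lambda)\geq 1$ and

    $$c_{0,1}:=\frac{1}{(r_0-r_1)^2}+\frac{r_0+|X_0|}{(r_0-r_1)r_1^2}+\frac{1}{(r_0-r_1)r_1}+1.$$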

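    Proof. Let $t_1:=t_0-r_0^2$ and $t_2:=t_0$. Considering $l_1$, $l_2$ such that $t_1<l_1<l_2<t_2$, we introduce for $\epsilon>0$ the function $\theta_\epsilon\in W^{1,\infty}((t_1,t_2))$ by

    $$\theta_\epsilon(t):=\begin{cases}0 & \text{if } t_1\leq t\leq l_1-\epsilon,\\ 1+\dfrac{t-l_1}{\epsilon} & \text{if } l_1-\epsilon<t\leq l_1,\\ 1 & \text{if } l_1<t\leq l_2,\\ 1-\dfrac{t-l_2}{\epsilon} & \text{if } l_2<t\leq l_2+\epsilon,\\ 0 & \text{if } l_2+\epsilon<t\leq t_2.\end{cases} \tag{3.2}$$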
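    The cut-off in (3.2) is a piecewise linear ramp that equals one on $(l_1,l_2]$ and vanishes outside $(l_1-\epsilon,l_2+\epsilon)$. A minimal sketch of it is given below; the function name `theta_eps` and the numerical values of `l1`, `l2`, `eps` are hypothetical and chosen only so that the arithmetic is exact in floating point.

```python
def theta_eps(t, l1, l2, eps):
    """Piecewise linear time cut-off from (3.2).

    Equals 0 for t <= l1 - eps, ramps linearly up to 1 on (l1 - eps, l1],
    stays at 1 on (l1, l2], ramps down on (l2, l2 + eps], and is 0 afterwards.
    """
    if t <= l1 - eps:
        return 0.0
    if t <= l1:
        return 1.0 + (t - l1) / eps
    if t <= l2:
        return 1.0
    if t <= l2 + eps:
        return 1.0 - (t - l2) / eps
    return 0.0


# Hypothetical parameters (dyadic values, so the float arithmetic is exact).
l1, l2, eps = 0.25, 0.75, 0.125
samples = [theta_eps(t, l1, l2, eps)
           for t in (0.0, l1 - eps, l1 - eps / 2, l1, 0.5, l2, l2 + eps / 2, l2 + eps, 1.0)]
```

    The slopes on the two ramps are $\pm 1/\epsilon$, so as $\epsilon\to 0$ the time derivative of $\theta_\epsilon$ concentrates at $l_1$ and $l_2$; this is what produces the time-slice terms evaluated at $l_1$ and $l_2$ when (2.5) is tested with a function containing $\theta_\epsilon$.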

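    Let $\psi\in[0,1]$ be smooth in $Q_{r_0}(Z_0,t_0)$ such that $\psi\equiv 1$ on $Q_{r_1}(Z_0,t_0)$ and $\psi\equiv 0$ outside $Q_{r_0}(Z_0,t_0)$, satisfying

    $$|\nabla_X\psi|\leq\frac{c}{r_0-r_1},\qquad|\nabla_Y\psi|\leq\frac{c}{(r_0-r_1)r_1^2},\qquad|\partial_t\psi|\leq\frac{c}{(r_0-r_1)r_1},$$

    for some constant $c=c(m)\geq 1$.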

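    Consider the function $\phi(X,Y,t)=2u(X,Y,t)\psi^2(X,Y,t)\theta_\epsilon(t)$. We intend to test (2.5) with $\phi$ and the following deductions are formal. However, as $u$ is a weak sub-solution of equation (1.5) in $U_X\times V_{Y,t}$ in the sense of Definition 3, we know that $u\in W_{\mathrm{loc}}(U_X\times V_{Y,t})$, and as $W(U_X\times V_{Y,t})$ is defined as the closure of $C^\infty(\overline{U_X\times V_{Y,t}})$ in the norm introduced in (2.1), our deduction can be made rigorous a posteriori. Testing (2.5) with $\phi(X,Y,t)$, letting $\epsilon\to 0$, and then adding

    $$\iiint u^2\,\partial_t(\psi^2)\,dX\,dY\,dt$$

    on both sides of the resulting inequality, we deduce that

    $$I(l_2)-I(l_1)+2\iiint A(\nabla_X u,X,Y,t)\cdot\nabla_X(u\psi^2)\,dX\,dY\,dt\leq\iiint u^2(\partial_t+X\cdot\nabla_Y)\psi^2\,dX\,dY\,dt, \tag{3.3}$$

    where

    $$I(t):=\iint\psi^2(X,Y,t)u^2(X,Y,t)\,dX\,dY.$$

    Using (1.7), (3.3) yields

    $$\begin{aligned}I(l_2)-I(l_1)+2\iiint\psi^2A(\nabla_X u,X,Y,t)\cdot\nabla_X u\,dX\,dY\,dt&\leq\iiint u^2(\partial_t+X\cdot\nabla_Y)\psi^2\,dX\,dY\,dt-4\iiint u\psi A(\nabla_X u,X,Y,t)\cdot\nabla_X\psi\,dX\,dY\,dt\\&\leq\iiint u^2\big\{(\partial_t+X\cdot\nabla_Y)\psi^2+4\Lambda^3(\psi+|\nabla_X\psi|)^2\big\}\,dX\,dY\,dt+\Lambda^{-1}\iiint\psi^2|\nabla_X u|^2\,dX\,dY\,dt.\end{aligned} \tag{3.4}$$

    Furthermore, using (1.7)-(i), (ii) we can continue the above estimate and conclude that

    $$I(l_2)-I(l_1)+\Lambda^{-1}\iiint_{Q^t(Z_0,r_0)\times(l_1,l_2)}\psi^2|\nabla_X u|^2\,dX\,dY\,dt\leq\iiint u^2\big\{(\partial_t+X\cdot\nabla_Y)\psi^2+4\Lambda^3(\psi+|\nabla_X\psi|)^2\big\}\,dX\,dY\,dt. \tag{3.5}$$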

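    Using the properties of $\psi$ and first letting $l_1\to t_1$, and then letting $l_2\to t_2$ in (3.5), we obtain

    \begin{equation} \Lambda^{-1}\iiint_{Q_{r_1}(Z_0,t_0)}|\nabla_Xu|^2\,{\text{d}}X{\text{d}}Y{\text{d}}t\leq\iiint_{Q_{r_0}(Z_0,t_0)}u^2\bigl\{(\partial_t+X\cdot\nabla_Y)\psi^2+4\Lambda^3(\psi+|\nabla_X\psi|)^2\bigr\}\,{\text{d}}X{\text{d}}Y{\text{d}}t\leq c\,c_{0,1}\iiint_{Q_{r_0}(Z_0,t_0)}u(X,Y,t)^2\,{\text{d}}X{\text{d}}Y{\text{d}}t, \end{equation} (3.6)

    where $c=c(m,\Lambda)\geq1$ and

    \begin{equation*} c_{0,1}:=\frac{1}{(r_0-r_1)^2}+\frac{r_0+|X_0|}{(r_0-r_1)r_1^2}+\frac{1}{(r_0-r_1)r_1}+1. \end{equation*}

    Again using the properties of $\psi$ and first letting $l_1\to t_1$ in (3.5), then taking the supremum over $l_2\in[t_0-r_1^2,t_0)$ and noting that for such $l_2$, $\psi\equiv1$, we also have

    \begin{align} &\sup\limits_{t_0-r_1^2<t<t_0}\iint_{Q^t(Z_0,r_0)}u^2(X,Y,t)\,{\text{d}}X{\text{d}}Y\\ &\leq\iiint_{Q_{r_0}(Z_0,r_0)}u^2\bigl\{(\partial_t+X\cdot\nabla_Y)\psi^2+4\Lambda^3(\psi+|\nabla_X\psi|)^2\bigr\}\,{\text{d}}X{\text{d}}Y{\text{d}}t\leq c\,c_{0,1}\iiint_{Q_{r_0}(Z_0,t_0)}u(X,Y,t)^2\,{\text{d}}X{\text{d}}Y{\text{d}}t. \end{align} (3.7)–(3.8)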

    This completes the proof.

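    Lemma 3.2. Let $u$ be a weak sub-solution of Eq (1.5) in $U_X\times V_{Y,t}$ in the sense of Definition 3. Let $k\in\mathbb R$. Then $(u-k)_+$ is also a weak sub-solution of Eq (1.5) in $U_X\times V_{Y,t}$ in the sense of Definition 3.

    Proof. By Remark 2.1, it is enough to prove that $u_+$ is a weak sub-solution of (1.5). Let $\epsilon>0$ and let $\phi\in L^2_{Y,t}(V_Y\times J,H^1_{X,0}(V_X))$ be a non-negative test function in (2.4). Then $\frac{u_+}{u_++\epsilon}\phi\in L^2_{Y,t}(V_Y\times J,H^1_{X,0}(V_X))$ is also a non-negative test function in (2.4). Using $\frac{u_+}{u_++\epsilon}\phi$ as a test function in (2.4), we obtain

    \begin{align} &\iiint_{V_X\times V_Y\times J}A(\nabla_Xu,X,Y,t)\cdot\nabla_X\phi\,\frac{u_+}{u_++\epsilon}\,{\text{d}}X{\text{d}}Y{\text{d}}t+\epsilon\iiint_{V_X\times V_Y\times J}\frac{A(\nabla_Xu,X,Y,t)\cdot\nabla_Xu_+}{(u_++\epsilon)^2}\phi\,{\text{d}}X{\text{d}}Y{\text{d}}t\\ &+\iint_{V_Y\times J}\Bigl\langle(\partial_t+X\cdot\nabla_Y)u(\cdot,Y,t),\frac{u_+}{u_++\epsilon}\phi(\cdot,Y,t)\Bigr\rangle\,{\text{d}}Y{\text{d}}t\leq0. \end{align} (3.9)

    Letting $\epsilon\to0$, we obtain

    \begin{align} &\iint_{V_Y\times J}\langle(\partial_t+X\cdot\nabla_Y)u_+(\cdot,Y,t),\phi(\cdot,Y,t)\rangle\,{\text{d}}Y{\text{d}}t+\iiint_{V_X\times V_Y\times J}A(\nabla_Xu_+,X,Y,t)\cdot\nabla_X\phi\,{\text{d}}X{\text{d}}Y{\text{d}}t\\ &+\liminf\limits_{\epsilon\to0}\epsilon\iiint_{V_X\times V_Y\times J}\frac{A(\nabla_Xu_+,X,Y,t)\cdot\nabla_Xu_+}{(u_++\epsilon)^2}\phi\,{\text{d}}X{\text{d}}Y{\text{d}}t\leq0. \end{align} (3.10)

    However, by (1.7)-(ii),

    \begin{equation} \iiint_{V_X\times V_Y\times J}\frac{A(\nabla_Xu_+,X,Y,t)\cdot\nabla_Xu_+}{(u_++\epsilon)^2}\phi\,{\text{d}}X{\text{d}}Y{\text{d}}t\geq0. \end{equation} (3.11)

    Hence,

    \begin{equation} \iint_{V_Y\times J}\langle(\partial_t+X\cdot\nabla_Y)u_+(\cdot,Y,t),\phi(\cdot,Y,t)\rangle\,{\text{d}}Y{\text{d}}t+\iiint_{V_X\times V_Y\times J}A(\nabla_Xu_+,X,Y,t)\cdot\nabla_X\phi\,{\text{d}}X{\text{d}}Y{\text{d}}t\leq0. \end{equation} (3.12)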

    This proves that u+ is a weak sub-solution.

    The following result follows from [23,Lemma 10].

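    Lemma 3.3. Let $f\geq0$ be locally integrable such that

    \begin{equation} (\partial_t+X\cdot\nabla_Y-\Delta_X)f=\nabla_X\cdot F_1+F_2-\mu, \end{equation} (3.13)

    where $F_1,F_2\in L^1\cap L^2(\mathbb R^{2m}\times\mathbb R)$ and $\mu\in M^1(\mathbb R^{2m}\times\mathbb R)$ is a non-negative measure with finite mass in $\mathbb R^{2m}\times\mathbb R$, such that $F_1$, $F_2$ and $\mu$ have compact support, in the time variable, included in $(-\tau,0]$. Then for any $p\in[2,2+1/m)$ and $\sigma\in[0,1/3)$ we have

    \begin{equation} \|f\|_{L^p(\mathbb R^{2m}\times\mathbb R)}\leq c\Bigl(2+\frac1m-p\Bigr)^{-1}\bigl(\|F_1\|_{L^2(\mathbb R^{2m}\times\mathbb R)}+\|F_2\|_{L^2(\mathbb R^{2m}\times\mathbb R)}\bigr) \end{equation} (3.14)

    and

    \begin{equation} \|f\|_{L^1_{t,X}W^{\sigma,1}_Y(\mathbb R^{2m}\times\mathbb R)}\leq c\Bigl(\frac13-\sigma\Bigr)^{-1}\bigl(\|F_1\|_{L^1(\mathbb R^{2m}\times\mathbb R)}+\|F_2\|_{L^1(\mathbb R^{2m}\times\mathbb R)}\bigr)+c\Bigl(\frac13-\sigma\Bigr)^{-1}\|\mu\|_{M^1(\mathbb R^{2m}\times\mathbb R)}, \end{equation} (3.15)

    for some constant $c=c(\tau)$.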

    The lemmas stated so far will be sufficient for our proof of Theorem 1.1 and Theorem 1.2.

    Lemma 3.4 (Weak Poincaré inequality). Let $\epsilon\in(0,1)$ and $\sigma\in(0,\frac13)$. Then every non-negative weak sub-solution $u$ of (1.5) in $Q_5$ in the sense of Definition 3 satisfies

    \begin{equation} \|(u-u_{Q_1^-})_+\|_{L^1(Q_1^+)}\leq c\Bigl(\frac{1}{\epsilon^{m+2}}\|\nabla_Xu\|_{L^1(Q_5)}+\epsilon^\sigma\Bigl(\frac13-\sigma\Bigr)^{-1}\|u\|_{L^2(Q_5)}\Bigr), \end{equation} (3.16)

    for some constant $c=c(m,\Lambda)\geq1$, where $Q_1^-:=Q_1(0,0,-1)$ and $u_{Q_1^-}:=\frac{1}{|Q_1^-|}\int_{Q_1^-}u$.

    Proof. Using that $u$ is a non-negative weak sub-solution of (1.5), Theorem 1.1 and the property (1.7)-(i), the conclusion of the lemma follows along the lines of the proof of [23, Proposition 13, pages 8-10].

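    Lemma 3.5 (Intermediate value lemma). Let $\delta_1,\delta_2\in(0,1)$ be given. Then there exist constants $\theta=c(m,\Lambda)(\delta_1\delta_2)^{10m+15}$, $r_0=\frac{1}{20}$, and $\nu\geq c(m,\Lambda)(\delta_1\delta_2)^{5m+8}$, such that the following holds. Let $u:Q_1\to\mathbb R$ be a weak sub-solution of (1.5) in $Q_5$ in the sense of Definition 3, assume that $u\leq1$ in $Q_{\frac12}$, and that

    \begin{equation} |\{u\leq0\}\cap Q_{r_0}^-|\geq\delta_1|Q_{r_0}^-|\quad\text{and}\quad|\{u\geq1-\theta\}\cap Q_{r_0}|\geq\delta_2|Q_{r_0}|, \end{equation} (3.17)

    where $Q_{r_0}^-:=Q_{r_0}(0,0,-2r_0^2)$. Then

    \begin{equation} |\{0<u<1-\theta\}\cap Q_{\frac12}|\geq\nu|Q_{\frac12}|. \end{equation} (3.18)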

    Proof. Using Lemma 3.1, Lemma 3.2 and Lemma 3.4, the result follows from the lines of the proof of [23,Theorem 3,pages 11-12].

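    Lemma 3.6 (Measure to pointwise upper bound). Given $\delta\in(0,1)$ and $r_0=\frac{1}{20}$, there exists a positive constant $\gamma:=\gamma(\delta)=c(m,\Lambda)\delta^{2(1+\delta^{-10m-16})}>0$ such that the following holds. Let $u$ be a weak sub-solution of (1.5) in $Q_1$ in the sense of Definition 3, assume that $u\leq1$ in $Q_{\frac12}$ and that

    \begin{equation} |\{u\leq0\}\cap Q_{r_0}^-|\geq\delta|Q_{r_0}^-|, \end{equation} (3.19)

    where $Q_{r_0}^-:=Q_{r_0}(0,0,-2r_0^2)$. Then

    \begin{equation*} u\leq1-\gamma\quad\text{in }Q_{\frac{r_0}{2}}. \end{equation*}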

    Proof. Using Remark 2.1, Theorem 1.2 and Lemma 3.5, the result follows by proceeding along the lines of the proof of [23,Lemma 16,page 12].

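    In this section we prove Theorem 1.1–Theorem 1.4. We first note that since our class of operators is closed under the group law defined in (1.10), and by our definition of $Q_{r_0}(Z_0,t_0)$, we can throughout the section, without loss of generality, assume that $(Z_0,t_0)=0$. Note that $Q_{r_0}=Q_{r_0}(0,0)=V_X\times V_Y\times J$ where $V_X=B(0,r_0)$, $V_Y=B(0,r_0^3)$, $J=(-r_0^2,0)$, and where $B(0,\rho)$ denotes the standard Euclidean ball with center at $0$ and radius $\rho$ in $\mathbb R^m$.

    As discussed in subsection 1.7, since $u$ is a weak sub-solution of (1.5), there exists a non-negative measure $\bar\mu$ such that

    \begin{equation*} (\partial_t+X\cdot\nabla_Y)u=\nabla_X\cdot(A(\nabla_Xu,X,Y,t))-\bar\mu. \end{equation*}

    We define $r_2:=\frac{r_0+r_1}{2}$. Let $\phi_1\in[0,1]$ be smooth such that $\phi_1\equiv1$ in $Q_{r_1}(Z_0,t_0)$ and $\phi_1\equiv0$ outside $Q_{r_2}(Z_0,t_0)$, satisfying

    \begin{equation} |\nabla_X\phi_1|\leq\frac{c}{r_0-r_2},\quad|\nabla_Y\phi_1|\leq\frac{c}{(r_0-r_2)r_2^2},\quad|\partial_t\phi_1|\leq\frac{c}{(r_0-r_2)r_2}, \end{equation} (4.1)

    for some constant $c=c(m)\geq1$. Then we observe that $v=u\phi_1$ is a weak solution of

    \begin{equation} (\partial_t+X\cdot\nabla_Y-\Delta_X)v=\nabla_X\cdot F_1+F_2-\mu\quad\text{in }\mathbb R^{N+1}, \end{equation} (4.2)

    where

    \begin{align*} F_1&=A(\nabla_Xu)\phi_1-\phi_1\nabla_Xu-u\nabla_X\phi_1,\\ F_2&=-A(\nabla_Xu)\cdot\nabla_X\phi_1+u(\partial_t+X\cdot\nabla_Y)\phi_1\quad\text{and}\quad\mu=\bar\mu\phi_1. \end{align*}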

    By Lemma 3.3, we have

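    \begin{equation} \|v\|_{L^q(\mathbb R^{2m}\times\mathbb R)}\leq c\Bigl(2+\frac1m-q\Bigr)^{-1}\bigl(\|F_1\|_{L^2(\mathbb R^{2m}\times\mathbb R)}+\|F_2\|_{L^2(\mathbb R^{2m}\times\mathbb R)}\bigr) \end{equation} (4.3)

    and

    \begin{equation} \|v\|_{L^1_{t,X}W^{s,1}_Y(\mathbb R^{2m}\times\mathbb R)}\leq c\Bigl(\frac13-s\Bigr)^{-1}\bigl(\|F_1\|_{L^1(\mathbb R^{2m}\times\mathbb R)}+\|F_2\|_{L^1(\mathbb R^{2m}\times\mathbb R)}+\|\mu\|_{M^1(\mathbb R^{2m}\times\mathbb R)}\bigr), \end{equation} (4.4)

    for some uniform constant $c$ and for every $q\in[2,2+\frac1m)$ and $s\in[0,\frac13)$. Using (4.1), (1.7)-(i), Lemma 3.1 and that $0<r_1<r_0\leq1$, it follows that

    \begin{equation} \|F_1\|_{L^2(\mathbb R^{2m}\times\mathbb R)}+\|F_2\|_{L^2(\mathbb R^{2m}\times\mathbb R)}\leq c_1, \end{equation} (4.5)

    where

    \begin{equation} c_1=c(m,\Lambda)\Bigl(1+\frac{1}{r_0-r_1}\Bigr)\Bigl(1+\frac{1}{(r_0-r_1)^2}+\frac{|X_0|+r_0}{(r_0-r_1)r_1^2}+\frac{1}{(r_0-r_1)r_1}\Bigr). \end{equation} (4.6)

    Using (4.5) in (4.3), the estimate (1.15) follows. To obtain the estimate (1.16), let $\phi_2\in[0,1]$ be smooth such that $\phi_2\equiv1$ in $Q_{r_2}(Z_0,t_0)$ and $\phi_2\equiv0$ outside $Q_{r_0}(Z_0,t_0)$, satisfying (4.1). Choosing $\phi_2$ as a test function in (4.2) and proceeding similarly as in the proof of the energy estimate in Lemma 3.1, we get

    \begin{equation*} \|\mu\|_{M^1(Q_{r_2}(Z_0,t_0))}\leq\|\phi_2\mu\|_{M^1(\mathbb R^{2m}\times\mathbb R)}\leq r_0^{1+2m}c_1\|u\|_{L^2(Q_{r_0}(Z_0,t_0))}, \end{equation*}

    where $c_1$ is given by (4.6). The last estimate, combined with (4.4) and (4.5), yields the estimate (1.16).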

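    As mentioned, we can, without loss of generality, assume that $(Z_0,t_0)=(0,0,0)$. For $n\in\mathbb N\cup\{0\}$, we define

    \begin{equation*} r_n=r+(r_0-r)2^{-n},\quad T_n=-r_n^2,\quad k_n=\frac12(1-2^{-n}),\quad u_n=(u-k_n)_+, \end{equation*}

    and

    \begin{equation*} A_n:=\sup\limits_{t\in(T_n,0)}\iint_{B(0,r_n)\times B(0,r_n^3)}u_n^2(\cdot,\cdot,t)\,{\text{d}}X{\text{d}}Y. \end{equation*}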
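    As a small worked computation (our own illustration, reading the definitions above as $r_n=r+(r_0-r)2^{-n}$ and $k_n=\frac12(1-2^{-n})$), the increments that enter the iteration are

    \begin{align*} r_{n-1}-r_n&=(r_0-r)\bigl(2^{-(n-1)}-2^{-n}\bigr)=(r_0-r)2^{-n},\\ k_n-k_{n-1}&=\frac12\bigl(2^{-(n-1)}-2^{-n}\bigr)=2^{-n-1}. \end{align*}

    In particular $r\leq r_n\leq r_0$ with $r_n$ decreasing to $r$, and $k_n$ increases to $\frac12$, so each inverse power of $r_{n-1}-r_n$ contributes a factor $2^n/(r_0-r)$ per power.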

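    By Lemma 3.2 we know that $u_n$ is a weak sub-solution of (1.5). Thus, applying Lemma 3.1, we obtain

    \begin{equation} A_n\leq c\,c_{n-1,n}\iiint_{Q_{r_{n-1}}}u_n^2\,{\text{d}}X{\text{d}}Y{\text{d}}t,\quad n\geq1, \end{equation} (4.7)

    where $c=c(m,\Lambda)\geq1$ and

    \begin{equation} c_{n-1,n}:=\frac{1}{(r_{n-1}-r_n)^2}+\frac{r_{n-1}}{(r_{n-1}-r_n)r_n^2}+\frac{1}{(r_{n-1}-r_n)r_n}+1\leq\frac{2^{2n}}{r^2(r_0-r)^2}. \end{equation} (4.8)

    Now we will estimate the integral on the right-hand side of (4.7). Let $q=2+\frac{1}{2m}$. By Hölder's inequality we have

    \begin{equation} \iiint_{Q_{r_{n-1}}}u_n^2\,{\text{d}}X{\text{d}}Y{\text{d}}t\leq\Bigl(\iiint_{Q_{r_{n-1}}}u_n^q\,{\text{d}}X{\text{d}}Y{\text{d}}t\Bigr)^{\frac2q}|\{u_n>0\}\cap Q_{r_{n-1}}|^{1-\frac2q}. \end{equation} (4.9)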
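    For the reader's convenience, the Hölder step can be spelled out as follows (a sketch; the indicator notation $\mathbf 1_{\{u_n>0\}}$ is ours). Since $u_n^2=u_n^2\,\mathbf 1_{\{u_n>0\}}$, Hölder's inequality with the conjugate exponents $\frac q2$ and $\frac{q}{q-2}$ gives

    \begin{equation*} \iiint_{Q_{r_{n-1}}}u_n^2\,\mathbf 1_{\{u_n>0\}}\,{\text{d}}X{\text{d}}Y{\text{d}}t\leq\Bigl(\iiint_{Q_{r_{n-1}}}u_n^{q}\,{\text{d}}X{\text{d}}Y{\text{d}}t\Bigr)^{\frac2q}\Bigl(\iiint_{Q_{r_{n-1}}}\mathbf 1_{\{u_n>0\}}\,{\text{d}}X{\text{d}}Y{\text{d}}t\Bigr)^{\frac{q-2}{q}}, \end{equation*}

    and $\frac{q-2}{q}=1-\frac2q$, which is positive because $q>2$.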

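    Since $k_n>k_{n-1}$, we get $u_n\leq u_{n-1}$. Using this fact, that $0<r<r_0\leq1$, and Theorem 1.1, we get

    \begin{equation} \Bigl(\iiint_{Q_{r_{n-1}}}u_n^q\,{\text{d}}X{\text{d}}Y{\text{d}}t\Bigr)^{\frac2q}\leq\Bigl(\iiint_{Q_{r_{n-1}}}u_{n-1}^q\,{\text{d}}X{\text{d}}Y{\text{d}}t\Bigr)^{\frac2q}\leq c^2A_{n-2}\leq\Bigl(c(m,\Lambda)\frac{2^{3n}}{r^2(r_0-r)^3}\Bigr)^2A_{n-2}, \end{equation} (4.10)

    for every $n\geq2$, where we have used that

    \begin{equation*} c=c(m,\Lambda)\Bigl(1+\frac{1}{r_{n-2}-r_{n-1}}\Bigr)c_{n-2,n-1}\leq c(m,\Lambda)\frac{2^{3n}}{r^2(r_0-r)^3}, \end{equation*}

    where $c_{n-2,n-1}$ is as defined in (4.8).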

    Next, we observe that

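    \begin{equation*} \iiint_{Q_{r_{n-1}}}u_{n-1}^2\,{\text{d}}X{\text{d}}Y{\text{d}}t\geq\iiint_{\{u_{n-1}\geq2^{-n-1}\}\cap Q_{r_{n-1}}}u_{n-1}^2\,{\text{d}}X{\text{d}}Y{\text{d}}t\geq2^{-2n-2}|\{u_{n-1}\geq2^{-n-1}\}\cap Q_{r_{n-1}}|. \end{equation*}

    Moreover,

    \begin{equation*} |\{u_n>0\}\cap Q_{r_{n-1}}|\leq|\{u_{n-1}\geq k_n-k_{n-1}\}\cap Q_{r_{n-1}}|=|\{u_{n-1}\geq2^{-n-1}\}\cap Q_{r_{n-1}}|. \end{equation*}

    Combining the preceding two estimates and using $0<r<r_0\leq1$, we get

    \begin{equation} |\{u_n>0\}\cap Q_{r_{n-1}}|\leq2^{2n+2}A_{n-1}. \end{equation} (4.11)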
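    The passage to (4.11) can be sketched in one chain (our own expansion, assuming as above that $Q_{r_{n-1}}=B(0,r_{n-1})\times B(0,r_{n-1}^3)\times(-r_{n-1}^2,0)$ and $T_{n-1}=-r_{n-1}^2$). On $\{u_n>0\}$ we have $u>k_n$, hence $u_{n-1}=u-k_{n-1}>k_n-k_{n-1}=2^{-n-1}$, and therefore, by Chebyshev's inequality,

    \begin{align*} |\{u_n>0\}\cap Q_{r_{n-1}}|&\leq|\{u_{n-1}\geq2^{-n-1}\}\cap Q_{r_{n-1}}|\leq2^{2n+2}\iiint_{Q_{r_{n-1}}}u_{n-1}^2\,{\text{d}}X{\text{d}}Y{\text{d}}t\\ &\leq2^{2n+2}\,r_{n-1}^2\sup\limits_{t\in(T_{n-1},0)}\iint_{B(0,r_{n-1})\times B(0,r_{n-1}^3)}u_{n-1}^2(\cdot,\cdot,t)\,{\text{d}}X{\text{d}}Y\leq2^{2n+2}A_{n-1}, \end{align*}

    where the last step uses $r_{n-1}\leq r_0\leq1$.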

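    Using the estimates (4.10) and (4.11) in (4.9), we get

    \begin{equation} \iiint_{Q_{r_{n-1}}}u_n^2\,{\text{d}}X{\text{d}}Y{\text{d}}t\leq c(m,\Lambda)\Bigl(\frac{2^{4n}}{r^2(r_0-r)^3}\Bigr)^2A_{n-2}^{2-\frac2q},\quad n\geq2, \end{equation} (4.12)

    where we have also used that $A_{n-1}\leq A_{n-2}$. Note that the latter is true since $u_{n-1}\leq u_{n-2}$, $r_{n-1}<r_{n-2}$ and $T_{n-2}<T_{n-1}$ for every $n\geq2$. Using (4.12) in (4.7), we obtain

    \begin{equation*} A_n\leq c(m,\Lambda)\frac{2^{12n}}{r^6(r_0-r)^8}A_{n-2}^\alpha, \end{equation*}

    where $\alpha=2-\frac2q>1$, since $q>2$. Therefore, defining $S_n:=A_{2n}$, we get

    \begin{equation*} S_n\leq\beta^nS_{n-1}^\alpha,\quad n\geq1, \end{equation*}

    where

    \begin{equation*} \beta=c(m,\Lambda)\frac{2^{24}}{r^6(r_0-r)^8}. \end{equation*}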

    Recursively we get

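    \begin{equation} S_n\leq\beta^{n+(n-1)\alpha+\dots+\alpha^{n-1}}S_1^{\alpha^{n-1}}\leq\Bigl(\beta^{\frac{\alpha^2}{(\alpha-1)^2}}S_1\Bigr)^{\alpha^{n-1}}\leq\Bigl(c(m,\Lambda)c_{0,1}\beta^{\frac{\alpha^2}{(\alpha-1)^2}}\|u\|^2_{L^2(Q_{r_0})}\Bigr)^{\alpha^{n-1}}, \end{equation} (4.13)

    where we have used (4.7) and the estimate

    \begin{equation*} n+\alpha(n-1)+\dots+\alpha^{n-1}\leq\frac{\alpha^{n+1}}{(\alpha-1)^2}. \end{equation*}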
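    The elementary sum estimate above can be verified as follows (a short derivation added for convenience; it only uses $\alpha>1$ and the series identity $\sum_{j\geq1}jx^j=\frac{x}{(1-x)^2}$ for $|x|<1$, applied with $x=\alpha^{-1}$). Writing the sum as $\sum_{j=1}^{n}j\alpha^{n-j}$, we have

    \begin{equation*} \sum\limits_{j=1}^{n}j\alpha^{n-j}=\alpha^n\sum\limits_{j=1}^{n}j\alpha^{-j}\leq\alpha^n\sum\limits_{j=1}^{\infty}j\alpha^{-j}=\alpha^n\frac{\alpha^{-1}}{(1-\alpha^{-1})^2}=\frac{\alpha^{n+1}}{(\alpha-1)^2}. \end{equation*}

    Since $\frac{\alpha^{n+1}}{(\alpha-1)^2}=\alpha^{n-1}\frac{\alpha^2}{(\alpha-1)^2}$, the accumulated power of $\beta$ (assuming, as in the text, a constant $\beta\geq1$) is at most $\bigl(\beta^{\alpha^2/(\alpha-1)^2}\bigr)^{\alpha^{n-1}}$.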

    Let

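    \begin{equation*} v:=\frac{1}{\sqrt{2c(m,\Lambda)c_{0,1}\beta^{\frac{\alpha^2}{(\alpha-1)^2}}}}\frac{u}{\|u\|_{L^2(Q_{r_0})}}. \end{equation*}

    We observe that

    \begin{equation*} \gamma:=c(m,\Lambda)c_{0,1}\beta^{\frac{\alpha^2}{(\alpha-1)^2}}\|v\|^2_{L^2(Q_{r_0})}=\frac12<1. \end{equation*}

    Note that, by property (ii) in Remark 2.1, $v$ is again a weak sub-solution of (1.5). Thus the estimate (4.13) holds with $u$ replaced by $v$. This fact combined with $\gamma<1$ gives $v\leq\frac12$ a.e. in $Q_r$. As a consequence we get

    \begin{equation*} \sup\limits_{Q_r}u\leq\sqrt{2c(m,\Lambda)c_{0,1}\beta^{\frac{\alpha^2}{(\alpha-1)^2}}}\,\|u\|_{L^2(Q_{r_0})}\leq c\Bigl(\frac{1}{r^2(r_0-r)^3}\Bigr)^{\frac\theta2}\|u\|_{L^2(Q_{r_0})}, \end{equation*}

    for some $c=c(m,\Lambda)\geq1$ and $\theta=\theta(m)>1$. Now, arguing similarly as in the proof of [23, Proposition 12, pages 7-8], the result follows.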

    Using Remark 2.1, Theorem 1.2 along with Lemma 3.6, and following the lines of the proof of [23,Theorem 5,pages 13-14], the result follows.

    Using Remark 2.1, Lemma 3.6, and following the lines of the proof of [23,Theorem 7,pages 14-15], the result follows.

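    The purpose of this section is to prove Theorem 1.5. As $g\in W(U_X\times V_{Y,t})$ we can in the following assume, without loss of generality, that $g\equiv0$.

    In domains of the form $U_X\times U_Y\times I$ instead of $U_X\times V_{Y,t}$, one may attempt different approaches to prove Theorem 1.5, and perhaps the most natural first approach is to add the term $\epsilon\Delta_Y$ to the operator and to instead consider the problem

    \begin{equation} \begin{cases} \nabla_X\cdot(A(\nabla_Xu_\epsilon,X,Y,t))+\epsilon\Delta_Yu_\epsilon-(\partial_t+X\cdot\nabla_Y)u_\epsilon=g^\ast & \text{in }U_X\times U_Y\times I,\\ u_\epsilon=0 & \text{on }\partial_p(U_X\times U_Y\times I). \end{cases} \end{equation} (5.1)

    Here $\partial_p(U_X\times U_Y\times I)$ is now the (standard) parabolic boundary of $U_X\times U_Y\times I$, i.e.,

    \begin{equation*} \partial_p(U_X\times U_Y\times I):=(\partial(U_X\times U_Y)\times\overline{I})\cup((U_X\times U_Y)\times\{0\}). \end{equation*}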

    The existence and uniqueness of weak solutions to (5.1) is classical and one easily deduces that

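    \begin{equation} \||\nabla_Xu_\epsilon|\|^2_{L^2(U_X\times U_Y\times I)}+\epsilon\||\nabla_Yu_\epsilon|\|^2_{L^2(U_X\times U_Y\times I)}\leq c\|g^\ast\|_{L^2_{Y,t}(U_Y\times I,H^{-1}_X(U_X))}\times\||u_\epsilon|+|\nabla_Xu_\epsilon|\|_{L^2(U_X\times U_Y\times I)}, \end{equation} (5.2)

    for some positive constant $c$, independent of $\epsilon$. By the standard Poincaré inequality, applied on $U_X$ to $u_\epsilon(\cdot,Y,t)$ with $(Y,t)$ fixed, we have

    \begin{equation} \|u_\epsilon\|_{L^2(U_X\times U_Y\times I)}\leq c\||\nabla_Xu_\epsilon|\|_{L^2(U_X\times U_Y\times I)}. \end{equation} (5.3)

    Hence, using Cauchy-Schwarz we can conclude that

    \begin{equation} \|u_\epsilon\|^2_{L^2(U_X\times U_Y\times I)}+\||\nabla_Xu_\epsilon|\|^2_{L^2(U_X\times U_Y\times I)}+\epsilon\||\nabla_Yu_\epsilon|\|^2_{L^2(U_X\times U_Y\times I)}\leq c\|g^\ast\|^2_{L^2_{Y,t}(U_Y\times I,H^{-1}_X(U_X))}, \end{equation} (5.4)

    for a constant $c$ which is independent of $\epsilon$. The idea is then to let $\epsilon\to0$ and in this way construct a solution to the problem in (2.7). To make this operational, already in the linear case, $A(\xi,X,Y,t)=A(X,Y,t)\xi$, one seems to need some uniform estimates up to the Kolmogorov boundary $\partial_{\mathcal K}(U_X\times U_Y\times I)$ to get a solution in the limit. In addition, in the nonlinear case considered in this paper we also need to ensure that $\nabla_Xu_\epsilon\to\nabla_Xu$ pointwise a.e. as $\epsilon\to0$, and how to achieve this is even less clear. One approach is to try to adapt the techniques of Boccardo and Murat [6], but it seems unclear how to make this approach operational in our case due to the presence of the term $\epsilon\Delta_Yu_\epsilon$ in the approximating equation.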

    In this paper we will instead prove Theorem 1.5 by using a variational approach recently explored in Albritton-Armstrong-Mourrat-Novack [1] and Litsgård-Nyström [29]. We will prove that the solution to (1.21) can be obtained as the minimizer of a uniformly convex functional. The fact that a parabolic equation can be cast as the first variation of a uniformly convex integral functional was first discovered by Brezis-Ekeland [9,10] and for a modern treatment of this approach, covering uniformly elliptic parabolic equations of second order in the more general context of uniformly monotone operators, we refer to [4] which in turn is closely related to [19], see also [18].

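    To make the approach operational we will use a variational representation of the mapping $\xi\mapsto A(\xi,X,Y,t)$, for each $(X,Y,t)\in\mathbb R^{N+1}$, that we learned from [4] and [5], and we refer to these papers for more background. Indeed, by [5, Theorem 2.9], there exists $\tilde A\in L^\infty_{\rm loc}(\mathbb R^m\times\mathbb R^m\times\mathbb R^{N+1})$ satisfying the following properties, for $\Gamma:=2\Lambda+1$ and for each $(X,Y,t)\in\mathbb R^{N+1}$. First, the mapping

    \begin{equation} (\xi,\eta)\mapsto\tilde A(\xi,\eta,X,Y,t)-\frac{1}{2\Gamma}(|\xi|^2+|\eta|^2)\quad\text{is convex.} \end{equation} (5.5)

    Second, the mapping

    \begin{equation} (\xi,\eta)\mapsto\tilde A(\xi,\eta,X,Y,t)-\frac{\Gamma}{2}(|\xi|^2+|\eta|^2)\quad\text{is concave.} \end{equation} (5.6)

    Third, for every $\xi,\eta\in\mathbb R^m$, we have

    \begin{equation} \tilde A(\xi,\eta,X,Y,t)\geq\xi\cdot\eta, \end{equation} (5.7)

    and

    \begin{equation} \tilde A(\xi,\eta,X,Y,t)=\xi\cdot\eta\iff\eta=A(\xi,X,Y,t). \end{equation} (5.8)

    Note that the choice of $\tilde A$ is in general not unique. Note also that (5.5) and (5.6) imply, in particular, that

    \begin{equation} \frac{1}{2\Gamma}|\xi_1-\xi_2|^2\leq\frac12\tilde A(\xi_1,\eta,X,Y,t)+\frac12\tilde A(\xi_2,\eta,X,Y,t)-\tilde A\Bigl(\frac12\xi_1+\frac12\xi_2,\eta,X,Y,t\Bigr)\leq\frac{\Gamma}{2}|\xi_1-\xi_2|^2. \end{equation} (5.9)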

    To ease the notation we will in the following at instances use the notation

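    \begin{equation*} W:=W(U_X\times V_{Y,t}),\quad W_0:=W_0(U_X\times V_{Y,t}), \end{equation*}

    and we let

    \begin{equation*} {\mathcal L}u:=\nabla_X\cdot(A(\nabla_Xu,X,Y,t))-(\partial_t+X\cdot\nabla_Y)u. \end{equation*}

    Given an arbitrary pair $(f,{\bf j})$ such that

    \begin{equation} f\in L^2_{Y,t}(V_{Y,t},H^1_X(U_X))\quad\text{and}\quad{\bf j}\in(L^2(V_{Y,t},L^2(U_X)))^m, \end{equation} (5.10)

    we introduce

    \begin{equation} {\mathcal J}[f,{\bf j}]:=\iiint_{U_X\times V_{Y,t}}\bigl(\tilde A(\nabla_Xf,{\bf j},X,Y,t)-\nabla_Xf\cdot{\bf j}\bigr)\,{\text{d}}X{\text{d}}Y{\text{d}}t. \end{equation} (5.11)

    Using this notation, and given an arbitrary pair $(f,f^\ast)$ such that

    \begin{equation} f\in L^2_{Y,t}(V_{Y,t},H^1_X(U_X))\quad\text{and}\quad f^\ast,\ f^\ast+(\partial_t+X\cdot\nabla_Y)f\in L^2_{Y,t}(V_{Y,t},H^{-1}_X(U_X)), \end{equation} (5.12)

    we set

    \begin{equation} J[f,f^\ast]:=\inf\iiint_{U_X\times V_{Y,t}}{\mathcal J}[f,{\bf g}]\,{\text{d}}X{\text{d}}Y{\text{d}}t, \end{equation} (5.13)

    where the infimum is taken with respect to the set

    \begin{equation} \bigl\{{\bf g}\in(L^2(V_{Y,t},L^2(U_X)))^m\ \big|\ \nabla_X\cdot{\bf g}=f^\ast+(\partial_t+X\cdot\nabla_Y)f\bigr\}. \end{equation} (5.14)

    The condition

    \begin{equation*} \nabla_X\cdot{\bf g}=f^\ast+(\partial_t+X\cdot\nabla_Y)f, \end{equation*}

    appearing in (5.14), should be interpreted as stating that

    \begin{equation} -\iiint_{U_X\times V_{Y,t}}{\bf g}\cdot\nabla_X\phi\,{\text{d}}X{\text{d}}Y{\text{d}}t=\iint_{V_{Y,t}}\langle f^\ast(\cdot,Y,t)+(\partial_t+X\cdot\nabla_Y)f(\cdot,Y,t),\phi\rangle\,{\text{d}}Y{\text{d}}t, \end{equation} (5.15)

    for all $\phi\in L^2(V_{Y,t},H^1_{X,0}(U_X))$. Finally, for $g^\ast\in L^2_{Y,t}(V_{Y,t},H^{-1}_X(U_X))$ fixed we introduce

    \begin{equation} \mathcal A(g^\ast):=\bigl\{(f,{\bf j})\in W_0\times(L^2(V_{Y,t},L^2(U_X)))^m\ \big|\ \nabla_X\cdot{\bf j}=g^\ast+(\partial_t+X\cdot\nabla_Y)f\bigr\}. \end{equation} (5.16)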

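    Lemma 5.1. Let $g^\ast\in L^2_{Y,t}(V_{Y,t},H^{-1}_X(U_X))$ be fixed and let $\mathcal A(g^\ast)$ be the set introduced in (5.16). Then $\mathcal A(g^\ast)$ is non-empty.

    Proof. Take $f\in W_0$ and consider the equation

    \begin{equation} \Delta_Xv(X,Y,t)=\bigl(g^\ast(X,Y,t)+(\partial_t+X\cdot\nabla_Y)f(X,Y,t)\bigr)\in H^{-1}_X(U_X), \end{equation} (5.17)

    for ${\text{d}}Y{\text{d}}t$-a.e. $(Y,t)\in V_{Y,t}$. By the Lax-Milgram theorem this equation has a (unique) solution $v(\cdot)=v(\cdot,Y,t)\in H^1_{X,0}(U_X)$ and

    \begin{equation} \|\nabla_Xv\|_{L^2_{Y,t}(V_{Y,t},L^2(U_X))}\leq c\|g^\ast+(\partial_t+X\cdot\nabla_Y)f\|_{L^2_{Y,t}(V_{Y,t},H^{-1}_X(U_X))}<\infty, \end{equation} (5.18)

    as $f\in W_0$. In particular,

    \begin{equation} (f,\nabla_Xv)\in\mathcal A(g^\ast), \end{equation} (5.19)

    and hence $\mathcal A(g^\ast)$ is non-empty.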

    Lemma 5.2. The functional J introduced in (5.11) is uniformly convex on A(g).

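    Proof. Note that if $(f,{\bf j})\in\mathcal A(g^\ast)$ and $(\tilde f,\tilde{\bf j})\in\mathcal A(0)$, then $(f+\tilde f,{\bf j}+\tilde{\bf j})\in\mathcal A(g^\ast)$ and $(f-\tilde f,{\bf j}-\tilde{\bf j})\in\mathcal A(g^\ast)$. Consider $(f,{\bf j})\in\mathcal A(g^\ast)$. We first consider the term

    \begin{equation*} \iiint_{U_X\times V_{Y,t}}\nabla_Xf\cdot{\bf j}\,{\text{d}}X{\text{d}}Y{\text{d}}t. \end{equation*}

    We have

    \begin{align} \iiint_{U_X\times V_{Y,t}}\nabla_Xf\cdot{\bf j}\,{\text{d}}X{\text{d}}Y{\text{d}}t&=-\iiint_{U_X\times V_{Y,t}}f\,\nabla_X\cdot{\bf j}\,{\text{d}}X{\text{d}}Y{\text{d}}t\\ &=-\iint_{V_{Y,t}}\langle g^\ast(\cdot,Y,t)+(\partial_t+X\cdot\nabla_Y)f(\cdot,Y,t),f(\cdot,Y,t)\rangle\,{\text{d}}Y{\text{d}}t\\ &=-\iint_{V_{Y,t}}\langle g^\ast(\cdot,Y,t),f(\cdot,Y,t)\rangle\,{\text{d}}Y{\text{d}}t-\iint_{V_{Y,t}}\langle(\partial_t+X\cdot\nabla_Y)f(\cdot,Y,t),f(\cdot,Y,t)\rangle\,{\text{d}}Y{\text{d}}t. \end{align} (5.20)

    Recall that $W_0=W_0(U_X\times V_{Y,t})$ is the closure in the norm of $W(U_X\times V_{Y,t})$ of $C^\infty_{{\mathcal K},0}(\overline{U_X\times V_{Y,t}})$. In particular, there exists $\{f_j\}$, $f_j\in C^\infty_{{\mathcal K},0}(\overline{U_X\times V_{Y,t}})$, such that

    \begin{equation*} \|f-f_j\|_W\to0\mbox{ as }j\to\infty, \end{equation*}

    and consequently

    \begin{equation*} \|(\partial_t+X\cdot\nabla_Y)(f-f_j)\|_{L^2_{Y,t}(V_{Y,t},H^{-1}_X(U_X))}\to0\mbox{ as }j\to\infty. \end{equation*}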

    Using this we see that

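    \begin{equation} \iint_{V_{Y,t}}\langle(\partial_t+X\cdot\nabla_Y)f(\cdot,Y,t),f(\cdot,Y,t)\rangle\,{\text{d}}Y{\text{d}}t\geq\liminf\limits_{j\to\infty}\iint_{V_{Y,t}}\langle(\partial_t+X\cdot\nabla_Y)f_j(\cdot,Y,t),f_j(\cdot,Y,t)\rangle\,{\text{d}}Y{\text{d}}t. \end{equation} (5.21)

    However, using that $f_j\in C^\infty_{{\mathcal K},0}(\overline{U_X\times V_{Y,t}})$ we see that

    \begin{align} \iint_{V_{Y,t}}\langle(\partial_t+X\cdot\nabla_Y)f_j(\cdot,Y,t),f_j(\cdot,Y,t)\rangle\,{\text{d}}Y{\text{d}}t&=\iiint_{U_X\times V_{Y,t}}(\partial_t+X\cdot\nabla_Y)f_j\,f_j\,{\text{d}}X{\text{d}}Y{\text{d}}t\\ &=\frac12\iiint_{U_X\times V_{Y,t}}(\partial_t+X\cdot\nabla_Y)f_j^2\,{\text{d}}X{\text{d}}Y{\text{d}}t\\ &=\frac12\int_{U_X}\iint_{\partial V_{Y,t}}f_j^2\,(X,1)\cdot N_{Y,t}\,{\text{d}}\sigma_{Y,t}{\text{d}}X\geq0, \end{align} (5.22)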

    by the divergence theorem and the definition of the Kolmogorov boundary. Hence,

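    \begin{equation} \iiint_{U_X\times V_{Y,t}}\nabla_Xf\cdot{\bf j}\,{\text{d}}X{\text{d}}Y{\text{d}}t=-\iiint_{U_X\times V_{Y,t}}f\,\nabla_X\cdot{\bf j}\,{\text{d}}X{\text{d}}Y{\text{d}}t\leq-\iint_{V_{Y,t}}\langle g^\ast(\cdot,Y,t),f(\cdot,Y,t)\rangle\,{\text{d}}Y{\text{d}}t. \end{equation} (5.23)

    Using this, and observing that

    \begin{equation} \frac12\iiint_{U_X\times V_{Y,t}}\bigl(\nabla_X(f+\tilde f)\cdot({\bf j}+\tilde{\bf j})+\nabla_X(f-\tilde f)\cdot({\bf j}-\tilde{\bf j})-2\nabla_Xf\cdot{\bf j}\bigr)\,{\text{d}}X{\text{d}}Y{\text{d}}t=\iiint_{U_X\times V_{Y,t}}\nabla_X\tilde f\cdot\tilde{\bf j}\,{\text{d}}X{\text{d}}Y{\text{d}}t, \end{equation} (5.24)

    we can conclude that

    \begin{equation} \frac12\iiint_{U_X\times V_{Y,t}}\bigl(\nabla_X(f+\tilde f)\cdot({\bf j}+\tilde{\bf j})+\nabla_X(f-\tilde f)\cdot({\bf j}-\tilde{\bf j})-2\nabla_Xf\cdot{\bf j}\bigr)\,{\text{d}}X{\text{d}}Y{\text{d}}t\leq0, \end{equation} (5.25)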

    over the set \mathcal{A}(g^\ast) . Hence it suffices to prove that

    \begin{equation*} \iiint_{U_X\times V_{Y, t}} \tilde A(\nabla_Xf, {\bf{j}}, X, Y, t)\, {{\text{d}}} X {{\text{d}}} Y {{\text{d}}} t \end{equation*}

    is uniformly convex over the set \mathcal{A}(g^\ast) . With (f, {\bf{j}})\in \mathcal{A}(g^*) and (\tilde f, \tilde {\bf{j}})\in \mathcal{A}(0) as above, (5.5) implies that

    \begin{equation*} \frac 1 2 \tilde A(\nabla_X (f+\tilde f), {\bf{j}} + \tilde {\bf{j}}, \cdot) + \frac 1 2 \tilde A(\nabla_X (f-\tilde f), {\bf{j}} - \tilde {\bf{j}}, \cdot) - \tilde A(\nabla_X f, {\bf{j}}, \cdot) \ge \frac 1 {2\Gamma} \left( |\nabla_X\tilde f|^2 + |\tilde {\bf{j}}|^2 \right) . \end{equation*}

    We also have

    \begin{align*} \|(\partial_t+X\cdot\nabla_Y) \tilde f\|_{L_{Y, t}^2(V_{Y, t}, H^{-1}(U_X))} \le \|\tilde {\bf{j}}\|_{L^2(U_X\times V_{Y, t})}. \end{align*}

    Thus

    \begin{align*} & \iiint_{U_X\times V_{Y, t}}\biggl (\frac 1 2 \tilde A(\nabla_X (f+\tilde f), {\bf{j}} + \tilde {\bf{j}}, \cdot) + \frac 1 2 \tilde A(\nabla_X (f-\tilde f), {\bf{j}} - \tilde {\bf{j}}, \cdot) - \tilde A(\nabla_X f, {\bf{j}}, \cdot)\biggr )\, {{\text{d}}} X {{\text{d}}} Y {{\text{d}}} t\\ &\geq \frac{1}{4\Gamma} \left( ||\nabla_X\tilde f||^2_{L^2(U_X\times V_{Y, t})}+\|(\partial_t+X\cdot\nabla_Y) \tilde f\|^2_{L_{Y, t}^2(V_{Y, t}, H^{-1}(U_X))} + ||\tilde {\bf{j}}||^2_{L^2(U_X\times V_{Y, t})} \right)\\ &\geq \frac{1}{4c\Gamma} \left( ||\tilde f||^2_{W(U_X\times V_{Y, t})}+||\tilde {\bf{j}}||^2_{L^2(U_X\times V_{Y, t})} \right), \end{align*}

    by using the (standard) Poincaré inequality. Hence {\mathcal J} is uniformly convex on \mathcal{A}(g^*) .
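    For completeness, the algebra behind the treatment of the bilinear term in the proof above can be written out pointwise (a short check added here; the vectors a, \tilde a, b, \tilde b \in \mathbb R^m are our shorthand for \nabla_X f , \nabla_X \tilde f , {\bf{j}} , \tilde {\bf{j}} at a fixed point):

    \begin{equation*} \frac 1 2 \bigl( (a+\tilde a)\cdot(b+\tilde b) + (a-\tilde a)\cdot(b-\tilde b) - 2\, a\cdot b \bigr) = \frac 1 2 \bigl( 2\, a\cdot b + 2\, \tilde a\cdot \tilde b - 2\, a\cdot b \bigr) = \tilde a\cdot \tilde b. \end{equation*}

    The cross terms a\cdot\tilde b and \tilde a\cdot b cancel, so the second difference of the bilinear term depends only on the perturbation (\tilde f, \tilde {\bf{j}}) . The same expansion applied to the quadratic |\xi|^2+|\eta|^2 gives \frac 1 2 |a+\tilde a|^2+\frac 1 2 |a-\tilde a|^2-|a|^2 = |\tilde a|^2 , which is how (5.5) produces the lower bound \frac 1 {2\Gamma}( |\nabla_X\tilde f|^2 + |\tilde {\bf{j}}|^2 ) used in the proof.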

    As the functional {\mathcal J} is uniformly convex over \mathcal{A}(g^\ast) there exists a unique minimizing pair (f_1, {\bf{j}}_1)\in \mathcal{A}(g^\ast) such that

    \begin{align*} (f_1, {\bf{j}}_1): = &\mathop {{\rm{arg}}\;{\rm{min}}}\limits_{(f, {\bf{j}})\in \mathcal{A}(g^\ast)} {\mathcal J}[f, {\bf{j}}]\notag\\ = &\mathop {{\rm{arg}}\;{\rm{min}}}\limits_{(f, {\bf{j}})\in \mathcal{A}(g^\ast)} \iiint_{U_X\times V_{Y, t}} (\tilde A(\nabla_Xf, {\bf{j}}, X, Y, t)-\nabla_Xf\cdot {\bf{j}}) \, {{\text{d}}} X {{\text{d}}} Y {{\text{d}}} t. \end{align*}

    Note that

    \begin{align*} \min\limits_{(f, {\bf{j}})\in \mathcal{A}(g^\ast)} {\mathcal J}[f, {\bf{j}}] = \min\limits_{f\in W_0} J[f, g^*]. \end{align*}

    Moreover, by construction of \tilde A , see (5.7), we have

    \begin{equation} J[f_1, g^\ast] \ge 0. \end{equation} (5.26)

    Lemma 5.3. There is a one-to-one correspondence between weak solutions in the sense of Eq (2.7) to {\mathcal L} u = g^\ast in U_X\times V_{Y, t} , such that u\in W_0 , and null minimizers of J[\cdot, g^\ast] .

    Proof. To prove the lemma we need to prove that for every f \in W_{0} , we have

    \begin{align*} f { \;{\rm{solves}}\; {\mathcal L} u = g^\ast \;{\rm{in}} \;{\rm{the}} \;{\rm{weak}}\; {\rm{sense}} \;{\rm{in}} \;U_X\times V_{Y, t} } \iff J[f, g^\ast] = 0. \end{align*}

    Indeed, the implication " \implies " is clear since if f solves {\mathcal L} u = g^\ast in the weak sense, then

    \begin{equation*} (f, A(\nabla_X f, X, Y, t)) \in \mathcal{A}(g^\ast) \quad \text{ and } \quad {\mathcal J}[f, A(\nabla_X f, X, Y, t)] = 0 = J[f, g^\ast]. \end{equation*}

    Conversely, if J[f, g^\ast] = 0 , then f = f_1 and

    \begin{equation} {\mathcal J}[f_1, {\bf{j}}_1] = \iiint_{U_X\times V_{Y, t}} (\tilde A(\nabla_Xf_1, {\bf{j}}_1, X, Y, t)-\nabla_Xf_1\cdot {\bf{j}}_1) \, {{\text{d}}} X {{\text{d}}} Y {{\text{d}}} t = 0. \end{equation} (5.27)

    Using (5.8), we see that the identity (5.27) implies that

    \begin{equation*} {\bf{j}}_1 = A(\nabla f_1, \cdot, \cdot, \cdot) \quad \text{a.e. in } U_X\times V_{Y, t}, \end{equation*}

    and by the definition of the set \mathcal{A}(g^\ast) ,

    \begin{equation*} \nabla_X\cdot {\bf{j}}_1 = g^\ast+(\partial_t+X\cdot\nabla_Y)f_1. \end{equation*}

    Hence f_1 indeed solves

    \begin{equation*} \nabla_X\cdot A(\nabla f_1, \cdot, \cdot, \cdot) -(\partial_t+X\cdot\nabla_Y)f_1 = g^\ast \end{equation*}

    in the weak sense. That is, f = f_1 is indeed a weak solution of {\mathcal L} u = g^\ast . In particular, it follows that there is at most one solution to {\mathcal L} u = g^* .

    Using (5.26) and Lemma 5.3 we see that to complete the proof of Theorem 1.5 it remains to prove that

    \begin{equation} J[f_1, g^*] \le 0. \end{equation} (5.28)

    In order to do so, we introduce the perturbed convex minimization problem defined, for every f^* \in L_{Y, t}^2(V_{Y, t}, H_X^{-1}(U_X)) , by

    \begin{equation*} G(f^*) : = \inf\limits_{f \in W_{0}}\bigl ( J[f, f^*+g^* ] {- \iint_{ V_{Y, t}} \langle f^*(\cdot, Y, t), f(\cdot, Y, t)\rangle\, {{\text{d}}} Y{{\text{d}}} t\bigr ).} \end{equation*}

    As

    \begin{equation*} G(0) = \inf\limits_{f \in W_{0}} J[f, g^* ], \end{equation*}

    we see that to prove (5.28) it suffices to prove that G(0) \le 0 .

    Lemma 5.4. G is a convex, locally bounded from above and lower semi-continuous functional on L_{Y, t}^2(V_{Y, t}, H_X^{-1}(U_X)) .

    Proof. For every pair (f, {\bf{j}}) \in \mathcal A(f^*+g^*) , we have

    \begin{equation*} \nabla_X \cdot {\bf{j}} = f^*+g^* +(\partial_t+X\cdot\nabla_Y)f, \end{equation*}

    and thus

    \begin{align*} {\mathcal J}[f, {\bf{j}}] & = \iiint_{U_X\times V_{Y, t}} (\tilde A(\nabla_Xf, {\bf{j}}, X, Y, t)-\nabla_Xf\cdot {\bf{j}}) \, {{\text{d}}} X {{\text{d}}} Y {{\text{d}}} t\notag\\ & = \iiint_{U_X\times V_{Y, t}} \tilde A(\nabla_Xf, {\bf{j}}, X, Y, t) \, {{\text{d}}} X {{\text{d}}} Y {{\text{d}}} t{+\iint_{V_{Y, t}} \langle (f^*+g^*)(\cdot, Y, t), f(\cdot, Y, t)\rangle\, {{\text{d}}} Y{{\text{d}}} t}\notag\\ &+\iint_{V_{Y, t}} \langle (\partial_t+X\cdot\nabla_Y)f(\cdot, Y, t), f(\cdot, Y, t)\rangle\, {{\text{d}}} Y{{\text{d}}} t. \end{align*}

    Hence

    \begin{align*} &{\mathcal J}[f, {\bf{j}}] {-\iint_{V_{Y, t}} \langle f^*(\cdot, Y, t), f(\cdot, Y, t)\rangle\, {{\text{d}}} Y{{\text{d}}} t} \notag\\ & = \iiint_{U_X\times V_{Y, t}} \tilde A(\nabla_Xf, {\bf{j}}, X, Y, t) \, {{\text{d}}} X {{\text{d}}} Y {{\text{d}}} t+\iint_{V_{Y, t}} \langle g^*(\cdot, Y, t), f(\cdot, Y, t)\rangle\, {{\text{d}}} Y{{\text{d}}} t\\ &+\iint_{V_{Y, t}} \langle (\partial_t+X\cdot\nabla_Y)f(\cdot, Y, t), f(\cdot, Y, t)\rangle\, {{\text{d}}} Y{{\text{d}}} t. \end{align*}

    Taking the infimum over all (f, {\bf{j}}) satisfying the affine constraint (f, {\bf{j}}) \in \mathcal A(f^*+g^*) we obtain the quantity G(f^*) , i.e., G(f^*) can be expressed as

    \begin{equation*} G(f^*) = \inf\limits_{(f, {\bf{j}}):\ (f, {\bf{j}}) \in \mathcal A(f^*+g^*)}\bigl ( {\mathcal J}[f, {\bf{j}}] { - \iint_{V_{Y, t}} \langle f^*(\cdot, Y, t), f(\cdot, Y, t)\rangle\, {{\text{d}}} Y{{\text{d}}} t} \bigr ). \end{equation*}

    In particular, G(f^*) can be expressed as the infimum of

    \begin{align} &\iiint_{U_X\times V_{Y, t}} \tilde A(\nabla_Xf, {\bf{j}}, X, Y, t) \, {{\text{d}}} X {{\text{d}}} Y {{\text{d}}} t+\iint_{V_{Y, t}} \langle g^*(\cdot, Y, t), f(\cdot, Y, t)\rangle\, {{\text{d}}} Y {{\text{d}}} t\\ &+\iint_{V_{Y, t}} \langle (\partial_t+X\cdot\nabla_Y)f(\cdot, Y, t), f(\cdot, Y, t)\rangle\, {{\text{d}}} Y{{\text{d}}} t \end{align} (5.29)

    with respect to (f, {\bf{j}}) such that (f, {\bf{j}}) \in \mathcal A(f^*+g^*) . We now recall the argument in (5.21) and (5.22). In particular, given f\in W_0 there exists \{f_j\} , f_j\in C^\infty_{{\mathcal K}, 0}(\overline{U_X\times V_{Y, t}}) such that

    \begin{equation} ||f-f_j||_W\to 0\mbox{ as }j\to \infty, \end{equation} (5.30)

    and consequently

    ||(\partial_t+X\cdot\nabla_Y)(f-f_j)||_{L_{Y, t}^2(V_{Y, t}, {H}_X^{-1}(U_X))}\to 0\mbox{ as }j\to \infty.

    Using (5.21) and (5.22) we have

    \begin{equation*} \begin{split} &\iint_{V_{Y, t}} \langle(\partial_t+X\cdot\nabla_Y)f(\cdot, Y, t), f(\cdot, Y, t)\rangle \, {{\text{d}}} Y{{\text{d}}} t\\ & = \lim\limits_{j\to\infty} \frac 12 \int_{U_X}\iint_{\partial V_{Y, t}} f_j^2|(X, 1)\cdot N_{Y, t}| \, {{\text{d}}} \sigma_{Y, t}{{\text{d}}} X. \end{split} \end{equation*}

    We get the same limit in the preceding display regardless of which sequence \{f_j\} is chosen, as long as (5.30) holds. Now consider f, g\in W_0 and let \{f_j\} , f_j\in C^\infty_{{\mathcal K}, 0}(\overline{U_X\times V_{Y, t}}) , \{g_j\} , g_j\in C^\infty_{{\mathcal K}, 0}(\overline{U_X\times V_{Y, t}}) , be such that

    \begin{equation} ||f-f_j||_W+||g-g_j||_W\to 0\mbox{ as }j\to \infty. \end{equation} (5.31)

    Then

    \begin{equation} ||(\tau f+(1-\tau)g)-(\tau f_j+(1-\tau)g_j)||_W\to 0\mbox{ as }j\to \infty, \end{equation} (5.32)

    for all \tau\in [0, 1] . Hence

    \begin{align} &\iint_{V_{Y, t}} \langle(\partial_t+X\cdot\nabla_Y)(\tau f+(1-\tau)g)(\cdot, Y, t), (\tau f+(1-\tau)g)(\cdot, Y, t)\rangle \, {{\text{d}}} Y{{\text{d}}} t\\ & = \lim\limits_{j\to\infty} \frac 12 \int_{U_X}\iint_{\partial V_{Y, t}} (\tau f_j+(1-\tau)g_j)^2|(X, 1)\cdot N_{Y, t}| \, {{\text{d}}} \sigma_{Y, t}{{\text{d}}} X\\ &\leq\lim\limits_{j\to\infty} \frac 12 \int_{U_X}\iint_{\partial V_{Y, t}} \tau f_j^2|(X, 1)\cdot N_{Y, t}| \, {{\text{d}}} \sigma_{Y, t}{{\text{d}}} X\\ &+\lim\limits_{j\to\infty} \frac 12 \int_{U_X}\iint_{\partial V_{Y, t}} (1-\tau)g_j^2|(X, 1)\cdot N_{Y, t}| \, {{\text{d}}} \sigma_{Y, t}{{\text{d}}} X, \end{align} (5.33)

    and we deduce that

    \begin{align} &\iint_{V_{Y, t}} \langle(\partial_t+X\cdot\nabla_Y)(\tau f+(1-\tau)g)(\cdot, Y, t), (\tau f+(1-\tau)g)(\cdot, Y, t)\rangle \, {{\text{d}}} Y{{\text{d}}} t\\ &\leq\tau\iint_{V_{Y, t}} \langle(\partial_t+X\cdot\nabla_Y)f(\cdot, Y, t), f(\cdot, Y, t)\rangle \, {{\text{d}}} Y{{\text{d}}} t\\ &+(1-\tau)\iint_{V_{Y, t}} \langle(\partial_t+X\cdot\nabla_Y)g(\cdot, Y, t), g(\cdot, Y, t)\rangle \, {{\text{d}}} Y{{\text{d}}} t. \end{align} (5.34)

    In particular, we can conclude that the mapping

    f\mapsto \iint_{V_{Y, t}} \langle(\partial_t+X\cdot\nabla_Y)f(\cdot, Y, t), f(\cdot, Y, t)\rangle \, {{\text{d}}} Y{{\text{d}}} t

    is convex on W_0 . Using this, and (5.5), we see that the expression in (5.29) is convex as a function of (f, f^*, {\bf{j}}) and this proves that G is convex. Furthermore, using (5.17)–(5.19) we can conclude that the infimum of the expression in (5.29) is finite, hence G(f^*) < \infty . In particular, the function G is locally bounded from above. These two properties imply that G is lower semi-continuous, see [17,Lemma 2.1 and Corollary 2.2].

    We denote by G^* the convex dual of G , defined for every

    h \in (L_{Y, t}^2(V_{Y, t}, H_X^{-1}(U_X)))^\ast = L_{Y, t}^2(V_{Y, t}, H_{X, 0}^1(U_X)),

    as

    \begin{equation*} G^*(h) : = \sup\limits_{f^* \in L_{Y, t}^2(V_{Y, t}, H_X^{-1}(U_X))} \bigl( -G(f^*) + \iint_{V_{Y, t}} \langle f^*(\cdot, Y, t), h(\cdot, Y, t)\rangle\, {{\text{d}}} Y{{\text{d}}} t \bigr). \end{equation*}

    Let G^{**} be the bidual of G . Since G is lower semi-continuous, we have that G^{**} = G (see [17,Proposition 4.1]), and in particular,

    \begin{equation*} G(0) = G^{**}(0) = \sup\limits_{h \in L_{Y, t}^2(V_{Y, t}, H_{X, 0}^1(U_X))} \bigl( -G^*(h) \bigr) . \end{equation*}

    In order to prove that G(0) \le 0 , it therefore suffices to show that

    \begin{equation} G^*(h) \ge 0\mbox{ for all } h \in L_{Y, t}^2(V_{Y, t}, H_{X, 0}^1(U_X)). \end{equation} (5.35)
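    The reduction to (5.35) can be spelled out in one line (a sketch that only unwinds the definition of the bidual at the point 0 , using the duality pairing from the definition of G^* ). Since the pairing of 0 with any h vanishes,

    \begin{equation*} G^{**}(0) = \sup\limits_{h \in L_{Y, t}^2(V_{Y, t}, H_{X, 0}^1(U_X))} \bigl( \iint_{V_{Y, t}} \langle 0, h(\cdot, Y, t)\rangle\, {{\text{d}}} Y{{\text{d}}} t - G^*(h) \bigr) = \sup\limits_{h \in L_{Y, t}^2(V_{Y, t}, H_{X, 0}^1(U_X))} \bigl( -G^*(h) \bigr). \end{equation*}

    Hence, if G^*(h) \ge 0 for every h , then each term -G^*(h) is non-positive and therefore G(0) = G^{**}(0) \le 0 .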

    To continue we note that we can rewrite G^*(h) as

    \begin{equation} \begin{split} G^*(h) = \sup\limits_{(f, {\bf{j}}, f^*)} &\bigg\{ \iiint_{U_X\times V_{Y, t}} -(\tilde A(\nabla_X f, {\bf{j}}, \cdot, \cdot, \cdot)-(\nabla_X f\cdot {\bf{j}})) \, {{\text{d}}} X {{\text{d}}} Y {{\text{d}}} t\\ &+\iint_{V_{Y, t}} {\langle f^*(\cdot, Y, t), (h(\cdot, Y, t)+f(\cdot, Y, t))\rangle}\, {{\text{d}}} Y {{\text{d}}} t\bigg \}, \end{split} \end{equation} (5.36)

    where the supremum is taken with respect to

    (f, {\bf{j}}, f^*)\in W_{0}\times (L_{Y, t}^2(V_{Y, t}, L^2_X(U_X)))^m\times L_{Y, t}^2(V_{Y, t}, H_X^{-1}(U_X)),

    subject to the constraint

    \begin{equation} \nabla_X\cdot {\bf{j}} = f^* + g^* +(\partial_t+X\cdot\nabla_Y)f. \end{equation} (5.37)

    Furthermore, note that for every h \in L_{Y, t}^2(V_{Y, t}, H_{X, 0}^1(U_X)) , we have G^*(h) \in \mathbb R \cup \{+\infty\} .

    Lemma 5.5. Consider h \in L_{Y, t}^2(V_{Y, t}, H_{X, 0}^1(U_X)) . Then

    \begin{equation} G^*(h) < +\infty \quad \implies \quad h \in W\cap L_{Y, t}^2(V_{Y, t}, H_{X, 0}^1(U_X)). \end{equation} (5.38)

    Proof. To prove the lemma we need to prove that (\partial_t+X\cdot\nabla_Y)h \in L_{Y, t}^2(V_{Y, t}, H_{X}^{-1}(U_X)) . Using that we take a supremum in the definition of G^\ast we can develop lower bounds on G^\ast by restricting the set with respect to which we take the supremum. Here, for f\in W_0 , we choose to restrict the supremum to (f, {\bf{j}}, f^*) where {\bf{j}} = {\bf{j}}_0 is a solution of \nabla_X\cdot {\bf{j}}_0 = g^* and f^* : = -(\partial_t+X\cdot\nabla_Y)f . Recall from (5.19) that such a {\bf{j}}_0 \in (L_{Y, t}^2(V_{Y, t}, L^2_X(U_X)))^m exists. With these choices for {\bf{j}} and f^* , the constraint (5.37) is satisfied, and we obtain that

    \begin{align*} G^*(h)\geq \sup\limits_{f\in W_0} &\biggl\{ \iiint_{U_X\times V_{Y, t}} -(\tilde A(\nabla_X f, {\bf{j}}_0, \cdot, \cdot, \cdot)-(\nabla_X f\cdot {\bf{j}}_0)) \, {{\text{d}}} X {{\text{d}}} Y {{\text{d}}} t\notag\\ &-\iint_{V_{Y, t}} {\langle (\partial_t+X\cdot\nabla_Y)f(\cdot, Y, t), (h(\cdot, Y, t) + f(\cdot, Y, t))\rangle}\, {{\text{d}}} Y{{\text{d}}} t\biggr \}. \end{align*}

    Consider f\in C^\infty_{{\mathcal K}, 0}(\overline{U_X\times V_{Y, t}})\subset W_0 . Then, again arguing as in (5.21), (5.22),

    \begin{align*} -\iint_{V_{Y, t}} \langle (\partial_t+X\cdot\nabla_Y)f(\cdot, Y, t), f(\cdot, Y, t)\rangle\, {{\text{d}}} Y{{\text{d}}} t\leq 0. \end{align*}

    Furthermore, restricting to f\in C^\infty_{0}({U_X\times V_{Y, t}})\subset C^\infty_{{\mathcal K}, 0}(\overline{U_X\times V_{Y, t}}) yields by the same argument that

    \begin{align*} -\iint_{V_{Y, t}} \langle (\partial_t+X\cdot\nabla_Y)f(\cdot, Y, t), f(\cdot, Y, t)\rangle\, {{\text{d}}} Y{{\text{d}}} t = 0. \end{align*}

    Hence we have the lower bound

    \begin{align*} G^*(h)\geq \sup &\biggl\{ \iiint_{U_X\times V_{Y, t}} -(\tilde A(\nabla_X f, {\bf{j}}_0, \cdot, \cdot, \cdot)-(\nabla_X f\cdot {\bf{j}}_0)) \, {{\text{d}}} X {{\text{d}}} Y {{\text{d}}} t\notag\\ &-\iint_{V_{Y, t}} {\langle (\partial_t+X\cdot\nabla_Y)f(\cdot, Y, t), h(\cdot, Y, t)\rangle}\, {{\text{d}}} Y{{\text{d}}} t\biggr \}, \end{align*}

    where the supremum now is taken with respect to f\in C_0^\infty(U_X\times V_{Y, t})\subset C^\infty_{{\mathcal K}, 0}(\overline{U_X\times V_{Y, t}}) . Moreover, as G^*(h) < +\infty , we have that

    \begin{align*} &-\iint_{V_{Y, t}} \langle (\partial_t+X\cdot\nabla_Y)f(\cdot, Y, t), h(\cdot, Y, t)\rangle\, {{\text{d}}} Y{{\text{d}}} t\notag\\ &\leq \iiint_{U_X\times V_{Y, t}} (\tilde A(\nabla_X f, {\bf{j}}_0, \cdot, \cdot, \cdot)-(\nabla_X f\cdot {\bf{j}}_0)) \, {{\text{d}}} X {{\text{d}}} Y {{\text{d}}} t +G^*(h) < \infty, \end{align*}

    for every f\in C_0^\infty(U_X\times V_{Y, t}) fixed. Note that by replacing f with -f in the above argument we also obtain a lower bound. In particular,

    \sup \ \biggl |\iint_{V_{Y, t}} \langle (\partial_t+X\cdot\nabla_Y)h(\cdot, Y, t), f(\cdot, Y, t)\rangle\, {{\text{d}}} Y{{\text{d}}} t\biggr | < \infty,

    where the supremum is taken over f \in C_0^\infty(U_X\times V_{Y, t}) such that ||f||_{L_{Y, t}^2(V_{Y, t}, H_{X, 0}^1(U_X))}\leq 1 . Using that C_0^\infty(U_X\times V_{Y, t}) is dense in L_{Y, t}^2(V_{Y, t}, H_{X, 0}^1(U_X)) we can conclude that

    (\partial_t+X\cdot\nabla_Y)h\in L_{Y, t}^2(V_{Y, t}, H_X^{-1}(U_X))

    and this observation proves (5.38).

    By Lemma 5.5, in place of (5.35) it suffices to prove that

    \begin{equation} \qquad G^*(h) \ge 0\mbox{ for all }h\in W\cap L_{Y, t}^2(V_{Y, t}, H_{X, 0}^1(U_X)). \end{equation} (5.39)

    Furthermore, note that for \tilde h\in W\cap C_{X, 0}^\infty(\overline{U_X\times V_{Y, t}}) we have

    \begin{equation} G^*(h) \geq G^*(\tilde h) - \|f^*\|_{L^2_{Y, t}(V_{Y, t}, H_X^{-1}(U_X))} \| h-\tilde h \|_{L^2_{Y, t}(V_{Y, t}, H^1_{X}(U_X))}. \end{equation} (5.40)

    As we are to establish a lower bound on G^* , we may restrict to taking the supremum over f^* such that

    \begin{equation} \|f^*\|_{L_{Y, t}^2(V_{Y, t}, H_X^{-1}(U_X))}\leq 1. \end{equation} (5.41)

    In Lemma 5.6 below we prove that

    G^*(h) \ge 0\mbox{ for all }h\in W\cap C_{X, 0}^\infty(\overline{U_X\times V_{Y, t}}).

    By combining this with (5.40) and (5.41) we see that

    G^*(h) \geq G^*(\tilde h) - \| h-\tilde h \|_{L^2_{Y, t}(V_{Y, t}, H^1_{X, 0}(U_X))}\geq - \| h-\tilde h \|_{L^2_{Y, t}(V_{Y, t}, H^1_{X, 0}(U_X))},

    for all \tilde h \in W\cap C^\infty_{X, 0}(\overline{U_X\times V_{Y, t}}) . Furthermore, by the definitions of W , and L_{Y, t}^2(V_{Y, t}, H_{X, 0}^1(U_X)) , we can choose a sequence h_j\in W\cap C^\infty_{X, 0}(\overline{U_X\times V_{Y, t}}) such that

    \lim\limits_{j\rightarrow \infty} \| h-h_j \|_{L^2_{Y, t}(V_{Y, t}, H^1_{X, 0}(U_X))} = 0.

    Hence, to prove that G^*(h)\geq 0 , and thereby to complete the proof of existence in Theorem 1.5, it remains to prove the following lemma.

    Lemma 5.6.

    \begin{equation} \qquad G^*(h) \ge 0\;{{for\; all}}\;h\in W\cap C^\infty_{X, 0}(\overline{U_X\times V_{Y, t}}). \end{equation} (5.42)

    Proof. To start the proof of the lemma we first note that we have, as f\in W_0 , that

    (\partial_t+X\cdot\nabla_Y)f \in L_{Y, t}^2(V_{Y, t}, H_X^{-1}(U_X)),

    and hence we can replace f^* by f^* - (\partial_t+X\cdot\nabla_Y)f in the variational formula (5.36) for G^* to get

    \begin{align*} G^*(h) \geq \sup\limits_{(f, {\bf{j}}, f^*)} &\biggl\{ \iiint_{U_X\times V_{Y, t}} -(\tilde A(\nabla_X f, {\bf{j}}, \cdot, \cdot, \cdot)-(\nabla_X f\cdot {\bf{j}})) \, {{\text{d}}} X {{\text{d}}} Y {{\text{d}}} t\notag\\ &+\iint_{V_{Y, t}} {\langle (f^* - (\partial_t+X\cdot\nabla_Y)f)(\cdot, Y, t), (h(\cdot, Y, t) + f(\cdot, Y, t))\rangle}\, {{\text{d}}} Y{{\text{d}}} t\biggr \}, \end{align*}

    where the supremum now is taken with respect to

    \begin{align} (f, {\bf{j}}, f^*)\in (W\cap C^\infty_{{\mathcal K}, 0}(\overline{U_X\times V_{Y, t}}))\times (L_{Y, t}^2(V_{Y, t}, L^2_X(U_X)))^m\times L_{Y, t}^2(V_{Y, t}, H_X^{-1}(U_X)), \end{align} (5.43)

    subject to the constraint

    \begin{equation} \nabla_X\cdot {\bf{j}} = f^* + g^*. \end{equation} (5.44)

    Next using that f\in C^\infty_{{\mathcal K}, 0}(\overline{U_X\times V_{Y, t}}) , h\in C^\infty_{X, 0}(\overline{U_X\times V_{Y, t}}) , we have

    \begin{align*} &\iint_{V_{Y, t}} -\langle (\partial_t+X\cdot\nabla_Y)f(\cdot, Y, t), (h(\cdot, Y, t)+f(\cdot, Y, t))\rangle\, {{\text{d}}} Y{{\text{d}}} t\notag\\ & = \iint_{V_{Y, t}} \langle (\partial_t+X\cdot\nabla_Y)h(\cdot, Y, t), f(\cdot, Y, t)\rangle\, {{\text{d}}} Y{{\text{d}}} t\notag\\ &\quad {-\int_{U_X}\iint_{\partial V_{Y, t}} \bigl (\frac 1 2 f^2+fh\bigr )(X, 1)\cdot N_{Y, t}\, {{\text{d}}}\sigma_{Y, t}{{\text{d}}} X.} \end{align*}

    Using the identity in the last display we see that

    \begin{align} G^*(h)\geq \sup\limits_{(f, {\bf{j}}, f^*)} &\biggl\{ \iiint_{U_X\times V_{Y, t}} -(\tilde A(\nabla_X f, {\bf{j}}, \cdot, \cdot, \cdot)-(\nabla_X f\cdot {\bf{j}})) \, {{\text{d}}} X {{\text{d}}} Y {{\text{d}}} t\\ &+\iint_{V_{Y, t}} \langle f^*, (h(\cdot, Y, t)+f(\cdot, Y, t))\rangle +\langle (\partial_t+X\cdot\nabla_Y)h(\cdot, Y, t), f(\cdot, Y, t)\rangle\, {{\text{d}}} Y{{\text{d}}} t\\ &-\int_{U_X}\iint_{\partial V_{Y, t}} \bigl (\frac 1 2 f^2+fh)(X, 1)\cdot N_{Y, t}\, {{\text{d}}}\sigma_{Y, t}{{\text{d}}} X\biggr \}, \end{align} (5.45)

    where the supremum still is with respect to (f, {\bf{j}}, f^*) as in (5.43) subject to (5.44). Now, by arguing exactly as in the passage between displays (3.23) and (3.26) in [29], using the properties of \tilde A , we can conclude that it suffices to prove that \tilde G^*(h)\geq 0 where

    \begin{align*} \tilde G^*(h) : = \sup\limits_{(\tilde f, {\bf{j}}, f^*, b)} &\biggl\{ \iiint_{U_X\times V_{Y, t}} -(\tilde A(\nabla_X \tilde f, {\bf{j}}, \cdot, \cdot, \cdot)-(\nabla_X \tilde f\cdot {\bf{j}})) \, {{\text{d}}} X {{\text{d}}} Y {{\text{d}}} t\notag\\ &+\iint_{V_{Y, t}} \langle f^*, (h(\cdot, Y, t)+\tilde f(\cdot, Y, t))\rangle +\langle (\partial_t+X\cdot\nabla_Y)h(\cdot, Y, t), \tilde f(\cdot, Y, t)\rangle\, {{\text{d}}} Y{{\text{d}}} t\notag\\ &-\int_{U_X}\iint_{\partial V_{Y, t}} \bigl (\frac 1 2 b^2+bh)(X, 1)\cdot N_{Y, t}\, {{\text{d}}}\sigma_{Y, t}{{\text{d}}} X\biggr \}, \end{align*}

    and where the supremum is taken with respect to all (\tilde f, {\bf{j}}, f^*, b) in the set

    \begin{align*} (W\cap C^\infty_{X, 0}(\overline{U_X\times V_{Y, t}}))\times (L_{Y, t}^2(V_{Y, t}, L^2_X(U_X)))^m\times L_{Y, t}^2(V_{Y, t}, H_X^{-1}(U_X))\times C^\infty_{{\mathcal K}, 0}(\overline{U_X\times V_{Y, t}}), \end{align*}

    subject to the condition stated in [29], i.e., that

    \Gamma(\tilde f, b): = ||\tilde f||_{L_{Y, t}^2(V_{Y, t}, H_X^1(U_X))}+||b||_{L_{Y, t}^2(V_{Y, t}, H_X^1(U_X))}\leq\Gamma

    for some large but fixed \Gamma\geq 1 . However, this implies that \tilde f: = -h is an admissible function. With this choice of \tilde f , we then let {\bf{j}}: = A(-\nabla_X h, X, Y, t) \in (L_{Y, t}^2(V_{Y, t}, L^2_X(U_X)))^m and then

    f^* = \nabla_X\cdot {\bf{j}} -g^*\in L_{Y, t}^2(V_{Y, t}, H_X^{-1}(U_X)).

    Using this we deduce that

    \begin{align*} \tilde G^*(h) \geq \sup\limits_{b} &\biggl\{-\int_{U_X}\iint_{\partial V_{Y, t}} \frac 1 2 (b+h)^2(X, 1)\cdot N_{Y, t}\, {{\text{d}}}\sigma_{Y, t}{{\text{d}}} X\biggr \}, \end{align*}

    where the supremum is now taken with respect to b\in C^\infty_{{\mathcal K}, 0}(\overline{U_X\times V_{Y, t}}) . Using Lemma 5.7 below it follows that

    \begin{align*} \sup\limits_{b} &\biggl\{-\int_{U_X}\iint_{\partial V_{Y, t}} \frac 1 2 (b+h)^2(X, 1)\cdot N_{Y, t}\, {{\text{d}}}\sigma_{Y, t}{{\text{d}}} X\biggr \}\geq 0. \end{align*}

    The proof of the lemma is therefore complete.
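
    For the reader's convenience, here is a sketch of the computation behind the lower bound for \tilde G^*(h) obtained with the choice \tilde f = -h . It assumes, as in the variational framework of [29], that \tilde A(\xi, {\bf{j}}, X, Y, t) = \xi\cdot{\bf{j}} whenever {\bf{j}} = A(\xi, X, Y, t) ; this property of \tilde A is introduced outside the present passage and is taken for granted here. With \tilde f = -h and {\bf{j}} = A(-\nabla_X h, X, Y, t) the first integrand in the definition of \tilde G^* then vanishes, and so does the pairing with f^* , since h+\tilde f = 0 . Integrating by parts in (Y, t) , the remaining interior term becomes

    \begin{align*} \iint_{V_{Y, t}} \langle (\partial_t+X\cdot\nabla_Y)h(\cdot, Y, t), -h(\cdot, Y, t)\rangle\, {{\text{d}}} Y{{\text{d}}} t = -\int_{U_X}\iint_{\partial V_{Y, t}} \frac 1 2 h^2\, (X, 1)\cdot N_{Y, t}\, {{\text{d}}}\sigma_{Y, t}{{\text{d}}} X. \end{align*}

    Adding the boundary term already present in \tilde G^* and completing the square gives

    \begin{align*} -\bigl (\frac 1 2 h^2+\frac 1 2 b^2+bh\bigr ) = -\frac 1 2 (b+h)^2, \end{align*}

    which is the integrand in the supremum over b .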

    Lemma 5.7. Assume that h\in W(U_X\times V_{Y, t})\cap C^\infty_{X, 0}(\overline{U_X\times V_{Y, t}}) . Then

    \begin{align} \sup\limits_{b\in W\cap C^\infty_{{\mathcal K}, 0} (U_X\times V_{Y, t})} -\iiint_{U_X\times\partial V_{Y, t}} {(b+h)^2}(X, 1)\cdot N_{Y, t}\, {{\text{d}}}\sigma_{Y, t}{{\text{d}}} X\geq 0. \end{align} (5.46)

    Lemma 5.7 is Lemma 3.7 in [29], and in the next subsection we supply parts of the proof for completeness.

    Let \psi(s)\in C^\infty(\mathbb R) be such that 0\leq \psi \leq 1 ,

    \begin{equation*} \psi \equiv 1\ \mbox{on }[ 0, 1], \ \psi \equiv 0\ \mbox{on }[ 2, \infty), \end{equation*}

    |\psi'|\leq 2 and such that \sqrt{1-\psi^2}\in C^\infty(\mathbb R) . Based on \psi we introduce, for 0\leq r < \infty ,

    \begin{equation} \psi_r(X, Y, t) : = \psi\biggl( r\, \frac{\big((X, 1)\cdot N_{Y, t}\big)^+}{1+|X|^2} \biggr), \end{equation} (5.47)

    where we use the notation s^+: = \max\lbrace s, 0 \rbrace for s\in \mathbb R . As h is smooth, and U_X and V_{Y, t} are bounded domains, we have

    \begin{equation} \iiint_{U_X\times \partial V_{Y, t}} h^2|(X, 1)\cdot N_{Y, t}|{{\text{d}}} X{{\text{d}}} \sigma_{Y, t} < \infty. \end{equation} (5.48)

    Let, for any r\geq 0 ,

    \begin{equation} {b_r : = (\psi_r-1)h.} \end{equation} (5.49)

    As in the proof of Lemma 3.7 in [29] it follows that

    \begin{equation} b_r\in W(U_X\times V_{Y, t}). \end{equation} (5.50)

    By construction, b_r vanishes on \partial_{\mathcal K}(U_X\times V_{Y, t}) . Together with (5.50), this yields that b_r\in W\cap C^\infty_{{\mathcal K}, 0}(\overline{U_X\times V_{Y, t}}) . Furthermore,

    \begin{equation*} \begin{split} -\iiint_{U_X\times\partial V_{Y, t}} {(b_r+h)^2}(X, 1)\cdot N_{Y, t}\, {{\text{d}}}\sigma_{Y, t}{{\text{d}}} X & = -\iiint_{U_X\times\partial V_{Y, t}} \psi_r^2h^2(X, 1)\cdot N_{Y, t}\, {{\text{d}}}\sigma_{Y, t}{{\text{d}}} X. \end{split} \end{equation*}

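    Letting r\to \infty , and using dominated convergence together with (5.48), we see that

    \begin{align*} &\lim\limits_{r\rightarrow \infty} -\iiint_{U_X\times\partial V_{Y, t}} \psi_r^2h^2(X, 1)\cdot N_{Y, t}\, {{\text{d}}}\sigma_{Y, t}{{\text{d}}} X\\ & = \iiint_{U_X\times\partial V_{Y, t}} h^2\big(-(X, 1)\cdot N_{Y, t}\big)^+\, {{\text{d}}}\sigma_{Y, t}{{\text{d}}} X\geq 0, \end{align*}

    since \psi_r\to 1 where (X, 1)\cdot N_{Y, t}\leq 0 and \psi_r\to 0 where (X, 1)\cdot N_{Y, t} > 0 . As each b_r is admissible in (5.46), this proves Lemma 5.7.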
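
    A sketch of why the limit can be taken, writing s: = (X, 1)\cdot N_{Y, t} for brevity (a shorthand used only in this remark): if s\leq 0 then s^+ = 0 and \psi_r = \psi(0) = 1 for every r , while if s > 0 then \psi_r = 0 as soon as r\geq 2(1+|X|^2)/s , because \psi\equiv 0 on [2, \infty) . Moreover 0\leq\psi_r\leq 1 , so that

    \begin{equation*} |\psi_r^2h^2 s|\leq h^2|s| \quad\mbox{and}\quad \psi_r^2h^2 s\to h^2 s\, {\bf 1}_{\{s\leq 0\}}\quad\mbox{pointwise as }r\to\infty. \end{equation*}

    The majorant h^2|s| is integrable by (5.48), so dominated convergence applies and

    \begin{equation*} \lim\limits_{r\rightarrow \infty} -\iiint_{U_X\times\partial V_{Y, t}} \psi_r^2h^2 s\, {{\text{d}}}\sigma_{Y, t}{{\text{d}}} X = -\iiint_{U_X\times\partial V_{Y, t}} h^2 s\, {\bf 1}_{\{s\leq 0\}}\, {{\text{d}}}\sigma_{Y, t}{{\text{d}}} X\geq 0, \end{equation*}

    the sign being clear since s\leq 0 on the set where the integrand is non-zero.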

    Retracing the argument we see by (5.26) and (5.28) that

    \begin{equation} J[f_1, g^*] = 0\mbox{ for some }f_1\in W_0. \end{equation} (5.51)

    Using Lemma 5.3 we can conclude that f_1\in W_0 is the unique weak solution to {\mathcal L} u = g^\ast in U_X\times V_{Y, t} in the sense of Eq (2.7). This completes the proof of the existence and uniqueness part of Theorem 1.5. The quantitative estimate follows in the standard way.

    Assume that u\in W(U_X\times V_{Y, t}) is a weak sub-solution to the equation

    \begin{equation} \nabla_X\cdot(A(\nabla_X u, X, Y, t))-(\partial_t+X\cdot\nabla_Y)u = g^* \text{ in } \ U_X\times V_{Y, t}. \end{equation} (5.52)

    By definition this means in particular that u\in W(U_X\times V_{Y, t}) . Given u we now let v\in W(U_X\times V_{Y, t}) be the unique weak solution to the problem

    \begin{equation} \begin{cases} \nabla_X\cdot(A(\nabla_X v, X, Y, t))-(\partial_t+X\cdot\nabla_Y)v = g^* &\text{in} \ U_X\times V_{Y, t}, \\ v = u & \text{on} \ \partial_{\mathcal K}(U_X\times V_{Y, t}), \end{cases} \end{equation} (5.53)

    in the sense that

    \begin{eqnarray} v\in W(U_X\times V_{Y, t}), \ (v-u)\in W_0(U_X\times V_{Y, t}), \end{eqnarray} (5.54)

    and in the sense that (2.7) holds for all \phi\in L_{Y, t}^2(V_{Y, t}, H_{X, 0}^1(U_X)) . By Theorem 1.5 v exists and is unique. We want to prove that u\leq v a.e. in U_X\times V_{Y, t} . To achieve this we let \epsilon > 0 be arbitrary and we use the test function \phi = (u-v-\epsilon)^+ . Then \phi is a non-negative admissible test function and \phi = 0 on \partial_{\mathcal K}(U_X\times V_{Y, t}) . Hence,

    \begin{align} &\iiint_{U_X\times V_{Y, t}}A(\nabla_X u, X, Y, t)\cdot\nabla_X\phi\, {{\text{d}}} X {{\text{d}}} Y {{\text{d}}} t\\ &+\iint_{V_{Y, t}}\ \langle g^\ast(\cdot, Y, t)+ (\partial_t+X\cdot\nabla_Y)u(\cdot, Y, t), \phi(\cdot, Y, t)\rangle\, {{\text{d}}} Y {{\text{d}}} t\leq 0, \end{align} (5.55)

    and

    \begin{align} &\iiint_{U_X\times V_{Y, t}}A(\nabla_X v, X, Y, t)\cdot\nabla_X\phi\, {{\text{d}}} X {{\text{d}}} Y {{\text{d}}} t\\ &+\iint_{V_{Y, t}}\ \langle g^\ast(\cdot, Y, t)+ (\partial_t+X\cdot\nabla_Y)v(\cdot, Y, t), \phi(\cdot, Y, t)\rangle\, {{\text{d}}} Y {{\text{d}}} t = 0. \end{align} (5.56)

    Subtracting these relations, we get

    \begin{align} &\iiint_{U_X\times V_{Y, t}}(A(\nabla_X v, X, Y, t)-A(\nabla_X u, X, Y, t))\cdot\nabla_X\phi\, {{\text{d}}} X {{\text{d}}} Y {{\text{d}}} t\\ &+\iint_{V_{Y, t}}\langle(\partial_t+X\cdot\nabla_Y)(v-u)(\cdot, Y, t), \phi(\cdot, Y, t)\rangle\, {{\text{d}}} Y {{\text{d}}} t\geq 0. \end{align} (5.57)

    Using the property (1.8)- (ii) , we now first note that

    \begin{align} &\iiint_{U_X\times V_{Y, t}}(A(\nabla_X v, X, Y, t)-A(\nabla_X u, X, Y, t))\cdot\nabla_X\phi\, {{\text{d}}} X {{\text{d}}} Y {{\text{d}}} t\\ &\leq -\Lambda^{-1}\iiint_{U_X\times V_{Y, t}}|\nabla_X(u-v-\epsilon)^+|^2\, {{\text{d}}} X {{\text{d}}} Y {{\text{d}}} t. \end{align} (5.58)
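
    To indicate how (5.58) arises, we record a brief sketch. It assumes that (1.8)-(ii), which is stated earlier in the paper and not reproduced here, is the monotonicity condition (A(\xi, X, Y, t)-A(\eta, X, Y, t))\cdot(\xi-\eta)\geq \Lambda^{-1}|\xi-\eta|^2 . Since \nabla_X\phi = \nabla_X(u-v) a.e. on \{u-v > \epsilon\} and \nabla_X\phi = 0 a.e. elsewhere, we have, pointwise a.e.,

    \begin{align*} (A(\nabla_X v, X, Y, t)-A(\nabla_X u, X, Y, t))\cdot\nabla_X\phi & = -(A(\nabla_X u, X, Y, t)-A(\nabla_X v, X, Y, t))\cdot(\nabla_X u-\nabla_X v)\, {\bf 1}_{\{u-v > \epsilon\}}\\ &\leq -\Lambda^{-1}|\nabla_X (u- v)|^2\, {\bf 1}_{\{u-v > \epsilon\}} = -\Lambda^{-1}|\nabla_X(u-v-\epsilon)^+|^2, \end{align*}

    and integrating over U_X\times V_{Y, t} gives (5.58).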

    Second, again using the definition of W(U_X\times V_{Y, t}) and that (v-u)\in W_0(U_X\times V_{Y, t}) , we see that there exists a sequence \{f_j\} , f_j\in C_{{\mathcal K}, 0}^\infty(\overline{U_X\times V_{Y, t}}) , such that

    \begin{align} &\iint_{V_{Y, t}}\langle(\partial_t+X\cdot\nabla_Y)(v-u)(\cdot, Y, t), \phi(\cdot, Y, t)\rangle\, {{\text{d}}} Y {{\text{d}}} t\\ & = -\lim\limits_{j\to\infty} \iiint_{U_X\times V_{Y, t}}(\partial_t+X\cdot\nabla_Y)(f_j-\epsilon)^+(f_j-\epsilon)^+\, {{\text{d}}} X {{\text{d}}} Y {{\text{d}}} t\\ & = -\frac 12\lim\limits_{j\to\infty} \iint_{U_X\times \partial V_{Y, t}}((f_j-\epsilon)^+)^2\, (X, 1)\cdot N_{Y, t}{{\text{d}}} \sigma_{Y, t}{{\text{d}}} X\leq 0, \end{align} (5.59)

    as f_j = 0 on \partial_{\mathcal K}(U_X\times V_{Y, t}) . Hence, combining (5.57)–(5.59) we conclude that

    \begin{align} \iiint_{U_X\times V_{Y, t}}|\nabla_X(u-v-\epsilon)^+|^2\, {{\text{d}}} X {{\text{d}}} Y {{\text{d}}} t\leq 0. \end{align} (5.60)
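
    The combination can be written out in one line (a sketch using only the three displays cited): inserting the upper bounds (5.58) and (5.59) for the two terms on the left-hand side of (5.57) gives

    \begin{align*} 0\leq -\Lambda^{-1}\iiint_{U_X\times V_{Y, t}}|\nabla_X(u-v-\epsilon)^+|^2\, {{\text{d}}} X {{\text{d}}} Y {{\text{d}}} t+0, \end{align*}

    and multiplying by \Lambda > 0 yields (5.60).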

    Finally, using, for a.e. (Y, t)\in V_{Y, t} , the Poincaré inequality on U_X we deduce from (5.60) that

    \begin{align} \iiint_{U_X\times V_{Y, t}}|(u-v-\epsilon)^+|^2\, {{\text{d}}} X {{\text{d}}} Y {{\text{d}}} t\leq 0. \end{align} (5.61)
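
    In more detail, and as a sketch only: since \phi = (u-v-\epsilon)^+ is an admissible test function, \phi(\cdot, Y, t)\in H_{X, 0}^1(U_X) for a.e. (Y, t)\in V_{Y, t} . Writing C_P for a Poincaré constant of the bounded domain U_X (a notation introduced here for illustration), we have

    \begin{align*} \int_{U_X}|\phi(X, Y, t)|^2\, {{\text{d}}} X\leq C_P\int_{U_X}|\nabla_X\phi(X, Y, t)|^2\, {{\text{d}}} X\quad\mbox{for a.e. }(Y, t)\in V_{Y, t}. \end{align*}

    Integrating in (Y, t) over V_{Y, t} and using (5.60), the right-hand side vanishes, which is (5.61).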

    Hence (u-v-\epsilon)^+ = 0 a.e. in U_X\times V_{Y, t} and hence u\leq v+\epsilon a.e. in U_X\times V_{Y, t} . Since \epsilon > 0 is arbitrary, u\leq v a.e. in U_X\times V_{Y, t} . We can conclude that we have proved the following theorem.

    Theorem 5.1. Let u\in W(U_X\times V_{Y, t}) be a weak sub-solution to the equation in (5.52) in the sense of Definition 4. Given u , let v\in W(U_X\times V_{Y, t}) be the unique weak solution to the problem in (5.53) in the sense of Definition 4. Then u\leq v a.e. in U_X\times V_{Y, t} . Similarly, if u\in W(U_X\times V_{Y, t}) is a weak super-solution to the equation in (5.52) in the sense of Definition 4, then v\leq u a.e. in U_X\times V_{Y, t} .

    In this paper we have initiated the study of weak solutions, and their regularity, for what we call nonlinear Kolmogorov-Fokker-Planck type equations. We believe that there are many directions to pursue in this field and in the following we formulate a number of problems.

    Let p , 1 < p < \infty , be given and let A = A(\xi, X, Y, t): \mathbb R^m\times \mathbb R^m\times \mathbb R^m\times \mathbb R\to \mathbb R^m be continuous with respect to \xi , and measurable with respect to X, Y and t . Assume that there exists a finite constant \Lambda\geq 1 such that

    \begin{align} \Lambda^{-1}|\xi|^p\leq A(\xi, X, Y, t)\cdot\xi\leq \Lambda|\xi|^p \end{align} (6.1)

    for almost every (X, Y, t)\in \mathbb R^{N+1} and for all \xi\in\mathbb{R}^m . Given A and p we introduce the operator \mathcal{L}_{A, p} through

    \begin{eqnarray} \mathcal{L}_{A, p}u: = \nabla_X\cdot (A(\nabla_X u(X, Y, t), X, Y, t))-(\partial_t+X\cdot\nabla_Y)u(X, Y, t). \end{eqnarray} (6.2)

    This defines a class of strongly degenerate nonlinear parabolic PDEs modelled on the classical PDE of Kolmogorov and the p -Laplace operator, and to our knowledge there is currently no literature devoted to these operators. The results established in this paper concern \mathcal{L}_{A, 2} assuming that A\in M(\Lambda) or A\in R(\Lambda) . We see a number of interesting research problems.

    Problem 1: Establish existence and uniqueness of weak solutions to the Dirichlet problem

    \begin{align} \begin{cases} \mathcal{L}_{A, p}u = g^*, \quad & \text{ in } U_X\times V_{Y, t}, \\ u = g , \quad & \text{ on }\partial_{\mathcal K}(U_X\times V_{Y, t}). \end{cases} \end{align} (6.3)

    Problem 2: Prove higher integrability, local boundedness, Harnack inequalities and local Hölder continuity of weak solutions for the equation \mathcal{L}_{A, p}u = 0 in the case p\neq 2 . This is a challenging problem and the first step is probably to figure out how to replace the result of Bouchut [7], or the use of the fundamental solution constructed by Kolmogorov, in this case. The problem is already very interesting for the prototype

    \begin{align} \nabla_X\cdot(|\nabla_X u(X, Y, t)|^{p-2}\nabla_X u(X, Y, t))-(\partial_t+X\cdot\nabla_Y)u(X, Y, t) = 0. \end{align} (6.4)
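
    As a quick consistency check (a worked example, not part of the results of the paper), the prototype (6.4) corresponds to A(\xi, X, Y, t) = |\xi|^{p-2}\xi , with A(0, X, Y, t) = 0 , which is continuous in \xi for 1 < p < \infty and satisfies

    \begin{align*} A(\xi, X, Y, t)\cdot\xi = |\xi|^{p-2}\xi\cdot\xi = |\xi|^p, \end{align*}

    so that (6.1) holds with \Lambda = 1 . For p = 2 this gives A(\xi, X, Y, t) = \xi and, in the sign convention of (6.2),

    \begin{align*} \mathcal{L}_{A, 2}u = \Delta_X u-(\partial_t+X\cdot\nabla_Y)u, \end{align*}

    which is the linear Kolmogorov-type operator on which the class is modelled.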

    Problem 3: Consider the equation in (6.4). Prove bounds for \nabla_Xu and local Hölder continuity of \nabla_Xu . Note that this must be a difficult problem in the nonlinear setting due to the lack of ellipticity in the variable Y . Again, the right place to start is probably to (simply) consider the equation

    \nabla_X\cdot(A(\nabla_Xu))-(\partial_t+X\cdot\nabla_Y)u = 0,

    where A(\xi) has linear growth, i.e., a nonlinear p = 2 case.

    Finally, we discuss the very formulation of the Dirichlet problem. Consider the geometry of U_X\times V_{Y, t} and let \Gamma: = \partial U_X\times V_{Y, t} and

    \begin{align} \Sigma^+&: = \{(X, Y, t)\in \overline{U_X}\times \partial V_{Y, t}\mid (X, 1)\cdot N_{Y, t} > 0\}, \\ \Sigma_0&: = \{(X, Y, t)\in \overline{U_X}\times \partial V_{Y, t}\mid (X, 1)\cdot N_{Y, t} = 0\}, \\ \Sigma^-&: = \{(X, Y, t)\in \overline{U_X}\times \partial V_{Y, t}\mid (X, 1)\cdot N_{Y, t} < 0\}. \end{align} (6.5)

    Using this notation \partial_{\mathcal K}(U_X\times V_{Y, t}) = \Gamma\cup \Sigma^- . Recall that W(U_X\times V_{Y, t}) is defined as the closure of C^\infty(\overline{U_X\times V_{Y, t}}) in the norm

    \begin{align} ||u||_{W(U_X\times V_{Y, t})}&: = \bigl (||u||_{L_{Y, t}^2(V_{Y, t}, H_X^1(U_X))}^2+||(\partial_t+X\cdot\nabla_Y)u||_{L_{Y, t}^2(V_{Y, t}, {H}_X^{-1}(U_X))}^2\bigr )^{1/2}. \end{align} (6.6)
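
    To illustrate the decomposition (6.5), consider the hypothetical special case of a cylinder V_{Y, t} = V_Y\times (0, T) , where V_Y\subset\mathbb R^m is a bounded Lipschitz domain with outer unit normal N_Y and T > 0 ; this choice of V_{Y, t} , and the interpretation of N_{Y, t} as the outer unit normal to V_{Y, t} , are assumptions made only for this example. Then

    \begin{align*} (X, 1)\cdot N_{Y, t} = \begin{cases} 1 &\text{on } V_Y\times\{T\}, \text{ where } N_{Y, t} = (0, 1), \\ -1 &\text{on } V_Y\times\{0\}, \text{ where } N_{Y, t} = (0, -1), \\ X\cdot N_Y &\text{on } \partial V_Y\times (0, T), \text{ where } N_{Y, t} = (N_Y, 0). \end{cases} \end{align*}

    Hence the top of the cylinder belongs to \Sigma^+ , the bottom belongs to \Sigma^- , and a lateral point (X, Y, t) belongs to \Sigma^+ , \Sigma_0 or \Sigma^- according to the sign of X\cdot N_Y(Y) . In particular, which lateral points carry boundary data depends on the variable X .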

    Assuming u\in W(U_X\times V_{Y, t}) , it is relevant to define and study the trace of u to \Gamma\cup \Sigma^+\cup \Sigma_0\cup\Sigma^- . We let B^{2, 2}_{1/2}(\partial U_X) denote the Besov space defined as the trace space of H_X^1(U_X) to \partial U_X (this space is often denoted H^{1/2}(\partial U_X) in the literature). It is well known that, if U_X is a bounded Lipschitz domain, then there exists a bounded continuous non-injective operator T:H_X^1(U_X)\to B^{2, 2}_{1/2}(\partial U_X) , called the trace operator, and a bounded continuous operator E: B^{2, 2}_{1/2}(\partial U_X)\to H_X^1(U_X) , called the extension operator. The trace space of L_{Y, t}^2(V_{Y, t}, H_X^1(U_X)) on \Gamma is therefore L_{Y, t}^2(V_{Y, t}, B^{2, 2}_{1/2}(\partial U_X)) and

    ||u||_{L_{Y, t}^2(V_{Y, t}, B^{2, 2}_{1/2}(\partial U_X))}\leq c||u||_{W(U_X\times V_{Y, t})}.

    The trace to \Sigma^+\cup \Sigma_0\cup\Sigma^- is less clear. Indeed, recall that the space W_{X, 0}(U_X\times V_{Y, t}) is defined as the closure in the norm of W(U_X\times V_{Y, t}) of C_{X, 0}^\infty(\overline{U_X\times V_{Y, t}}) . In particular, given f\in W_{X, 0}(U_X\times V_{Y, t}) there exists \{f_j\} , f_j\in C_{X, 0}^\infty(\overline{U_X\times V_{Y, t}}) such that

    ||f-f_j||_{W(U_X\times V_{Y, t})}\to 0\mbox{ as }j\to \infty,

    and consequently,

    ||(\partial_t+X\cdot\nabla_Y)(f-f_j)||_{L_{Y, t}^2(V_{Y, t}, {H}_X^{-1}(U_X))}\to 0\mbox{ as }j\to \infty.

    Using this we see that

    \begin{align} &\iint_{V_{Y, t}} \langle(\partial_t+X\cdot\nabla_Y)f(\cdot, Y, t), f(\cdot, Y, t)\rangle \, {{\text{d}}} Y{{\text{d}}} t\\ & = \lim\limits_{j\to\infty} \iint_{V_{Y, t}} \langle(\partial_t+X\cdot\nabla_Y)f_j(\cdot, Y, t), f_j(\cdot, Y, t)\rangle \, {{\text{d}}} Y{{\text{d}}} t\\ & = \frac 12\lim\limits_{j\to\infty} \iint_{\Sigma^+\cup \Sigma_0\cup\Sigma^-}f_j^2(X, Y, t)\, (X, 1)\cdot N_{Y, t}{{\text{d}}} \sigma_{Y, t}. \end{align} (6.7)
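
    For smooth f_j the last equality in (6.7) is the divergence theorem in the variables (Y, t) . As a sketch, with \nabla_{Y, t} denoting the gradient in (Y, t) (a notation used only here): the pairing is then an integral over U_X , we have (\partial_t+X\cdot\nabla_Y)f_j\, f_j = \frac 12 (\partial_t+X\cdot\nabla_Y)(f_j^2) , and for fixed X the vector field (X, 1) is constant, hence divergence free, in (Y, t) . Therefore

    \begin{align*} \iiint_{U_X\times V_{Y, t}} (\partial_t+X\cdot\nabla_Y)f_j\, f_j\, {{\text{d}}} X {{\text{d}}} Y {{\text{d}}} t & = \frac 12\int_{U_X}\iint_{V_{Y, t}} \nabla_{Y, t}\cdot\bigl ((X, 1) f_j^2\bigr )\, {{\text{d}}} Y {{\text{d}}} t\, {{\text{d}}} X\\ & = \frac 12\int_{U_X}\iint_{\partial V_{Y, t}} f_j^2\, (X, 1)\cdot N_{Y, t}\, {{\text{d}}} \sigma_{Y, t}{{\text{d}}} X. \end{align*}

    No contribution from \Gamma appears, since the operator \partial_t+X\cdot\nabla_Y differentiates only in (Y, t) .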

    The first obstruction to a trace inequality is that (X, 1)\cdot N_{Y, t}{{\text{d}}} \sigma_{Y, t} is a signed measure on \Sigma^+\cup \Sigma_0\cup\Sigma^- . Assuming that f\in W_{0}(U_X\times V_{Y, t}) we deduce that

    \begin{align} &\iint_{V_{Y, t}} \langle(\partial_t+X\cdot\nabla_Y)f(\cdot, Y, t), f(\cdot, Y, t)\rangle \, {{\text{d}}} Y{{\text{d}}} t \end{align} (6.8)
    \begin{align} & = \frac 12\lim\limits_{j\to\infty} \iint_{\Sigma^+\cup \Sigma_0}f_j^2(X, Y, t)\, (X, 1)\cdot N_{Y, t}{{\text{d}}} \sigma_{Y, t}. \end{align} (6.9)

    Hence, in this case

    \begin{align} \lim\limits_{j\to\infty} \iint_{\Sigma^+\cup \Sigma_0}f_j^2(X, Y, t)\, (X, 1)\cdot N_{Y, t}{{\text{d}}} \sigma_{Y, t}\leq c||f||_{W(U_X\times V_{Y, t})}^2, \end{align} (6.10)

    and we see that we can extract a subsequence of \{f_j\} converging in L^2(K, (X, 1)\cdot N_{Y, t}{{\text{d}}} \sigma_{Y, t}) whenever K is a compact subset of \Sigma^+ . At the expense of additional notation the roles of \Sigma^+ and \Sigma^- can be interchanged in this argument. This observation highlights the difficulty concerning the possibility of a trace inequality and concerning the identification of the trace space for W(U_X\times V_{Y, t}) . This explains why, in this paper as in [29], we have used the weaker formulation of the Dirichlet problem introduced earlier.
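
    The bound on the boundary integrals can be traced to a duality estimate. As a sketch, assuming the norm of H_X^{-1}(U_X) is normalized so that |\langle F, \varphi\rangle|\leq ||F||_{H_X^{-1}(U_X)}||\varphi||_{H_X^1(U_X)} , the Cauchy-Schwarz inequality and the definition (6.6) give

    \begin{align*} \biggl |\iint_{V_{Y, t}} \langle(\partial_t+X\cdot\nabla_Y)f(\cdot, Y, t), f(\cdot, Y, t)\rangle \, {{\text{d}}} Y{{\text{d}}} t\biggr | &\leq ||(\partial_t+X\cdot\nabla_Y)f||_{L_{Y, t}^2(V_{Y, t}, {H}_X^{-1}(U_X))}\, ||f||_{L_{Y, t}^2(V_{Y, t}, H_X^1(U_X))}\\ &\leq \frac 12 ||f||_{W(U_X\times V_{Y, t})}^2. \end{align*}

    Note that both sides are quadratic in f , so the norm of f necessarily enters squared in any estimate of this type.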

    Problem 4: What function space is the space of traces, to \Gamma\cup \Sigma^+\cup \Sigma_0\cup\Sigma^- , of W(U_X\times V_{Y, t}) ?

    The authors declare no conflict of interest.

    [1] Adrian T, Ashcraft AB (2016) Shadow banking: A review of the literature. Staff Rep 6: 282–315.
    [2] Barth J, Joo S, Kim H, et al. (2018) Forecasting net charge-off rates of banks: A PLS approach. Unpublished Manuscript.
    [3] Barth JR, Miller SM (2017) A primer on the evolution and complexity of bank regulatory capital standards. Unpublished Manuscript.
    [4] Bastos JA (2010) Forecasting bank loans loss-given-default. J Banking Finance 34: 2510–2517. doi: 10.1016/j.jbankfin.2010.04.011
    [5] Bernoth K, Pick A (2011) Forecasting the fragility of the banking and insurance sectors. J Banking Finance 35: 807–818. doi: 10.1016/j.jbankfin.2010.10.024
    [6] Covas FB, Rump B, Zakrajšek E (2014) Stress-testing US bank holding companies: A dynamic panel quantile regression approach. Int J Forecasting 30: 691–713. doi: 10.1016/j.ijforecast.2013.11.003
    [7] Crook J, Banasik J (2012) Forecasting and explaining aggregate consumer credit delinquency behaviour. Int J Forecasting 28: 145–160. doi: 10.1016/j.ijforecast.2010.12.002
    [8] Drehmann M, Juselius M (2014) Evaluating early warning indicators of banking crises: Satisfying policy requirements. Int J Forecasting 30: 759–780. doi: 10.1016/j.ijforecast.2013.10.002
    [9] Fitzpatrick BD, Reichmeier J, Dowell J (2017) Back to the future: The Landscape of the Financial Services Industry 2020 and Beyond. J Adv Econ Finance 2: 40–53.
    [10] Geladi P, Kowalski BR (1986) Partial least-squares regression: A tutorial. Anal Chim Acta 185: 1–17. doi: 10.1016/0003-2670(86)80028-9
    [11] Guerrieri L, Welch M (2012) Can macro variables used in stress testing forecast the performance of banks? Unpublished Manuscript.
    [12] Hirtle B, Kovner A, Vickery J, et al. (2016) Assessing financial stability: The capital and loss assessment under stress scenarios (CLASS) model. J Banking Finance 69: S35–S55. doi: 10.1016/j.jbankfin.2015.09.021
    [13] Hyndman RJ, Koehler AB (2006) Another look at measures of forecast accuracy. Int J Forecasting 22: 679–688. doi: 10.1016/j.ijforecast.2006.03.001
    [14] Jakšič M, Marinč M (2017) Relationship banking and information technology: The role of artificial intelligence and FinTech. Risk Manage 2017: 1–18.
    [15] Kupiec P (2018) Inside the black box: The accuracy of alternative stress test models. Unpublished Manuscript.
    [16] Luttrell D, Atkinson T, Rosenblum H (2013) Assessing the costs and consequences of the 2007–2009 financial crisis and its aftermath. Econ Lett 8: 1–4.
    [17] Pesaran MH (2006) Estimation and inference in large heterogeneous panels with a multifactor error structure. Econometrica 74: 967–1012. doi: 10.1111/j.1468-0262.2006.00692.x
    [18] Roy AD (1952) Safety first and the holding of assets. Econometrica 20: 431–449. doi: 10.2307/1907413
    [19] Tibshirani JR (1996) Regression shrinkage and selection via the Lasso. J R Stat Soc 58: 267–288.
    [20] Zou H, Hastie T (2010) Regularization and variable selection via the elastic net. J R Stat Soc 67: 301–320.
  • © 2018 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)