LangMoDHS: A deep learning language model for predicting DNase I hypersensitive sites in mouse genome

Xingyu Tang; Peijie Zheng; Yuewu Liu; Yuhua Yao; Guohua Huang

doi:10.3934/mbe.2023048

Mathematical Biosciences and Engineering

2023, Volume 20, Issue 1: 1037-1057. doi: 10.3934/mbe.2023048

Research article

1. School of Electrical Engineering, Shaoyang University, Shaoyang 422000, China
2. College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China
3. School of Mathematics and Statistics, Hainan Normal University, Haikou 571158, China

Academic Editor: Leyi Wei

Received: 22 July 2022 Revised: 12 September 2022 Accepted: 18 September 2022 Published: 24 October 2022

DNase I hypersensitive sites (DHSs) are specific genomic regions that are critical for detecting and understanding cis-regulatory elements. Although many methods have been developed to detect DHSs, a substantial gap remains in practice. We presented a deep learning-based language model for predicting DHSs, named LangMoDHS. LangMoDHS mainly comprised a convolutional neural network (CNN), a bi-directional long short-term memory (Bi-LSTM) network and feed-forward attention. The CNN and the Bi-LSTM were stacked in parallel, which helped to accumulate multiple-view representations from primary DNA sequences. We conducted 5-fold cross-validation and independent tests over 14 tissues and 4 developmental stages. The experiments showed that LangMoDHS is competitive with or slightly better than iDHS-Deep, the latest method for predicting DHSs. They also implied a substantial contribution of the CNN, the Bi-LSTM and the attention to DHS prediction. We implemented LangMoDHS as a user-friendly web server, accessible at http://www.biolscience.cn/LangMoDHS/. We used indices related to information entropy to explore the sequence motifs of DHSs, and the analysis provided some insight into the DHSs.

Citation: Xingyu Tang, Peijie Zheng, Yuewu Liu, Yuhua Yao, Guohua Huang. LangMoDHS: A deep learning language model for predicting DNase I hypersensitive sites in mouse genome[J]. Mathematical Biosciences and Engineering, 2023, 20(1): 1037-1057. doi: 10.3934/mbe.2023048

Abstract

Dedicated to Professor Neil S. Trudinger on the occasion of his 80th birthday.

1. Introduction

One classical problem in convex geometry is the Minkowski problem, which is to find convex hypersurfaces in $\mathbb{R}^{n+1}$ whose Gaussian curvature is prescribed as a function defined on $\mathbb{S}^n$ in terms of the inverse Gauss map. It has been settled by the works of Minkowski ^[23], Alexandrov ^[1], Fenchel and Jessen ^[28], Nirenberg ^[25], Pogorelov ^[26], Cheng and Yau ^[3], etc. In the smooth category, the Minkowski problem is equivalent to solving the following Monge-Ampère equation

${\rm{det}}(\nabla^2u+ug_{\mathbb{S}^n}) = f\quad {\rm{on}}\,\,\mathbb{S}^n,$

where $u$ is the support function of the convex hypersurface, $\nabla^2u+ug_{\mathbb{S}^n}$ the spherical Hessian matrix of the function $u$ . If we take an orthonormal frame on ${\mathbb{S}^n}$ , the spherical Hessian of $u$ is $W_u(x): = u_{ij}(x)+u(x)\delta_{ij}$ , whose eigenvalues are actually the principal radii of the hypersurface.

The general problem of finding a convex hypersurface, whose $k$ -th symmetric function of the principal radii is the prescribed function on its outer normals for $1\le k < n$ , is often called the Christoffel-Minkowski problem. It corresponds to finding convex solutions of the nonlinear Hessian equation

$\sigma_k(W_u) = f\quad {\rm{on}}\,\,\mathbb{S}^n.$

This problem was settled by Guan et al. ^[14,15]. In ^[16], Guan and Zhang considered the following mixed Hessian equation

$\begin{equation} \sigma_k(W_u(x))+\alpha(x)\sigma_{k-1}(W_u(x)) = \sum\limits_{l = 0}^{k-2}\alpha_l(x)\sigma_{l}(W_u(x)),\quad x\in\mathbb{S}^n, \end{equation}$

(1.1)

where $\alpha(x), \alpha_l(x)\, (0\le l\le k-1)$ are some functions on $\mathbb{S}^n$ . By imposing some group-invariant conditions on those coefficient functions as in ^[11], the authors proved the existence of solutions.

Let $M$ be a hypersurface of Euclidean space ${\mathbb R}^{n+1}$ and $M = {{\rm{graph}}}\, u$ in a neighbourhood of some point at which we calculate. Let $A$ be the second fundamental form of $M$ , $\lambda(A) = (\lambda_1, \cdots, \lambda_n)\in {\mathbb R}^n$ the eigenvalues of $A$ with respect to the induced metric of $M\subset {\mathbb R}^{n+1}$ , i.e., the principal curvatures of $M$ , and $\sigma_k(\lambda)$ the $k$ -th elementary symmetric function, with $\sigma_0(\lambda) = 1$ . It is natural to study prescribing curvature problems from this point of view. In the 1980s, Caffarelli, Nirenberg and Spruck studied the prescribing Weingarten curvature problem. The problem is equivalent to solving the following equation

$\sigma_k(\lambda)(X) = f(X), \quad \,X\in\mathcal{M}.$

When $k = n$ , the problem is just the Minkowski problem; when $k = 1$ , it is the prescribing mean curvature problem, cf. ^[30,33]. The prescribing Weingarten curvature problem has been studied by many authors; we refer to ^[2,9,11,12,13,29,37] and references therein for related works. Recently, Zhou ^[36] generalised the above mixed prescribed Weingarten curvature equation. He obtained interior gradient estimates for

$\begin{equation} \sigma_k(A)+\alpha(x)\sigma_{k-1}(A) = \sum\limits_{l = 0}^{k-2}\alpha_l(x)\sigma_{l}(A),\quad x\in B_r(0)\subset\mathbb{R}^n \end{equation}$

(1.2)

where $\sigma_k(A): = \sigma_k(\lambda(A))$ , and the coefficients satisfy $\alpha_{k-2} > 0$ and $\alpha_l\geq 0$ for $0\leq l\leq k-3$ .

Equations of mixed Hessian type arise naturally from many important geometric problems. One example is the so-called Fu-Yau equation arising from the study of the Hull-Strominger system in theoretical physics, which can be written as a linear combination of the first and the second elementary symmetric functions

$\begin{equation} \sigma_1(i\partial\bar\partial(e^u+\alpha'e^{-u}))+\alpha'\sigma_2(i\partial\bar\partial u) = \phi \end{equation}$

(1.3)

on $n$ -dimensional compact Kähler manifolds. There has been much recent work related to this equation; see ^[6,7,27] for example. Another important example is the special Lagrangian equation introduced by Harvey and Lawson ^[18], which can be written as an alternating combination of elementary symmetric functions

$\sin\theta\Big(\sum\limits_{k = 0}^{[\frac{n}{2}]}(-1)^k\sigma_{2k}(D^2u)\Big)+\cos\theta\Big(\sum\limits_{k = 0}^{[\frac{n-1}{2}]}(-1)^k\sigma_{2k+1}(D^2u)\Big) = 0.$

This equation is equivalent to

$F(D^2u): = \arctan\lambda_1+\cdots+\arctan\lambda_n = \theta$

where $\lambda_i$ 's are the eigenvalues of $D^2u$ . It is called supercritical if $\theta\in(\frac{(n-2)\pi}{2}, \frac{n\pi}{2})$ and hypercritical if $\theta\in(\frac{(n-1)\pi}{2}, \frac{n\pi}{2})$ . The Lagrangian phase operator $F$ is concave in the hypercritical case and has convex level sets in the supercritical case, while in general $F$ fails to be concave. In the subcritical case, i.e., $0\leq \theta < \frac{(n-2)\pi}{2}$ , solutions of the special Lagrangian equation can fail to have interior estimates ^[24,35]. Jacob-Yau ^[20] initiated the study of the deformed Hermitian Yang-Mills (dHYM) equation on a compact Kähler manifold $(M, \omega)$ :

${\rm{Re}}(\chi_u+\sqrt{-1}\omega)^n = \cot\theta_0{\rm{Im}}(\chi_u+\sqrt{-1}\omega)^n,$

where $\chi$ is a closed real $(1, 1)$ -form, $\chi_u = \chi+\sqrt{-1}\partial\bar\partial u$ , $\theta_0$ is the argument of the complex number $\int_{M}(\chi+\sqrt{-1}\omega)^n$ , and $u$ is the unknown real smooth function on $M$ . Jacob-Yau showed that the dHYM equation has an equivalent form as a special Lagrangian equation. Collins-Jacob-Yau ^[5] solved the dHYM equation by the continuity method and Fu-Zhang ^[8] gave an alternative approach via the dHYM flow, both in the supercritical case. For more results concerning the dHYM equation and the special Lagrangian equation, one can consult Han-Jin ^[17], Chu-Lee ^[4] and the references therein. Note that for $n = 3$ and hypercritical $\theta\in(\pi, \frac{3\pi}{2})$ , the special Lagrangian equation above becomes

$\sigma_3(D^2u)+\tan\theta\sigma_2(D^2u) = \sigma_1(D^2u)+\tan\theta\sigma_0(D^2u)$

which is included in (1.1).
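
As a check on this reduction, the following worked computation specialises the displayed special Lagrangian equation to $n = 3$, using only the convention $\sigma_0 = 1$. Since $\cos\theta\neq 0$ for $\theta\in(\pi, \frac{3\pi}{2})$, one may divide by $\cos\theta$. This is a reader's sketch, not part of the original argument.

```latex
\begin{align*}
0 &= \sin\theta\,\big(\sigma_0(D^2u)-\sigma_2(D^2u)\big)
     +\cos\theta\,\big(\sigma_1(D^2u)-\sigma_3(D^2u)\big)\\
\Longleftrightarrow\quad
0 &= \tan\theta\,\big(\sigma_0(D^2u)-\sigma_2(D^2u)\big)
     +\sigma_1(D^2u)-\sigma_3(D^2u)\\
\Longleftrightarrow\quad
\sigma_3(D^2u)+\tan\theta\,\sigma_2(D^2u)
  &= \sigma_1(D^2u)+\tan\theta\,\sigma_0(D^2u).
\end{align*}
```

In the notation of (1.1) this corresponds to $k = 3$, $\alpha = \tan\theta$, $\alpha_1 = 1$ and $\alpha_0 = \tan\theta$, where $\tan\theta > 0$ on the hypercritical range.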

In this paper we derive interior curvature bounds for admissible solutions of a class of curvature equations subject to affine Dirichlet data. Let $\Omega$ be a bounded domain in $\mathbb{R}^n$ , and let $u\in C^4(\Omega)\cap C^{0, 1}(\bar\Omega)$ be an admissible solution of

$\begin{equation} \left\{ \begin{array}{lll} &\sigma_k(\lambda)+g(x,u)\sigma_{k-1}(\lambda) = \sum\limits_{l = 0}^{k-2}\alpha_l(x,u)\sigma_l(\lambda)\quad &{\rm{in}}\, \, \, \Omega,\\ &u = \phi\quad &{\rm{on}}\, \, \, \partial\Omega, \end{array} \right. \end{equation}$

(1.4)

where $g(x, u)$ and $\alpha_l(x, u) > 0$ , $l = 0, 1, \cdots, k-2$ , are given smooth functions on $\bar\Omega\times\mathbb{R}$ , $\phi$ is affine, and $\lambda = (\lambda_1, \cdots, \lambda_n)$ is the vector of the principal curvatures of graph $u$ . Here $u$ is called an admissible solution if $\lambda\in \Gamma_{k}$ at every point of the graph of $u$ , with

$\Gamma_{k} = \{\lambda\in\mathbb{R}^n|\sigma_1(\lambda) > 0,\cdots,\sigma_{k}(\lambda) > 0\}.$
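
To make the admissibility condition concrete, here is a minimal Python sketch that evaluates the elementary symmetric functions and tests membership in $\Gamma_k$. The function names `sigma` and `in_gamma` and the sample vector are illustrative assumptions, not notation from the paper.

```python
from itertools import combinations
from math import prod


def sigma(k, lam):
    """k-th elementary symmetric function of lam, with sigma_0 = 1."""
    if k == 0:
        return 1
    return sum(prod(c) for c in combinations(lam, k))


def in_gamma(k, lam):
    """Return True if sigma_1(lam), ..., sigma_k(lam) are all positive."""
    return all(sigma(j, lam) > 0 for j in range(1, k + 1))


if __name__ == "__main__":
    lam = (3, 2, -1)  # hypothetical curvature vector, n = 3
    print([sigma(j, lam) for j in range(4)])
    print(in_gamma(2, lam), in_gamma(3, lam))
```

For the sample vector $(3, 2, -1)$ one has $\sigma_1 = 4$, $\sigma_2 = 1$ and $\sigma_3 = -6$, so it lies in $\Gamma_2$ but not in $\Gamma_3$.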

For simplicity we denote $F = G_k-\sum_{l = 0}^{k-2}\alpha_lG_l$ and $G_l = \sigma_l(\lambda)/\sigma_{k-1}(\lambda)$ for $l = 0, 1, \cdots, k-2, k$ . The ellipticity and concavity properties of the operator $F$ have been proved in ^[16]. Our main result is as follows.

Theorem 1.1. Assume that for every $l$ $(0\leq l\leq k-2)$ , $\alpha_l, g\in C^{1, 1}(\bar\Omega\times\mathbb{R})$ , $\alpha_l > 0$ , and either $g > 0$ or $g < 0$ , and that $\phi$ in $(1.4)$ is affine. For any fixed $\beta > 0$ , if $u\in C^4(\Omega)\cap C^{0, 1}(\bar\Omega)$ is an admissible solution of $(1.4)$ , then there exists a constant $C$ , depending only on $n, k, \beta, ||u||_{C^1(\bar\Omega)}, \alpha_l, g$ and their first and second derivatives, such that the second fundamental form ${\bf{A}}$ of graph $u$ satisfies

$|{\bf{A}}|\leq \frac{C}{(\phi-u)^{\beta}}.$

Remark 1.1. Compared with ^[16], here we additionally require $g > 0$ or $g < 0$ . Our curvature estimates still hold if $\alpha_l\equiv 0$ for some $0\leq l\leq k-2$ . Moreover, if $\alpha_l\equiv 0$ for all $l = 0, 1, \cdots, k-2$ , Eq (1.4) becomes the Hessian quotient equation and the result follows from ^[29].

To see that this is an interior curvature estimate, we need to verify that $\phi-u > 0$ on $\Omega$ . We apply the strong maximum principle for the minimal graph equation. Since $\phi$ is affine, it satisfies the following minimal graph equation

$Qu: = (1+|Du|^2)\triangle u-u_iu_ju_{ij} = nH(1+|Du|^2)^{\frac{3}{2}} = 0\quad {\rm{on}}\quad \Omega.$

Since $u$ is a $k$ -admissible solution and $n\geq k\geq 2$ , the graph of $u$ is mean-convex and $Qu > Q\phi = 0$ . By the comparison principle for quasilinear equations (Theorem 10.1 in ^[10]), we then have $\phi > u$ on $\Omega$ .

The main application of the curvature bound of Theorem 1.1 is to extend various existence results for the Dirichlet problem for curvature equations of mixed Hessian type.

Theorem 1.2. Let $\Omega$ be a bounded domain in $\mathbb{R}^n$ , and let $\alpha_l, g\in C^{1, 1}(\bar\Omega\times\mathbb{R})$ satisfy $\inf |g| > 0$ , $\partial_u g(x, u)\leq 0$ , $\alpha_l > 0$ and $\partial_u \alpha_l(x, u)\geq 0$ . Suppose there is an admissible function $\underline{u}\in C^2(\Omega)\cap C^{0, 1}(\bar\Omega)$ satisfying

$\begin{equation} F[\underline{u}]\geq -g(x,\underline{u})\quad {{{in}}}\, \,\Omega,\quad\quad \underline{u} = 0\quad{{{on}}}\,\partial\Omega. \end{equation}$

(1.5)

Then the problem

$\begin{equation} F[u] = -g(x,u)\quad {{{in}}}\, \, \Omega,\quad\quad u = 0\quad{{{on}}}\, \, \partial\Omega. \end{equation}$

(1.6)

has a unique admissible solution $u\in C^{3, \alpha}(\Omega)\cap C^{0, 1}(\bar\Omega)$ for all $\alpha\in (0, 1)$ .

Remark 1.2. The conditions $\partial_u g\leq 0$ , $\partial_u \alpha_l(x, u)\geq 0$ and the existence of a sub-solution are required in the $C^0$ estimate. The $C^1$ interior estimate is a slight modification of the result in Theorem 5.1.1 of ^[36], since the coefficients $g$ , $\alpha_l$ of (1.2) are independent of $u$ . We use the conditions $\partial_u g\leq 0$ and $\partial_u \alpha_l(x, u)\geq 0$ again to eliminate the extra terms in the $C^1$ estimate.

As a further application of the a priori curvature estimate we also consider a Plateau-type problem for locally convex Weingarten hypersurfaces. Let $\Sigma$ be a finite collection of disjoint, smooth, closed, codimension 2 submanifolds of $\mathbb{R}^{n+1}$ . Suppose $\Sigma$ bounds a locally uniformly convex hypersurface $\mathcal{M}_0$ with

$f_{(n)}(\lambda^0): = \frac{\sigma_n}{\sigma_{n-1}}(\lambda^0)-\sum\limits_{l = 0}^{n-2}\alpha_l\frac{\sigma_{l}}{\sigma_{n-1}}(\lambda^0)\geq c,$

where $\lambda^0 = (\lambda^0_1, \cdots, \lambda^0_n)$ are the principal curvatures of $\mathcal{M}_0$ and $\alpha_l$ 's are positive constants, $c\neq 0$ is a constant. Is there a locally convex hypersurface $\mathcal{M}$ with boundary $\Sigma$ and $f_{(n)}(\lambda) = c$ , where $\lambda = (\lambda_1, \cdots, \lambda_n)$ are the principal curvatures of $\mathcal{M}$ ?

Theorem 1.3. Let $\Sigma$ , $f_{(n)}(\lambda)$ be as above. If $\Sigma$ bounds a locally uniformly convex hypersurface $\mathcal{M}_0$ with $f_{(n)}(\lambda^0)\geq c$ at each point of $\mathcal{M}_0$ , then $\Sigma$ bounds a smooth, locally convex hypersurface $\mathcal{M}$ with $f_{(n)}(\lambda) = c$ at each point of $\mathcal{M}$ .

2. Proof of the curvature bound

We compute using a local orthonormal frame field $\hat{{\bf{e}}}_1, \cdots, \hat{{\bf{e}}}_n$ defined on $\mathcal{M} =$ graph $u$ in a neighbourhood of the point at which we are computing. The standard basis of $\mathbb{R}^{n+1}$ is denoted by ${\bf{e}}_1, \cdots, {\bf{e}}_{n+1}$ . Covariant differentiation on $\mathcal{M}$ in the direction $\hat{{\bf{e}}}_i$ is denoted by $\nabla_i$ . The components of the second fundamental form ${\bf{A}}$ of $\mathcal{M}$ in the basis $\hat{{\bf{e}}}_1, \cdots, \hat{{\bf{e}}}_n$ are denoted by $(h_{ij})$ . Thus

$h_{ij} = \langle D_{\hat{{\bf{e}}}_i}\hat{{\bf{e}}}_j,\nu\rangle,$

where $D$ and $\langle\cdot, \cdot\rangle$ denote the usual connection and inner product on $\mathbb{R}^{n+1}$ , and $\nu$ denotes the upward unit normal

$\nu = \frac{(-Du,1)}{\sqrt{1+|Du|^2}}.$

The differential equation in (1.4) can then be expressed as

$\begin{equation} F({\bf{A}},X) = -g(X). \end{equation}$

(2.1)

As usual we denote the first and second partial derivatives of $F$ with respect to $h_{ij}$ by $F^{ij}$ and $F^{ij, rs}$ . We assume summation from $1$ to $n$ over repeated Latin indices unless otherwise indicated. The following two lemmas are similar to those in ^[29] with minor changes, so we omit the proofs.

Lemma 2.1. The second fundamental form $h_{ab}$ satisfies

$\begin{align*} F^{ij}\nabla_i\nabla_jh_{ab} = &-F^{ij,rs}\nabla_ah_{ij}\nabla_bh_{rs}+F^{ij}h_{ij}h_{ap}h_{pb}\\ &-F^{ij}h_{ip}h_{pj}h_{ab}-\nabla_a\nabla_bg+\sum\limits_{l = 0}^{k-2} (\nabla_a\alpha_l\nabla_bG_l+\nabla_b\alpha_l\nabla_aG_l)\\ &+\sum\limits_{l = 0}^{k-2}\nabla_a\nabla_b\alpha_l\cdot G_l. \end{align*}$

Lemma 2.2. For any $\alpha = 1, \cdots, n+1$ , we have

$F^{ij}\nabla_i\nabla_j\nu_{\alpha}+F^{ij}h_{ip}h_{pj}\nu_{\alpha} = \langle\nabla g,{\bf{e}}_{\alpha}\rangle-\sum\limits_{l = 0}^{k-2}\langle\nabla\alpha_l,{\bf{e}}_{\alpha}\rangle G_l.$

Lemma 2.3. There is a constant $C > 0$ , depending only on $n, k, \inf \alpha_l, |g|_{C^0}$ , so that for any $l = 0, 1, \cdots, k-2$ ,

$|G_l|\leq C.$

Proof. We argue by contradiction. If the result is not true, then for every positive integer $i$ , there are an admissible solution $u_{(i)}$ , a point $x_{(i)}\in \Omega$ and an index $0\leq l_{(i)}\leq k-2$ such that

$\frac{\sigma_{l_{(i)}}}{\sigma_{k-1}}(\lambda[u_{(i)}]) > i \quad {\rm{at}}\, \, x_{(i)}.$

By passing to a subsequence, we may assume $l_{(i)}\rightarrow l_{\infty}$ and $x_{(i)}\rightarrow x_{\infty}\in\bar\Omega$ as $i\rightarrow +\infty$ . Therefore

$\lim\limits_{i\rightarrow +\infty}\frac{\sigma_{l_{\infty}}}{\sigma_{k-1}}(\lambda[u_{(i)}])(x_{(i)}) = +\infty,$

or we may simply write $\frac{\sigma_{l_{\infty}}}{\sigma_{k-1}}\rightarrow +\infty$ if no ambiguity arises. Since $\alpha_{l_{\infty}} > 0$ , and $g$ is bounded, by (1.4) we have $\frac{\sigma_k}{\sigma_{k-1}}\rightarrow +\infty$ . For $i$ large enough, $\sigma_k > 0$ . By the Newton-MacLaurin inequalities, we have

$\frac{\sigma_{l_{\infty}}}{\sigma_{k-1}} = \frac{\sigma_{l_{\infty}}}{\sigma_{l_{\infty}+1}}\cdots\frac{\sigma_{k-2}}{\sigma_{k-1}}\leq C(\frac{\sigma_{k-1}}{\sigma_{k}})^{k-1-l_{\infty}}\rightarrow 0.$

We therefore get a contradiction.
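
For reference, the form of the Newton-MacLaurin inequalities used in this proof can be spelled out as follows. The explicit constant is the standard one and is recalled here as a reminder; it is not taken from the source. For $\lambda\in\Gamma_k$ and $1\leq j\leq k-1$, both $\sigma_j$ and $\sigma_{j+1}$ are positive, so one may divide:

```latex
\begin{align*}
\sigma_{j-1}(\lambda)\,\sigma_{j+1}(\lambda)
  &\le \frac{j(n-j)}{(j+1)(n-j+1)}\,\sigma_j(\lambda)^2
  \quad\Longrightarrow\quad
  \frac{\sigma_{j-1}}{\sigma_j}\le \frac{\sigma_j}{\sigma_{j+1}},\\
\frac{\sigma_{m}}{\sigma_{m+1}}
  &\le \frac{\sigma_{k-1}}{\sigma_{k}}
  \quad\text{for } 0\le m\le k-1 .
\end{align*}
```

Multiplying the second line over $m = l_{\infty}, \cdots, k-2$ gives the $k-1-l_{\infty}$ factors that appear in the estimate above.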

Proof of Theorem 1.1. The argument follows ^[29]. Let $\eta = \phi-u$ , so that $\eta > 0$ in $\Omega$ . For a function $\Phi$ to be chosen and a fixed constant $\beta > 0$ , we consider the function

$\tilde W(X,\xi) = \eta^{\beta}(\exp\Phi(\nu_{n+1}))h_{\xi\xi}$

for all $X\in\mathcal{M}$ and all unit vectors $\xi\in \rm{T}_X\mathcal{M}$ . Since $\eta$ vanishes on $\partial\Omega$ , $\tilde W$ attains its maximum at an interior point $X_0\in\mathcal{M}$ , in a direction $\xi_0\in \rm{T}_{X_0}\mathcal{M}$ which we may take to be $\hat{{\bf{e}}}_1$ . We may assume that $(h_{ij})$ is diagonal at $X_0$ with eigenvalues $\lambda_1\geq \lambda_2\geq\cdots\geq\lambda_n$ . Without loss of generality we may assume that the frame $\hat{{\bf{e}}}_1, \cdots, \hat{{\bf{e}}}_n$ has been chosen so that $\nabla_i\hat{{\bf{e}}}_j = 0$ at $X_0$ for all $i, j = 1, \cdots, n$ . Let $\tau = \hat{{\bf{e}}}_1$ . Then $W(X) = \tilde W(X, \tau)$ is defined near $X_0$ and has an interior maximum at $X_0$ . Let $Z: = h_{ab}\tau_a\tau_b$ . By the special choice of frame and the fact that $h_{ij}$ is diagonal at $X_0$ in this frame, we can see that

$\nabla_i Z = \nabla_ih_{11}\quad\quad {\rm{and}}\quad\quad \nabla_i\nabla_jZ = \nabla_i\nabla_jh_{11}\quad {\rm{at}}\,X_0$

Therefore the scalar function $Z$ satisfies the same equation as the component $h_{11}$ of the tensor $h_{ij}$ . Thus at $X_0$ , we have

$\begin{equation} \frac{\nabla_i W}{W} = \beta\frac{\nabla_i\eta}{\eta}+\Phi'\nabla_i\nu_{n+1}+\frac{\nabla_ih_{11}}{h_{11}} = 0 \end{equation}$

(2.2)

and

$\begin{align} \frac{\nabla_i\nabla_jW}{W}-\frac{\nabla_i W\nabla_j W}{W^2} = &\beta\Big(\frac{\nabla_i\nabla_j\eta}{\eta}-\frac{\nabla_i\eta\nabla_j\eta}{\eta^2}\Big)\\ &+\Phi''\nabla_i\nu_{n+1}\nabla_j\nu_{n+1}+\Phi'\nabla_i\nabla_j\nu_{n+1}\\ &+\frac{\nabla_i\nabla_j h_{11}}{h_{11}}-\frac{\nabla_ih_{11}\nabla_jh_{11}}{h_{11}^2} \end{align}$

(2.3)

is nonpositive in the sense of matrices at $X_0$ . By Lemmas 2.1 and 2.2, we have, at $X_0$ ,

$\begin{align} 0\geq & \beta F^{ij}\Big(\frac{\nabla_i\nabla_j\eta}{\eta}-\frac{\nabla_i\eta\nabla_j\eta}{\eta^2}\Big)+\Phi''F^{ij}\nabla_i\nu_{n+1}\nabla_j\nu_{n+1}\\ &-(\Phi'\nu_{n+1}+1)F^{ij}h_{ip}h_{pj}+F^{ij}h_{ij}h_{11}-\frac{\nabla_1\nabla_1 g}{h_{11}}\\ &+\Phi'\langle\nabla g,{\bf{e}}_{n+1}\rangle-\frac{1}{h_{11}}F^{ij,rs}\nabla_1h_{ij}\nabla_1h_{rs}-F^{ij}\frac{\nabla_ih_{11}\nabla_jh_{11}}{h_{11}^2}\\ &-\sum\limits_{l = 0}^{k-2}\Phi'\langle\nabla\alpha_l,{\bf{e}}_{n+1}\rangle\frac{\sigma_l}{\sigma_{k-1}}+\sum\limits_{l = 0}^{k-2}\frac{1}{h_{11}}\Big(2\nabla_1\alpha_l\cdot\nabla_1\frac{\sigma_l}{\sigma_{k-1}}+\nabla_1\nabla_1\alpha_l\cdot\frac{\sigma_l}{\sigma_{k-1}}\Big). \end{align}$

(2.4)

Using Gauss's formula

$\nabla_i\nabla_j X_{\alpha} = h_{ij}\nu_{\alpha},$

we have

$\begin{align*} \nabla_1\nabla_1 g(X)& = \sum\limits_{\alpha = 1}^{n+1}\frac{\partial g}{\partial X_{\alpha}}\nabla_1\nabla_1X_{\alpha}+\sum\limits_{\alpha,\beta = 1}^{n+1}\frac{\partial^2 g}{\partial X_{\alpha}\partial X_{\beta}}\nabla_1X_{\alpha}\nabla_1 X_{\beta}\\ & = \sum\limits_{\alpha = 1}^{n+1}\frac{\partial g}{\partial X_{\alpha}}\nu_{\alpha}h_{11}+\sum\limits_{\alpha,\beta = 1}^{n+1}\frac{\partial^2 g}{\partial X_{\alpha}\partial X_{\beta}}\nabla_1X_{\alpha}\nabla_1 X_{\beta}. \end{align*}$

Consequently,

$|\frac{\nabla_1\nabla_1 g}{h_{11}}|\leq C.$

For the same reason, we have for all $l = 0, \cdots, k-2$ ,

$|\frac{\nabla_1\nabla_1\alpha_l}{h_{11}}|\leq C.$

Taking Lemma 2.3 into account, we estimate the two terms in the last line of (2.4) as

$-\sum\limits_{l = 0}^{k-2}\Phi'\langle\nabla\alpha_l,{\bf{e}}_{n+1}\rangle\frac{\sigma_l}{\sigma_{k-1}}+\sum\limits_{l = 0}^{k-2}\frac{1}{h_{11}}\nabla_1\nabla_1\alpha_l\cdot\frac{\sigma_l}{\sigma_{k-1}}\geq -C|\Phi'|-C.$

Recall that $F = G_k-\sum \alpha_lG_l$ and it is well-known that the operator $(\frac{\sigma_{k-1}}{\sigma_{l}})^{\frac{1}{k-1-l}}$ is concave for $0\leq l\leq k-2$ . It follows that

$(\frac{1}{G_l})^{\frac{1}{k-1-l}}\,{\rm{is \;a\; concave\; operator\; for}}\,\forall l = 0,1,\cdots,k-2.$

For any symmetric matrix $(B_{ij})\in\mathbb{R}^{n\times n}$ , we have

$\Big\{(\frac{1}{G_l})^{\frac{1}{k-1-l}}\Big\}^{ij,rs}B_{ij}B_{rs}\leq 0.$

Direct computation shows that

$G_l^{ij,rs}B_{ij}B_{rs}\geq \frac{1}{G_l}\cdot\frac{k-l}{k-1-l}\cdot(G_l^{ij}B_{ij})^2.$
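
The direct computation can be sketched as follows. Write $m = k-1-l$ and $h = G_l^{-1/m}$ (notation introduced only for this sketch), so that $G_l = h^{-m}$ with $h > 0$ concave. Differentiating twice and discarding the term containing $h^{ij,rs}B_{ij}B_{rs}\leq 0$ gives:

```latex
\begin{align*}
G_l^{ij} &= -m\,h^{-m-1}h^{ij},\\
G_l^{ij,rs}B_{ij}B_{rs}
  &= m(m+1)\,h^{-m-2}\,(h^{ij}B_{ij})^2-m\,h^{-m-1}\,h^{ij,rs}B_{ij}B_{rs}\\
  &\ge m(m+1)\,h^{-m-2}\,(h^{ij}B_{ij})^2
   = \frac{m+1}{m}\,h^{m}\,(G_l^{ij}B_{ij})^2
   = \frac{1}{G_l}\cdot\frac{k-l}{k-1-l}\,(G_l^{ij}B_{ij})^2 .
\end{align*}
```

The last two equalities use $(G_l^{ij}B_{ij})^2 = m^2h^{-2m-2}(h^{ij}B_{ij})^2$ and $h^{m} = 1/G_l$.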

Note that $G_k$ is also a concave operator. Hence

$\begin{align*} &-\frac{1}{h_{11}}F^{ij,rs}\nabla_1h_{ij}\nabla_1h_{rs}+\sum\limits_{l = 0}^{k-2}\frac{2}{h_{11}}\nabla_1\alpha_l\cdot\nabla_1\frac{\sigma_l}{\sigma_{k-1}}\\ = &-\frac{1}{h_{11}}G_k^{ij,rs}\nabla_1h_{ij}\nabla_1h_{rs}+\sum\limits_{l = 0}^{k-2}\frac{\alpha_l}{h_{11}}G_l^{ij,rs}\nabla_1h_{ij}\nabla_1h_{rs}+\sum\limits_{l = 0}^{k-2}\frac{2}{h_{11}}\nabla_1\alpha_l\cdot\nabla_1\frac{\sigma_l}{\sigma_{k-1}}\\ \geq & \frac{1}{h_{11}}\sum\limits_{l = 0}^{k-2}G_l^{-1}\alpha_lC_l(\nabla_1G_l+\frac{\nabla_1\alpha_l}{C_l\alpha_l}G_l)^2-\frac{1}{h_{11}}\sum\limits_{l = 0}^{k-2}\frac{(\nabla_1\alpha_l)^2}{C_l\alpha_l}G_l\\ \geq & -\frac{C}{h_{11}} \end{align*}$

where $C_l = \frac{k-l}{k-1-l}$ . By the homogeneity of $G_l$ 's, we see that

$F^{ij}h_{ij} = G_k+\sum\limits_{l = 0}^{k-2}\alpha_l(k-1-l)G_l\geq G_k+\sum\limits_{l = 0}^{k-2}\alpha_l\frac{\sigma_l}{\sigma_{k-1}}\geq \inf |g| > 0.$
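
The identity and the lower bound above can be unpacked as follows; this is a reader's sketch of the reasoning under the hypotheses of Theorem 1.1, not a quotation of the source. The first line is Euler's relation for the homogeneous functions $G_k$ (degree $1$) and $G_l$ (degree $-(k-1-l)$). The second uses $k-1-l\geq 1$, $\alpha_l > 0$ and $G_l > 0$. The case distinction uses the equation $G_k-\sum_l\alpha_lG_l = -g$ together with $G_k > 0$ on $\Gamma_k$.

```latex
\begin{align*}
G_k^{ij}h_{ij} &= G_k,\qquad G_l^{ij}h_{ij} = -(k-1-l)\,G_l,\\
F^{ij}h_{ij} &= G_k+\sum_{l=0}^{k-2}\alpha_l(k-1-l)\,G_l
  \ \ge\ G_k+\sum_{l=0}^{k-2}\alpha_lG_l,\\
G_k+\sum_{l=0}^{k-2}\alpha_lG_l &=
\begin{cases}
2G_k+g\ \ge\ g = |g| & \text{if } g>0,\\[2pt]
-g+2\sum_{l=0}^{k-2}\alpha_lG_l\ \ge\ -g = |g| & \text{if } g<0 .
\end{cases}
\end{align*}
```

This is where the sign condition on $g$ in Theorem 1.1 enters the estimate.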

Using Lemma 2.3 again, we have

$F^{ij}h_{ij}\leq C.$

Next we assume that $\phi$ has been extended to be constant in the ${\bf{e}}_{n+1}$ direction.

$\begin{align*} \nabla_i\nabla_j\eta& = \sum\limits_{\alpha,\beta = 1}^n\frac{\partial^2\phi}{\partial X_{\alpha}\partial X_{\beta}}\nabla_i X_{\alpha}\nabla_j X_{\beta}+\sum\limits_{\alpha = 1}^n\frac{\partial \phi}{\partial X_{\alpha}}\nabla_i\nabla_j X_{\alpha}-\nabla_i\nabla_jX_{n+1}\\ & = \sum\limits_{\alpha = 1}^{n}\frac{\partial \phi}{\partial X_{\alpha}}\nu_{\alpha}h_{ij}-h_{ij}\nu_{n+1}. \end{align*}$

Consequently,

$F^{ij}\nabla_i\nabla_j\eta = (\sum\limits_{\alpha = 1}^{n}\frac{\partial \phi}{\partial X_{\alpha}}\nu_{\alpha}-\nu_{n+1})F^{ij}h_{ij}.$

Using the above estimates in (2.4), we have, at $X_0$ ,

$\begin{align} 0\geq&-\frac{C\beta}{\eta}-\beta F^{ij}\frac{\nabla_i\eta\nabla_j\eta}{\eta^2}+\Phi''F^{ij}\nabla_i\nu_{n+1}\nabla_j\nu_{n+1}-F^{ij}\frac{\nabla_ih_{11}\nabla_jh_{11}}{h_{11}^2}\\ &-(\Phi'\nu_{n+1}+1)F^{ij}h_{ip}h_{pj}+\inf |g|h_{11}-C(1+|\Phi'|). \end{align}$

(2.5)

Next, using (2.2), we have

$\begin{align*} F^{ij}\frac{\nabla_ih_{11}\nabla_jh_{11}}{h_{11}^2} & = F^{ij}\Big(\beta\frac{\nabla_i\eta}{\eta}+\Phi'\nabla_i\nu_{n+1}\Big)\Big(\beta\frac{\nabla_j\eta}{\eta}+\Phi'\nabla_j\nu_{n+1}\Big)\\ &\leq (1+\gamma^{-1})\beta^2F^{ij}\frac{\nabla_i\eta\nabla_j\eta}{\eta^2}+(1+\gamma)(\Phi')^2F^{ij}\nabla_i\nu_{n+1}\nabla_j\nu_{n+1} \end{align*}$

for any $\gamma > 0$ . Therefore at $X_0$ we have, since $|\nabla\eta|\leq C$ ,

$\begin{align} 0\geq& -\frac{C\beta}{\eta}-C[\beta+(1+\gamma^{-1})\beta^2]\frac{\sum_{i = 1}^n F^{ii}}{\eta^2}\\ &+[\Phi''-(1+\gamma)(\Phi')^2]F^{ij}\nabla_i\nu_{n+1}\nabla_j\nu_{n+1}\\ &-[\Phi'\nu_{n+1}+1]F^{ij}h_{ip}h_{pj}+\inf |g|h_{11}-C(1+|\Phi'|). \end{align}$

(2.6)

We choose a positive constant $a$ , so that

$a\leq \frac{1}{2}\nu_{n+1} = \frac{1}{2\sqrt{1+|Du|^2}}$

which depends only on $\sup_{\Omega}|Du|$ . Therefore

$\frac{1}{\nu_{n+1}-a}\leq\frac{1}{a}\leq C.$

We now choose

$\Phi(t) = -\log(t-a).$

Then

$\Phi'(t) = \frac{-1}{t-a},\quad\Phi''(t) = \frac{1}{(t-a)^2},$

and

$\begin{align*} -(\Phi't+1)& = \frac{a}{t-a},\\ \Phi''-(1+\gamma)(\Phi')^2& = -\frac{\gamma}{(t-a)^2}. \end{align*}$

By direct computation, we have $\nabla_i\nu_{n+1} = -h_{ip}\langle \hat{{\bf{e}}}_p, {\bf{e}}_{n+1}\rangle$ , and therefore

$F^{ij}\nabla_i\nu_{n+1}\nabla_j\nu_{n+1} = F^{ij}h_{ip}h_{jq}\langle \hat{{\bf{e}}}_p,{\bf{e}}_{n+1}\rangle\langle \hat{{\bf{e}}}_q,{\bf{e}}_{n+1}\rangle\leq F^{ij}h_{ip}h_{pj}.$

Next we choose $0 < \gamma\leq\frac{a^2}{2}$ , then we have

$-(\Phi't+1)+[\Phi''-(1+\gamma)(\Phi')^2] = \frac{a}{t-a}-\frac{\gamma}{(t-a)^2}\geq\frac{\frac{1}{2}a^2}{(t-a)^2} > 0.$
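
To see why the last inequality holds, recall that $t = \nu_{n+1}\geq 2a$, so that $t-a\geq a$, and that $\gamma\leq\frac{a^2}{2}$. A short verification:

```latex
\begin{align*}
\frac{a}{t-a}-\frac{\gamma}{(t-a)^2}
  = \frac{a(t-a)-\gamma}{(t-a)^2}
  \ \ge\ \frac{a^2-\frac{1}{2}a^2}{(t-a)^2}
  = \frac{\frac{1}{2}a^2}{(t-a)^2} .
\end{align*}
```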

Thus we have

$\begin{align} 0\geq -\frac{C\beta}{\eta}-C(\beta,a)\eta^{-2}(\sum\limits_{i = 1}^nF^{ii})+\inf |g|h_{11}-C(a). \end{align}$

(2.7)

In the following we show that $\sum_{i = 1}^nF^{ii}\leq C$ . By the definition of operator $F$ and Lemma 2.3, we have

$\begin{align*} \sum\limits_{i = 1}^nF^{ii}& = \sum\limits_{i = 1}^n(\frac{\sigma_k}{\sigma_{k-1}})^{ii}-\sum\limits_{i = 1}^n\sum\limits_{l = 0}^{k-2}\alpha_l(\frac{\sigma_l}{\sigma_{k-1}})^{ii}\\ & = \sum\limits_{i = 1}^n\frac{\sigma_{k-1}(\lambda|i)}{\sigma_{k-1}}-\frac{\sigma_k}{\sigma_{k-1}^2}\sum\limits_{i = 1}^n\sigma_{k-2}(\lambda|i)+\frac{\alpha_0}{\sigma_{k-1}^2}\sum\limits_{i = 1}^n\sigma_{k-2}(\lambda|i)\\ &\quad +\sum\limits_{l = 1}^{k-2}\alpha_l\frac{\sum_i\sigma_l\sigma_{k-2}(\lambda|i)-\sum_i\sigma_{k-1}\sigma_{l-1}(\lambda|i)}{\sigma_{k-1}^2}\\ & = n-k+1-(n-k+2)\frac{\sigma_k\sigma_{k-2}}{\sigma_{k-1}^2}+(n-k+2)\alpha_0\frac{\sigma_{k-2}}{\sigma_{k-1}^2}\\ &\quad +\sum\limits_{l = 1}^{k-2}\alpha_l\frac{(n-k+2)\sigma_l\sigma_{k-2}-(n-l+1)\sigma_{k-1}\sigma_{l-1}}{\sigma_{k-1}^2}\\ &\leq n-k+1+(n-k+2)|\frac{\sigma_k}{\sigma_{k-1}}G_{k-2}|+(n-k+2)|\alpha_0|_{C^0}|G_{k-2}G_0|\\ &\quad +(n-k+2)|G_{k-2}|\sum\limits_{l = 1}^{k-2}|\alpha_l|_{C^0}|G_l|. \end{align*}$

From Eq (1.4) we have $|G_k|\leq C$ , therefore $\sum_i F^{ii}\leq C$ . At $X_0$ , we get an upper bound

$\lambda_1\leq \frac{C(\beta,a)}{\eta^2}.$

Consequently, $W(X_0)$ satisfies an upper bound. Since $W(X)\leq W(X_0)$ , we get the required upper bound for the maximum principal curvature. Since $\lambda\in\Gamma_{k}$ and $n\geq k\geq 2$ , $u$ is at least mean-convex and

$\sum\limits_{i = 1}^n\lambda_i > 0.$

Therefore $\lambda_n\geq -(n-1)\lambda_1$ and

$|{\bf{A}}| = \sqrt{\sum\limits_{i = 1}^n\lambda_i^2}\leq C(n)\lambda_1\leq \frac{C}{(\phi-u)^{\beta}}.$
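
The bound on $\sum_i F^{ii}$ in the proof above relies on the trace identity $\sum_{i}\sigma_m(\lambda|i) = (n-m)\sigma_m(\lambda)$, applied with $m = k-1$, $k-2$ and $l-1$. The following Python sketch checks this identity numerically on a sample vector; the function names and the test data are hypothetical and serve only as an illustration.

```python
from itertools import combinations
from math import prod


def sigma(m, lam):
    """m-th elementary symmetric function of lam (sigma_0 = 1, 0 if m < 0)."""
    if m < 0:
        return 0
    return sum(prod(c) for c in combinations(lam, m))


def trace_sum(m, lam):
    """Sum over i of sigma_m(lam | i), where lam | i drops the i-th entry."""
    return sum(sigma(m, lam[:i] + lam[i + 1:]) for i in range(len(lam)))


if __name__ == "__main__":
    lam = (5, 3, 2, -1)  # hypothetical test vector, n = 4
    n = len(lam)
    for m in range(n):
        # The two printed values should agree for every m.
        print(m, trace_sum(m, lam), (n - m) * sigma(m, lam))
```

The identity holds because each product of $m$ entries survives in $\lambda|i$ for exactly the $n-m$ indices $i$ that it does not contain.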

3. The Dirichlet problem

In this section we prove Theorem 1.2. By the comparison principle, we have $0\geq u\geq \underline{u}$ . For any $\Omega'\Subset\Omega$ , $\inf_{\Omega'}\underline{u}\leq u\leq c(\Omega') < 0$ . First we show the gradient bound for admissible solutions of (1.6). We need the following lemmas to prove the gradient estimate.

Lemma 3.1. Suppose $A = \{a_{ij}\}_{n\times n}$ satisfies $\lambda(A)\in\Gamma_{k-1}$ , $a_{11} < 0$ and $\{a_{ij}\}_{2\leq i, j\leq n}$ is diagonal, then

$\begin{equation} \sum\limits_{i = 2}^n\frac{\partial F}{\partial a_{1i}}a_{1i}\leq 0. \end{equation}$

(3.1)

Proof. Let

$\begin{equation*} B = \left( \begin{array}{cccc} a_{11} & 0 & \cdots & 0\\ 0 & a_{22} & \ldots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 &\cdots &a_{nn} \end{array} \right), \quad C = \left( \begin{array}{cccc} 0 & a_{12} & \cdots & a_{1n}\\ a_{21} & 0 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ a_{n1} & 0 &\cdots & 0 \end{array} \right). \end{equation*}$

Set $A(t): = B+tC$ and $f(t): = F(A(t))$ , where $a_{1i} = a_{i1}$ for all $2\leq i\leq n$ . A direct computation gives

$\sigma_k(A(t)) = \sigma_k(B)-t^2\sum\limits_{i = 2}^n a_{1i}^2\sigma_{k-2}(B|1i),$

where $(B|ij)$ is the submatrix of $B$ formed by deleting the $i$ -th and $j$ -th rows and columns. It is easy to see that for $t\in[-1, 1]$ , $\lambda(A(t))\in\Gamma_{k-1}$ and $f$ is concave on $[-1, 1]$ . Since $f(-1) = f(1) = F(A)$ , concavity gives $f'(1)\leq 0$ , while

$f'(1) = 2\sum\limits_{i = 2}^n\frac{\partial F}{\partial a_{1i}}a_{1i}.$
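
The expansion of $\sigma_k(A(t))$ used in this proof can be checked numerically. The sketch below is an illustration only: the helper names (`det`, `sigma_matrix`, `arrow_matrix`, `sigma_formula`) and the test matrix are hypothetical, and $\sigma_k$ of a symmetric matrix is computed as the sum of its $k\times k$ principal minors.

```python
from itertools import combinations
from math import prod


def det(M):
    """Determinant by Laplace expansion along the first row (small n only)."""
    if not M:
        return 1
    return sum(
        (-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
        for j in range(len(M))
    )


def sigma_matrix(k, A):
    """sigma_k(A): sum of all k x k principal minors of A."""
    return sum(det([[A[r][c] for c in idx] for r in idx])
               for idx in combinations(range(len(A)), k))


def sigma(k, lam):
    """Elementary symmetric function of a list of numbers (0 if k < 0)."""
    return 0 if k < 0 else sum(prod(c) for c in combinations(lam, k))


def arrow_matrix(diag, off, t):
    """A(t) = B + tC with B = diag(diag); off[i-1] plays the role of a_{1i}."""
    n = len(diag)
    A = [[diag[i] if i == j else 0 for j in range(n)] for i in range(n)]
    for i in range(1, n):
        A[0][i] = A[i][0] = t * off[i - 1]
    return A


def sigma_formula(k, diag, off, t):
    """sigma_k(B) - t^2 * sum_i a_{1i}^2 * sigma_{k-2}(B|1i)."""
    n = len(diag)
    corr = sum(
        off[i - 1] ** 2 * sigma(k - 2, [diag[j] for j in range(1, n) if j != i])
        for i in range(1, n)
    )
    return sigma(k, diag) - t ** 2 * corr


if __name__ == "__main__":
    diag, off = [-1, 4, 3, 2], [1, 2, 3]  # hypothetical data with a_11 < 0
    for k in range(5):
        A = arrow_matrix(diag, off, 1)
        print(k, sigma_matrix(k, A), sigma_formula(k, diag, off, 1))
```

Because the formula depends on $t$ only through $t^2$, it also makes the symmetry $f(-1) = f(1)$ used in the proof visible.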

Remark 3.1. By the concavity of $\frac{\sigma_k}{\sigma_{k-1}}$ , we can prove the following inequality for $\lambda(B)\in\Gamma_{k-1}$

$\begin{equation} \sigma_{k-2}(B|1i)\sigma_{k-1}(B)-\sigma_{k-3}(B|1i)\sigma_k(B)\geq 0\quad \forall 2\leq i\leq n. \end{equation}$

(3.2)

We let $f(t) = \frac{\sigma_{k}}{\sigma_{k-1}}(A(t))$ .

$f'(1) = \frac{-2(\sum_{i = 2}^na_{1i}^2\sigma_{k-2}(B|1i))}{\sigma_{k-1}(A)}+\frac{2\sigma_k(A)(\sum_{i = 2}^n a_{1i}^2\sigma_{k-3}(B|1i))}{\sigma_{k-1}^2(A)}\leq 0.$

Equivalently,

$\begin{equation} \sigma_{k-1}(B)\big(\sum\limits_{i = 2}^na_{1i}^2\sigma_{k-2}(B|1i)\big)-\sigma_{k}(B)\big(\sum\limits_{i = 2}^na_{1i}^2\sigma_{k-3}(B|1i)\big)\geq 0. \end{equation}$

(3.3)

We can choose $a_{1i} > 0$ small enough and $a_{1j} = 0$ for $j\ne i$ and $2\leq j\leq n$ , so that $\lambda(A)\in \Gamma_{k-1}$ . Then (3.3) implies (3.2).

Lemma 3.2. Let $\alpha_{k-2} > 0$ and $\alpha_l\geq 0$ for $0\leq l\leq k-3$ . Suppose the symmetric matrix $A = \{a_{ij}\}_{n\times n}$ satisfies

$\lambda(A)\in\Gamma_{k-1}, a_{11} < 0, \;{{{and}}}\, \{a_{ij}\}_{2\leq i,j\leq n}\, {{is\; diagonal}}.$

Then

$\begin{equation} \frac{\partial F}{\partial a_{11}}\geq C_0\big(\sum\limits_{i = 1}^n\frac{\partial F}{\partial a_{ii}}\big) \end{equation}$

(3.4)

where $C_0$ depends on $n, k, |u|_{C^0}, |g|_{C^0}, \inf \alpha_{k-2}$ .

Proof. Note that

$\begin{align*} \frac{\partial}{\partial a_{11}}(\frac{\sigma_l}{\sigma_{k-1}}(A)) = &\frac{\sigma_{l-1}(A|1)\sigma_{k-1}(A)-\sigma_l(A)\sigma_{k-2}(A|1)}{\sigma_{k-1}^2(A)}\\ = &\sum\limits_{i = 2}^n\frac{a_{1i}^2}{\sigma_{k-1}^2(A)}[\sigma_{l-2}(A|1i)\sigma_{k-2}(A|1)-\sigma_{l-1}(A|1)\sigma_{k-3}(A|1i)]\\ &+\sigma_{k-1}^{-2}(A)[\sigma_{l-1}(A|1)\sigma_{k-1}(A|1)-\sigma_l(A|1)\sigma_{k-2}(A|1)]. \end{align*}$

For $0\leq l\leq k-2$ ,

$\frac{\partial}{\partial a_{11}}(\frac{\sigma_l}{\sigma_{k-1}}(A))\leq -C_{n,l}\frac{\sigma_l(A|1)\sigma_{k-2}(A|1)}{\sigma_{k-1}^{2}(A)}.$

As for $l = k$ ,

$\frac{\partial}{\partial a_{11}}(\frac{\sigma_k}{\sigma_{k-1}}(A))\geq C_{n,k}\frac{\sigma_{k-1}^2(A|1)}{\sigma_{k-1}^{2}(A)}\geq C_{n,k}.$

Therefore

$\frac{\partial F}{\partial a_{11}}\geq C_{n,k}+C_{n,k}\inf\alpha_{k-2}\frac{\sigma_{k-2}^2(A|1)}{\sigma_{k-1}^{2}(A)}.$

Next we compute $\sum_{i = 1}^n \frac{\partial F}{\partial a_{ii}}$ as

$\begin{align*} \sum\limits_{i = 1}^n \frac{\partial F}{\partial a_{ii}} = &n-k+1-(n-k+2)\frac{\sigma_k\sigma_{k-2}}{\sigma_{k-1}^2}(A)+(n-k+2)\alpha_0\frac{\sigma_{k-2}}{\sigma_{k-1}^2}(A)\\ &+\sum\limits_{l = 1}^{k-2}\alpha_l\frac{(n-k+2)\sigma_l(A)\sigma_{k-2}(A)-(n-l+1)\sigma_{k-1}(A)\sigma_{l-1}(A)}{\sigma_{k-1}^2(A)}\\ &\leq n-k+1-(n-k+2)\frac{\sigma_{k-2}(A)}{\sigma_{k-1}(A)}(\frac{\sigma_k}{\sigma_{k-1}}(A)-\sum\limits_{l = 0}^{k-2}\alpha_l\frac{\sigma_{l}}{\sigma_{k-1}}(A))\\ &\leq C_{n,k}+C_{n,k}|g|_{C^0}\frac{\sigma_{k-2}}{\sigma_{k-1}}(A)\\ &\leq C(n,k,|g|_{C^0})+C(n,k,|g|_{C^0})(\frac{\sigma_{k-2}}{\sigma_{k-1}}(A))^2\\ &\leq C(n,k,|g|_{C^0})+C(n,k,|g|_{C^0})\frac{\sigma_{k-2}^2(A|1)}{\sigma_{k-1}^2(A)}\\ &\leq C(n,k,|g|_{C^0},\inf\alpha_{k-2})\frac{\partial F}{\partial a_{11}}. \end{align*}$

Lemma 3.3. For any $\Omega'\Subset\Omega$ , there is a constant $C$ depending only on $\Omega', n, k, \alpha_l, g$ and their first derivatives, such that if $u$ is an admissible solution of (1.6), then

$|Du|\leq C$

on $\Omega'$ .

Proof. Since we require that $\partial_u g\leq 0$ and $\partial_u\alpha_l\geq 0$, we only need to modify equation (5.42) in ^[36] (i.e., (A.6)), where the extra terms $\sum_{l = 0}^{k-2}\frac{(\alpha_l)_u}{\log u_1}\frac{\sigma_l(A)}{\sigma_{k-1}(A)}-\frac{g_u}{\log u_1}$ should be included. These are all good terms, so Zhou's proof also holds in our case. For the reader's convenience, we sketch the proof in the appendix below.

Now we give the proof of Theorem 1.2.

Proof of Theorem 1.2. The theorem can be proved by solving the uniformly elliptic approximating problems

$F_{\epsilon}[u_{\epsilon}] = -g_{\epsilon}(x,u_{\epsilon})\quad {\rm{in}}\,\Omega,\quad\quad u_{\epsilon} = 0\quad {\rm{on}}\,\partial \Omega,$

for $\epsilon > 0$ small, where $\underline{u}$ is an admissible subsolution for each of the approximating problems. By the comparison principle, Theorem 1.1 and the (modified) interior gradient estimates in ^[36], we have uniform $C^2$ interior estimates for $u_{\epsilon}$. The Evans-Krylov theory, together with Schauder theory, then implies uniform estimates for $||u_{\epsilon}||_{C^{3, \alpha}(\Omega')}$ for any $\Omega'\Subset\Omega$. Theorem 1.2 then follows by extracting a suitable subsequence as $\epsilon\rightarrow 0$.

4. The Plateau problem

In this section we prove Theorem 1.3. The notion of locally convex hypersurface we use is the same as that in ^[29].

Definition 4.1. A compact, connected, locally convex hypersurface $\mathcal{M}$ (possibly with boundary) in $\mathbb{R}^{n+1}$ is an immersion of an $n$ -dimensional, compact, oriented and connected manifold $\mathcal{N}$ (possibly with boundary) in $\mathbb{R}^{n+1}$ , that is, a mapping $T:\mathcal{N}\rightarrow \mathcal{M}\subset\mathbb{R}^{n+1}$ , such that for any $p\in\mathcal{N}$ there is a neighbourhood $\omega_p\subset\mathcal{N}$ such that

● $T$ is a homeomorphism from $\omega_p$ to $T(\omega_p)$ ;

● $T(\omega_p)$ is a convex graph;

● the convexity of $T(\omega_p)$ agrees with the orientation.

Since $\mathcal{M}$ is immersed, a point $x\in\mathcal{M}$ may be the image of several points in $\mathcal{N}$ . Since $\mathcal{M}$ and $\mathcal{N}$ are compact, $T^{-1}(x)$ consists of only finitely many points. Let $r > 0$ and $x\in\mathcal{M}$ . For small enough $r$ , $T^{-1}(\mathcal{M}\cap B_r^{n+1}(x))$ consists of several disjoint open sets $U_1, \cdots, U_s$ of $\mathcal{N}$ such that $T|_{U_i}$ is a homeomorphism of $U_i$ onto $T(U_i)$ for each $i = 1, \cdots, s$ . By an $r$ -neighbourhood $\omega_r(x)$ of $x$ in $\mathcal{M}$ we mean any one of the sets $T(U_i)$ . We say that $\omega_r(x)$ is convex if $\omega_r(x)$ lies on the boundary of its convex hull.

We shall use the following lemma (see ^[32], Theorem A) to prove Theorem 1.3.

Lemma 4.1. Let $\mathcal{M}_0\subset B_R(0)$ be a locally convex hypersurface with $C^2$ -boundary $\partial \mathcal{M}_0$ . Suppose that on $\partial \mathcal{M}_0$ , the principal curvatures $\lambda_1^0, \cdots, \lambda_n^0$ of $\mathcal{M}_0$ satisfy

$C_0^{-1}\leq \lambda_i^0\leq C_0,\quad i = 1,2,\cdots,n,$

for some $C_0 > 0$ . Then there exist positive constants $r$ and $\alpha$ , depending only on $n, C_0, R$ and $\partial \mathcal{M}_0$ , such that for any point $p\in\mathcal{M}_0$ , each $r$ -neighbourhood $\omega_r(p)$ of $p$ is convex, and there is a closed cone $C_{p, \alpha}$ with vertex $p$ and angle $\alpha$ such that $\omega_r(p)\cap C_{p, \alpha} = \{p\}$ .

Note that for any point $p\in\mathcal{M}_0$ , if one chooses the axial direction of the cone $C_{p, \alpha}$ as the $x_{n+1}$ -axis, then each $\delta$ -neighbourhood of $p$ can be represented as a graph,

$x_{n+1} = u(x),\quad |x|\leq \delta,$

for any $\delta < r\sin(\alpha/2)$ . The cone condition also implies

$|Du(x)|\leq C,\quad |x| < \delta,$

where $C > 0$ only depends on $\alpha$ . Lemma 4.1 holds not just for $\mathcal{M}_0$ , but also for a family of locally convex hypersurfaces, with uniform $r$ and $\alpha$ .

For $2\leq k\leq n$ , denote

$f_{(k)}(\lambda) = \frac{\sigma_k}{\sigma_{k-1}}(\lambda)-\sum\limits_{l = 0}^{k-2}\alpha_l\frac{\sigma_l}{\sigma_{k-1}}(\lambda),$

where the $\alpha_l$ are positive constants. With the aid of Lemma 4.1, we use the Perron method to obtain a viscosity solution of the Plateau problem for the curvature function $f_{(n)}$, using the following lemma.

Lemma 4.2. Let $\Omega$ be a bounded domain in $\mathbb{R}^n$ with Lipschitz boundary. Let $\phi\in C^{0, 1}(\bar\Omega)$ be a $k$ -convex viscosity subsolution of

$\begin{equation} f_{(k)}(\lambda) = \frac{\sigma_k}{\sigma_{k-1}}(\lambda)-\sum\limits_{l = 0}^{k-2}\alpha_l\frac{\sigma_l}{\sigma_{k-1}}(\lambda) = c\quad {{{in}}}\,\Omega, \end{equation}$

(4.1)

where $\alpha_l > 0$ and $c\neq 0$ are all constants. Then there is a viscosity solution $u$ of (4.1) such that $u = \phi$ on $\partial \Omega$ .

Proof. The proof uses the well-known Perron method. Let $\Psi$ denote the set of $k$ -convex subsolutions $v$ of (4.1) with $v = \phi$ on $\partial \Omega$ . Then $\Psi$ is not empty and the required solution $u$ is given by

$u(x) = \sup\{v(x):v\in\Psi\}.$

It is a standard argument. The key ingredient that needs to be mentioned is the solvability of the Dirichlet problem

$\begin{equation} f_{(k)}(\lambda) = c\quad {\rm{in}}\, \, B_r,\quad\quad u = u_0\quad {\rm{on}}\, \, \partial B_r, \end{equation}$

(4.2)

in small enough balls $B_r\subset\mathbb{R}^n$ , if $u_0$ is any Lipschitz viscosity subsolution of (4.2). This is a consequence of ^[31] Theorem 6.2 with slight modification.

Using Lemma 4.2 and the argument of ^[32], we conclude that there is a locally convex hypersurface $\mathcal{M}$ with boundary $\Sigma$ which satisfies the equation $f_{(n)}(\lambda) = c$ in the viscosity sense; that is, for any point $p\in\mathcal{M}$ , if $\mathcal{M}$ is locally represented as the graph of a convex function $u$ (by Lemma 4.1), then $u$ is a viscosity solution of $f_{(n)}(\lambda) = c$ .

In the following we discuss the regularity of $\mathcal{M}$. The interior regularity follows in the same way as in ^[29].

Boundary regularity

The boundary regularity of $\mathcal{M}$ is a local property. The boundary estimates we need are contained in ^[19,21]. However, they cannot be applied directly to $\mathcal{M}$. Since we are working in a neighbourhood of a boundary point $p_0\in\mathcal{M}$, which we may take to be the origin, we may assume that for a smooth bounded domain $\Omega\subset\mathbb{R}^n$ with $0\in\partial \Omega$ and small enough $\rho > 0$ we have

$\mathcal{M}\cap(B_{\rho}\times\mathbb{R}) = {\rm{graph}}\,u,\quad\mathcal{M}_0\cap(B_{\rho}\times\mathbb{R}) = {\rm{graph}}\,u_0,$

where $u\in C^{\infty}(\Omega_{\rho})\cap C^{0, 1}(\bar\Omega_{\rho})$ , and $u_0\in C^{\infty}(\bar\Omega_{\rho})$ are $k$ -convex solutions of

$f_{(k)}[u] = c\quad {\rm{in}}\,\Omega_{\rho},\quad\quad f_{(k)}[u_0]\geq c\quad {\rm{in}}\,\Omega_{\rho},$

with

$u\geq u_0\quad{\rm{in}}\,\Omega_{\rho},\quad\quad u = u_0\quad{\rm{on}}\,\partial\Omega\cap B_{\rho}.$

We may choose the coordinate system in $\mathbb{R}^n$ in such a way that $\Omega$ is uniformly convex, and moreover, so that for some $\epsilon_0 > 0$ we have

$\begin{equation} \frac{\sigma_{k-1}(\kappa')}{\sigma_{k-2}(\kappa')}\geq \epsilon_0 > 0 \end{equation}$

(4.3)

on $\partial \Omega\cap B_{\rho}$ , where $\kappa' = (\kappa'_1, \cdots, \kappa'_{n-1})$ denotes the vector of principal curvatures of $\partial\Omega$ . We recall that the principal curvatures of graph( $u$ ) are the eigenvalues of the matrix

$(I-\frac{Du\otimes Du}{1+|Du|^2})(\frac{D^2u}{\sqrt{1+|Du|^2}}).$
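The matrix above can be evaluated directly for a graph whose curvatures are known. The sketch below is an illustration only: it takes the lower hemisphere $u(x) = -\sqrt{R^2-|x|^2}$, whose gradient and Hessian are written out by hand, and evaluates the matrix at one point; for this graph the matrix should be $\frac{1}{R}I$ up to rounding. The function names `curvature_matrix` and `lower_hemisphere` are hypothetical helpers introduced for this example.

```python
import math


def curvature_matrix(du, d2u):
    """(I - Du Du^T / (1+|Du|^2)) D^2u / sqrt(1+|Du|^2) for a graph x_{n+1} = u(x)."""
    n = len(du)
    w2 = 1.0 + sum(p * p for p in du)
    proj = [[(1.0 if i == j else 0.0) - du[i] * du[j] / w2 for j in range(n)]
            for i in range(n)]
    return [[sum(proj[i][l] * d2u[l][j] for l in range(n)) / math.sqrt(w2)
             for j in range(n)] for i in range(n)]


def lower_hemisphere(x, radius):
    """Gradient and Hessian of u(x) = -sqrt(radius^2 - |x|^2), for |x| < radius."""
    n = len(x)
    s = math.sqrt(radius ** 2 - sum(c * c for c in x))
    du = [c / s for c in x]
    d2u = [[(1.0 if i == j else 0.0) / s + x[i] * x[j] / s ** 3 for j in range(n)]
           for i in range(n)]
    return du, d2u


if __name__ == "__main__":
    du, d2u = lower_hemisphere([0.3, -0.4, 0.5], 2.0)
    for row in curvature_matrix(du, d2u):
        print([round(v, 12) for v in row])
```

Note that the matrix is not symmetric in general, but its eigenvalues are real because it is similar to a symmetric matrix; the appendix works with a symmetric matrix $A$ having the same eigenvalues.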

We denote $\sigma_k(p, r)$ as the $k$ -th elementary symmetric function of the eigenvalues of the matrix

$(I-\frac{p\otimes p}{1+|p|^2})r,\,p = (p_1,\cdots,p_n),\,r = (r_{ij})_{n\times n}.$

Let $f_{(k)}(p, r) = \frac{\sigma_k}{\sigma_{k-1}}(p, r)-\sum_{l = 0}^{k-2}\alpha_l(1+|p|^2)^{\frac{k-l}{2}}\frac{\sigma_l}{\sigma_{k-1}}(p, r)$ . $\lambda(r)$ is the vector formed by eigenvalues of $r$ . For any $p\in\mathbb{R}^n$ and symmetric matrices $r, s$ with $\lambda(r), \lambda(s)\in \Gamma_k$ , we have

$\begin{equation} \sum\limits_{i,j}\frac{\partial f_{(k)}}{\partial r_{ij}}(p,r)s_{ij}\geq f_{(k)}(p,s)+\sum\limits_{l = 0}^{k-2}(k-l)\alpha_l(1+|p|^2)^{\frac{k-l}{2}}\frac{\sigma_l}{\sigma_{k-1}}(p,r). \end{equation}$

(4.4)

For later purposes we note the simple estimate, if $r\geq 0$ ,

$\frac{1}{1+|p|^2}\sigma_k(0,r)\leq\sigma_k(p,r)\leq \sigma_k(0,r),$

and the development

$\sigma_k(p,r) = \frac{1+|\tilde p|^2}{1+|p|^2}r_{nn}\sigma_{k-1}(\tilde p,\tilde r)+O((|r_{st}|^k)_{(s,t)\neq (n,n)}),$

where $p = (p_1, \cdots, p_n)\in\mathbb{R}^n$ , $r = (r_{ij})_{n\times n}$ , $\tilde p = (p_1, \cdots, p_{n-1})\in\mathbb{R}^{n-1}$ , $\tilde r = (r_{ij})_{i, j = 1, \cdots, n-1}$ .
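The simple estimate $\frac{1}{1+|p|^2}\sigma_k(0,r)\leq\sigma_k(p,r)\leq\sigma_k(0,r)$ for $r\geq 0$ can be tested on random data. The following sketch is an illustration under stated assumptions: $r$ is generated as $g^Tg$ with integer $g$ (hence positive semi-definite), $\sigma_k$ of the non-symmetric product is computed as the sum of its $k\times k$ principal minors, and all arithmetic is exact. The helper names are not from the paper.

```python
from fractions import Fraction
from itertools import combinations
import random


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not m:
        return 1
    total = 0
    for j, entry in enumerate(m[0]):
        if entry == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def sigma(k, m):
    """sigma_k of the eigenvalues of m, as the sum of all k x k principal minors."""
    n = len(m)
    if k < 0 or k > n:
        return 0
    return sum(det([[m[i][j] for j in idx] for i in idx])
               for idx in combinations(range(n), k))


def sigma_pr(k, p, r):
    """sigma_k(p, r): sigma_k of the eigenvalues of (I - p p^T / (1+|p|^2)) r."""
    n = len(p)
    w2 = 1 + sum(Fraction(c) ** 2 for c in p)
    proj = [[Fraction(int(i == j)) - Fraction(p[i] * p[j]) / w2 for j in range(n)]
            for i in range(n)]
    product = [[sum(proj[i][l] * r[l][j] for l in range(n)) for j in range(n)]
               for i in range(n)]
    return sigma(k, product)


def check_simple_estimate(n=3, trials=50, seed=0):
    """True if the two-sided estimate holds on all random samples."""
    rng = random.Random(seed)
    zero = [0] * n
    for _ in range(trials):
        g = [[rng.randint(-3, 3) for _ in range(n)] for _ in range(n)]
        # r = g^T g is symmetric and positive semi-definite
        r = [[sum(g[l][i] * g[l][j] for l in range(n)) for j in range(n)]
             for i in range(n)]
        p = [rng.randint(-3, 3) for _ in range(n)]
        w2 = 1 + sum(c * c for c in p)
        for k in range(1, n + 1):
            s0, sp = sigma_pr(k, zero, r), sigma_pr(k, p, r)
            if not (s0 / w2 <= sp <= s0):
                return False
    return True


if __name__ == "__main__":
    print(sigma_pr(1, [1, 0], [[1, 0], [0, 1]]))
    print(check_simple_estimate())
```

For $p = (1,0)$ and $r = I$ in dimension two, the product matrix is $\mathrm{diag}(\frac12, 1)$, so the lower bound is attained for $k = 2$.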

We suppose that $\partial \Omega$ is the graph of $\omega:B_{\rho}^{n-1}(0)\subset\mathbb{R}^n\rightarrow \mathbb{R}$ and $u(\tilde x, \omega(\tilde x)) = \varphi(\tilde x)$ . Furthermore, $\omega(0) = 0$ , $D\omega(0) = 0$ , $D\varphi(0) = 0$ and $\omega$ is a strictly convex function of $\tilde x$ . The curvature equation is equivalent to

$\begin{equation} f_{(k)}(Du,D^2u) = c\sqrt{1+|Du|^2} \end{equation}$

(4.5)

defined in some domain $\Omega\subset\mathbb{R}^n$. We have the following boundary estimates for the second derivatives of $u$.

Lemma 4.3. Let $u\in C^3(\bar\Omega)$ be a $k$ -convex solution of (4.5). We assume (4.3) with $\epsilon > 0$ . Then the estimate

$\begin{equation} |D^2u(0)|\leq C(n,k,\alpha_l,c,\epsilon,||\omega||_{C^3},||\varphi||_{C^4},||u||_{C^1},\lambda_{\min}(D^2\omega(0))) \end{equation}$

(4.6)

holds true where $\lambda_{\min}$ denotes the smallest eigenvalue.

Remark 4.1. On $\partial \Omega$ , we have for $i, j = 1, \cdots, n-1$ ,

$\begin{align*} &u_i+u_n\omega_i = \varphi_i,\\ &u_{ij}+u_{in}\omega_j+u_{nj}\omega_i+u_{nn}\omega_i\omega_j+u_n\omega_{ij} = \varphi_{ij}. \end{align*}$

Therefore $|u_{ij}(0)| = |\varphi_{ij}(0)-u_n(0)\omega_{ij}(0)|\leq C$ . It remains to show that $|u_{in}(0)|\leq C$ and $|u_{nn}(0)|\leq C$ . We follow ^[19,21] to obtain mixed second derivative boundary estimates and double normal second derivative boundary estimate.

Proof. Let

$\Omega_{d,\kappa} = \{x = (\tilde x,x_n)\in\Omega\,|\,|\tilde x| < d,\ \omega(\tilde x) < x_n < \tilde\omega(\tilde x)+\frac{\kappa}{2}d^2\}$

where $0 < d < \rho$ , $\tilde\omega(\tilde x): = \omega(\tilde x)-\frac{\kappa}{2}|\tilde x|^2$ , and $\kappa > 0$ is chosen small enough such that $\tilde\omega$ is still strictly convex. We decompose $\partial\Omega_{d, \kappa} = \partial_1\Omega_{d, \kappa}\cup\partial_2\Omega_{d, \kappa}\cup\partial_3\Omega_{d, \kappa}$ with

$\begin{align*} &\partial_1\Omega_{d,\kappa} = \{x\in\partial\Omega_{d,\kappa}|x_n = \omega(\tilde x)\},\\ &\partial_2\Omega_{d,\kappa} = \{x\in\partial\Omega_{d,\kappa}|x_n = \omega(\tilde x)+\frac{\kappa}{2}d^2\},\\ &\partial_3\Omega_{d,\kappa} = \{x\in\partial\Omega_{d,\kappa}||\tilde x| = d\}. \end{align*}$

Our lower barrier function $v$ will be of the form

$\begin{equation} v(x) = \theta(\tilde x)+h(\rho(x)) \end{equation}$

(4.7)

where $\theta(\tilde x)$ is an arbitrary $C^2$ -function, $h(\rho) = \exp\{B\rho\}-\exp\{\kappa Bd^2\}$ and $\rho(x) = \kappa d^2+\tilde\omega(\tilde x)-x_n$ . Denote $F^{ij} = \frac{\partial f_{(k)}(Du, D^2u)}{\partial u_{ij}}$ .

Mixed second derivative boundary estimates

By (4.4) and Lemma 2.3, we have

$F^{ij}v_{ij}\geq f_{(k)}(Du,D^2v)+C$

where $C$ depends only on $n, k, \alpha_l$ 's, $c, ||Du||_{C^0}$ . We choose an orthonormal frame $\{b_i\}_{i = 1}^n$ with $b_n = -\frac{D\rho}{|D\rho|}$ and denote $v_{(s)} = \frac{\partial v}{\partial b_s}$ . Directly, we have

$\begin{align*} &v_{(s)} = \theta_{(s)}+h'\rho_{(s)},\,(1\leq s\leq n-1);\quad v_{(n)} = \theta_{(n)}-h'\sqrt{1+|D\tilde \omega|^2};\\ &v_{(st)} = \theta_{(st)}+h'\tilde\omega_{(st)},\,(s,t)\neq (n,n);\\ &v_{(nn)} = \theta_{(nn)}+h'\tilde \omega_{(nn)}+h''(1+|D\tilde\omega|^2). \end{align*}$

We may choose $d$ small so that $|Du|$ is also small. Note that $|D\tilde\omega|$ is small since we can choose $d, \kappa$ small. By choosing $B$ large enough, we calculate

$\begin{align*} f_{(k)}(Du,D^2v) = &\frac{\sigma_k}{\sigma_{k-1}}(Du,D^2v)-\sum\limits_{l = 0}^{k-2}\alpha_l(1+|Du|^2)^{\frac{k-l}{2}}\frac{\sigma_l}{\sigma_{k-1}}(Du,D^2v)\\ \geq & (1-\epsilon)\frac{\sigma_k}{\sigma_{k-1}}(0,D^2v)-2\sum\limits_{l = 0}^{k-2}\alpha_l\frac{\sigma_l}{\sigma_{k-1}}(0,D^2v)\\ \geq & (1-\epsilon)^2h'\frac{\sigma_{k-1}}{\sigma_{k-2}}(0,\tilde \omega_{(st)})-2\sum\limits_{l = 1}^{k-2}\alpha_l (h')^{l-k+1}\frac{\sigma_{l-1}}{\sigma_{k-2}}(0,\tilde \omega_{(st)})-o(B^{-1}) \end{align*}$

where in the last line, $1\leq s, t\leq n-1$ . Finally, we see that for large enough $B$ and small enough $d$ and $\kappa$ the estimate

$(1-\delta)h'\leq |Dv|\leq (1+\delta)h'$

is valid for small $\delta$ . Therefore

$\begin{equation} F^{ij}v_{ij}\geq (1-\epsilon)\frac{\sigma_{k-1}}{\sigma_{k-2}}(0,\tilde \omega_{(st)})|Dv|+C. \end{equation}$

(4.8)

Let $\tau$ be a $C^2$ -smooth vector field which is tangential along $\partial \Omega$ . Following ^[19,21] we then introduce the function

$w = 1-\exp(-a\tilde w)-b|x|^2$

where $\tilde w = u_{\tau}-\frac{1}{2}\sum_{s = 1}^{n-1}u_s^2$ and $a, b$ are positive constants. Since $u = \varphi$ on $\partial_1\Omega_{d, \kappa}$, we have

$w|_{\partial_1\Omega_{d,\kappa}}\geq a\varphi_{\tau}-c|\tilde x|^2,\,w(0) = 0,\,w|_{\partial_2\Omega_{d,\kappa}\cup\partial_3\Omega_{d,\kappa}}\geq-M$

for suitable constants $c, M$ depending on $a, b, ||u||_{C^1}$ and $||\varphi||_{C^1}$ . By differentiation of Eq (4.5), we obtain

$F^{ij}u_{ijp}+F^iu_{ip} = c\bar v_p$

where $F^i: = \frac{\partial f_{(k)}}{\partial u_i}$ and $\bar v: = \sqrt{1+|Du|^2}$ .

$\begin{align} F^{ij}\tilde w_{ij} = &F^{ij}u_{ijp}\tau_p+F^{ij}(u_{pj}\tau_{pi}+u_{pi}\tau_{pj})+F^{ij}\tau_{ijp}u_p-\\&\sum\limits_{s = 1}^{n-1}F^{ij}(u_{is}u_{js}+u_{sij}u_s)\\ = &c(\bar v_p\tau_p-\sum\limits_{s = 1}^{n-1}\bar v_su_s)-F^iu_{ip}\tau_p+\\&\sum\limits_{s = 1}^{n-1}F^iu_{is}u_s+F^{ij}(u_{pj}\tau_{pi}+u_{pi}\tau_{pj})\\ &+F^{ij}\tau_{ijp}u_p-\sum\limits_{s = 1}^{n-1}F^{ij}u_{is}u_{js}. \end{align}$

(4.9)

By the definition of $\tilde w$ , we have

$\begin{equation} c(\bar v_p\tau_p-\sum\limits_{s = 1}^{n-1}\bar v_su_s) = \frac{c}{\bar v}\big(\langle D\tilde w,Du\rangle-\rm{Hess}(\tau)(Du,Du)\big). \end{equation}$

(4.10)

Then we compute $F^i$ . Denote $b_{ij} = \delta_{ij}-\frac{u_iu_j}{\bar v^2}$ and $c_{ij} = b_{ip}u_{pj}$ . $f_{(k)}$ can be rewritten as

$f_{(k)} = f_{(k)}(c_{ij},\bar v) = \frac{\sigma_k}{\sigma_{k-1}}(c_{ij})-\sum\limits_{l = 0}^{k-2}\alpha_l\bar v^{k-l}\frac{\sigma_l}{\sigma_{k-1}}(c_{ij}).$

Directly we have

$\begin{align*} F^i = &\frac{\partial f_{(k)}}{\partial u_i} = \frac{\partial f_{(k)}}{\partial c_{pq}}\frac{\partial c_{pq}}{\partial u_i}+\frac{\partial f_{(k)}}{\partial \bar v}\frac{\partial \bar v}{\partial u_i}\\ = &-\frac{1}{\bar v^2}f_{(k)}^{iq}u_{ql}u_l-\frac{1}{\bar v^2}f_{(k)}^{pq}u_{iq}u_p+\frac{2}{\bar v^3}f_{(k)}^{pq}u_pu_lu_{lq}u_i \\&-\sum\limits_{l = 0}^{k-2}\alpha_l(k-l)\bar v^{k-l-2}\frac{\sigma_l}{\sigma_{k-1}}(c_{ij})u_i \end{align*}$

where $f_{(k)}^{pq}: = \frac{\partial f_{(k)}}{\partial c_{pq}}$ . Therefore

$\begin{eqnarray} -F^iu_{ip}\tau_p+\sum\limits_{s = 1}^{n-1}F^iu_{is}u_s = \Big(-\frac{1}{\bar v^2}f_{(k)}^{iq}u_{ql}u_l-\frac{1}{\bar v^2}f_{(k)}^{pq}u_{iq}u_p+\frac{2}{\bar v^3}f_{(k)}^{pq}u_pu_lu_{lq}u_i\\ -\sum\limits_{l = 0}^{k-2}\alpha_l(k-l)\bar v^{k-l-2}\frac{\sigma_l}{\sigma_{k-1}}(c_{ij})u_i\Big)(-\tilde w_i+u_p\tau_{pi}). \end{eqnarray}$

(4.11)

In order to estimate the right-hand side of (4.11), we use the same coordinate system as ^[21], which corresponds to the projection of the principal curvature directions of the graph of $u$ onto $\mathbb{R}^n\supset\Omega$. Fixing a point $y\in\Omega$, we choose a basis of eigenvectors $\hat e_1, \cdots, \hat e_n$ of the matrix $(c_{ij})$ at $y$, corresponding to the eigenvalues $\lambda_1, \cdots, \lambda_n$ and orthonormal with respect to the inner product given by the matrix $I+Du\otimes Du$. We use a subscript $\alpha$ to denote differentiation with respect to $\hat e_{\alpha}$, $\alpha = 1, \cdots, n$, so that

$u_{\alpha} = \hat e_{\alpha}^iu_i = \langle Du,\hat e_{\alpha}\rangle,\quad u_{\alpha\alpha} = \lambda_{\alpha} = \hat e_{\alpha}^i\hat e_{\alpha}^ju_{ij}.$

Then we obtain

$\begin{align*} \frac{1}{\bar v^2}f_{(k)}^{iq}u_{ql}u_l(\tilde w_i-u_p\tau_{pi}) = &\frac{1}{\bar v^2}\frac{\partial f_{(k)}}{\partial \lambda_{\alpha}}\lambda_{\alpha}u_{\alpha}(\tilde \omega_{\alpha}-\rm{Hess}(\tau)(Du,\hat e_{\alpha}))\\ \leq& \delta \frac{\partial f_{(k)}}{\partial \lambda_{\alpha}}\lambda_{\alpha}^2+C(\delta)\frac{\partial f_{(k)}}{\partial \lambda_{\alpha}}\tilde w_{\alpha}^2+C(\delta)\sum\limits_{\alpha = 1}^n\frac{\partial f_{(k)}}{\partial \lambda_{\alpha}}. \end{align*}$

The second term of (4.11) can be estimated in the same way as above. As for the third term of (4.11), we calculate as

$|f_{(k)}^{pq}u_pu_lu_{lq}| = |f_{(k)}^{pq}(u_{pq}-c_{pq})|\leq C|Du|^2.$

Thus

$\begin{equation} -F^iu_{ip}\tau_p+\sum\limits_{s = 1}^{n-1}F^iu_{is}u_s\leq 2\delta \frac{\partial f_{(k)}}{\partial \lambda_{\alpha}}\lambda_{\alpha}^2 \\+C(\delta)\frac{\partial f_{(k)}}{\partial \lambda_{\alpha}}\tilde w_{\alpha}^2+C(\delta)\sum\limits_{\alpha = 1}^n\frac{\partial f_{(k)}}{\partial \lambda_{\alpha}}+C|\tilde w_iu_i-\tau_{ij}u_iu_j|. \end{equation}$

(4.12)

Let $(\eta_i^{\alpha})$ denote the inverse matrix of $(\hat e^i_{\alpha})$; then we can write

$u_{s\alpha} = \hat e^{i}_{\alpha}u_{is} = \lambda_{\alpha}\eta^{\alpha}_s.$

Furthermore,

$\sum\limits_{s = 1}^{n-1}\frac{\partial f_{(k)}}{\partial \lambda_{\alpha}}u_{s\alpha}^2 = \frac{\partial f_{(k)}}{\partial \lambda_{\alpha}}\lambda_{\alpha}^2\sum\limits_{s = 1}^{n-1}(\eta^{\alpha}_s)^2.$

Now we reason similarly to ^[21]. If for all $\alpha = 1, \cdots, n$ , we have

$\begin{equation} \sum\limits_{s = 1}^{n-1}(\eta^{\alpha}_s)^2\geq \epsilon > 0 \end{equation}$

(4.13)

where $\epsilon$ is a small positive number, then we clearly have

$\begin{equation} \sum\limits_{s = 1}^{n-1}\frac{\partial f_{(k)}}{\partial \lambda_{\alpha}}u_{s\alpha}^2\geq \epsilon \frac{\partial f_{(k)}}{\partial \lambda_{\alpha}}\lambda_{\alpha}^2. \end{equation}$

(4.14)

On the other hand, if (4.13) is not true, then

$\sum\limits_{s = 1}^{n-1}(\eta^{\gamma}_s)^2 < \epsilon$

for some $\gamma$ , which implies

$\sum\limits_{s = 1}^{n-1}(\eta^{\alpha}_s)^2\geq \delta_0 > 0$

for all $\alpha\neq \gamma$ . Hence

$\begin{equation} \sum\limits_{s = 1}^{n-1}\frac{\partial f_{(k)}}{\partial \lambda_{\alpha}}u_{s\alpha}^2\geq \delta_0\sum\limits_{\alpha\neq \gamma} \frac{\partial f_{(k)}}{\partial \lambda_{\alpha}}\lambda_{\alpha}^2. \end{equation}$

(4.15)

Then we use Theorems 3 and 4 in ^[22] to deduce that

$\begin{align*} &\sum\limits_{\alpha\neq \gamma}(\frac{\sigma_k}{\sigma_{k-1}})_{,\alpha}\lambda_{\alpha}^2\geq\frac{1}{C(n,k)}(\frac{\sigma_k}{\sigma_{k-1}})_{,\alpha}\lambda_{\alpha}^2,\\ &\sum\limits_{\alpha\neq \gamma}(-\frac{\sigma_l}{\sigma_{k-1}})_{,\alpha}\lambda_{\alpha}^2\geq\frac{1}{C(n,k,l)}(-\frac{\sigma_l}{\sigma_{k-1}})_{,\alpha}\lambda_{\alpha}^2,\\ &\sum\limits_{\alpha\neq \gamma}(-\frac{1}{\sigma_{k-1}})_{,\alpha}\lambda_{\alpha}^2\geq\frac{1}{C(n,k,0)}(-\frac{1}{\sigma_{k-1}})_{,\alpha}\lambda_{\alpha}^2-\frac{1}{C(n,k,0)}\frac{\sigma_1}{\sigma_{k-1}} \end{align*}$

where subscript ' $, \alpha$ ' denotes differentiation with respect to $\lambda_{\alpha}$ . Therefore,

$\begin{equation} \sum\limits_{s = 1}^{n-1}\frac{\partial f_{(k)}}{\partial \lambda_{\alpha}}u_{s\alpha}^2\geq \delta'\frac{\partial f_{(k)}}{\partial \lambda_{\alpha}}\lambda_{\alpha}^2-C. \end{equation}$

(4.16)

Combining (4.9), (4.10), (4.12) and (4.16), we have

$\begin{equation} F^{ij}\tilde w_{ij}\leq C|\langle D\tilde w,Du\rangle|+CF^{ij}\tilde w_i\tilde w_j+C\sum\limits_{i = 1}^nF^{ii} \end{equation}$

(4.17)

where we have chosen $\delta \ll \delta'$, so that $\frac{\partial f_{(k)}}{\partial \lambda_{\alpha}}\lambda_{\alpha}^2$ can be discarded. Note that in (4.17), we have also used the fact that $\sum_{i = 1}^nF^{ii}\geq C_0 > 0$. By choosing $a, b$ large, we conclude that

$\begin{equation} F^{ij}w_{ij}\leq C|\langle Dw,Du\rangle|. \end{equation}$

(4.18)

From (4.8) and (4.18), by the comparison principle, we have at $0$,

$u_{\tau n}(0) = \frac{1}{a}w_n(0)\geq \frac{1}{a}v_n(0).$

Since $\tau$ is an arbitrary tangential direction at $0\in\partial\Omega$ , if we replace $\tau$ by $-\tau$ , we get an upper bound for $u_{\tau n}(0)$ .

Double normal second derivative boundary estimate

We turn to estimate $|u_{nn}(0)|$ . The idea is to estimate $u_{nn}$ in a first step at some optimally chosen point $y$ and in a second step conclude from this the estimate in the given point. We introduce a smooth moving orthonormal frame $\{b_1, \cdots, b_n\}$ with $b_n = (-\omega_{\tilde x}, 1)/\sqrt{1+|\omega_{\tilde x}|^2}$ being the upward normal to $\partial\Omega$ . Here $\omega_{\tilde x}$ is the gradient of $\omega(\tilde x)$ . Let

$G = \frac{\sigma_{k-1}}{\sigma_{k-2}}(u_{(\tilde x)},u_{(\tilde x\tilde x)})-\sum\limits_{l = 1}^{k-2}\alpha_l\sqrt{1+|Du|^2}^{k-l}\frac{\sigma_{l-1}}{\sigma_{k-2}}(u_{(\tilde x)},u_{(\tilde x\tilde x)})-c\sqrt{1+|Du|^2}$

on $\partial\Omega$ , where $u_{(\tilde x)} = (\frac{\partial u}{\partial b_1}, \cdots, \frac{\partial u}{\partial b_{n-1}})$ and $u_{(\tilde x\tilde x)} = (\frac{\partial^2 u}{\partial b_i\partial b_j})_{1\leq i, j\leq n-1}$ . For simplicity, we denote $\tilde p = u_{(\tilde x)}$ , $\tilde r = u_{(\tilde x\tilde x)}$ , $\bar v = \sqrt{1+|Du|^2}$ . First we observe that

$f_{(k)}(p,r) < \lim\limits_{r_{nn}\rightarrow +\infty}f_{(k)}(p,r) = \frac{\sigma_{k-1}}{\sigma_{k-2}}(\tilde p,\tilde r)-\sum\limits_{l = 1}^{k-2}\alpha_l\bar v^{k-l}\frac{\sigma_{l-1}}{\sigma_{k-2}}(\tilde p,\tilde r)$

from which we see that $G > 0$. Hence the function

$\tilde G = G(x)+\frac{4|\tilde x|^2}{\bar \rho^2}\bar G$

with $\bar G = \max\{G(x)|x\in\partial\Omega, \, |\tilde x| < \rho\}$ and $0 < \bar\rho < \rho$ attains its minimum over $\partial\Omega\cap B_{\rho}(0)$ at some point $y\in\partial\Omega\cap B_{\bar\rho/2}(0)$ . If $|u_{nn}(y)| < C$ , then $G(y) > C^{-1} > 0$ .

$G(0) = \tilde G(0)\geq \tilde G(y) > G(y) > C^{-1} > 0.$

Therefore $G(0)$ is strictly positive and we have

$|u_{nn}(0)| < +\infty.$

To check that $|u_{nn}(y)| < +\infty$ , we proceed in essentially the same way as in mixed second derivative estimates. The point $y$ plays the role of the origin and the function $\tilde w$ is defined as

$\tilde w(x) = -(u_n(x)-u_n(y))-K|Du(x)-Du(y)|^2$

where $K$ is a sufficiently big constant. In order to apply the comparison principle, we need to obtain that

$w(x)\geq \tilde \theta(\tilde x)-C|\tilde x-\tilde y|^2\quad (x\in\partial\Omega\cap B_{\rho}(0))$

where $\tilde \theta$ is some $C^2$ -smooth function. We reason similarly to Lemma 2.5 in ^[19]. The choice of the moving frame gives

$u_{(s)} = \varphi_{(s)},\quad u_{(st)} = \varphi_{(st)}-u_n\omega_{(st)}\quad (s,t = 1,\cdots,n-1).$

By the concavity of $\frac{\sigma_{k-1}}{\sigma_{k-2}}(\tilde p, \tilde r)$ , $-\frac{\sigma_{l}}{\sigma_{k-2}}(\tilde p, \tilde r)(l = 0, \cdots, k-3)$ in $\tilde r$ and the convexity of $\sqrt{1+|\tilde p|^2}$ in $\tilde p$ , we compute

$\begin{equation} 0\leq \tilde G(x)-\tilde G(y)\leq g(y,x)(u_n(y)-u_n(x))+h(y,x) \end{equation}$

(4.19)

with

$\begin{align*} g(y,x) = &(\frac{\sigma_{k-1}}{\sigma_{k-2}})^{st}(\tilde p(x),\tilde r(y))\omega_{(st)}(x)-\\ &\sum\limits_{l = 1}^{k-2}\alpha_l\bar v^{k-l}(x)(\frac{\sigma_{l-1}}{\sigma_{k-2}})^{st}(\tilde p(x),\tilde r(y))\omega_{(st)}(x)\\ &+\Big(\sum\limits_{l = 1}^{k-2}\alpha_l(k-l)\bar v^{k-l-2}(y)\frac{\sigma_{l-1}}{\sigma_{k-2}}(\tilde p(y),\tilde r(y))+c\bar v^{-1}(y)\Big)\\ &(u_n(y)-u_i(y)\omega_i(x)) \end{align*}$

and

$\begin{align*} h(y,x) = &\frac{\sigma_{k-1}}{\sigma_{k-2}}(\varphi_{\tilde x}(x),\tilde r(y))-\frac{\sigma_{k-1}}{\sigma_{k-2}}(\varphi_{\tilde x}(y),\tilde r(y))\\ &+\big(\frac{\sigma_{k-1}}{\sigma_{k-2}}\big)^{st}(\varphi_{\tilde x}(x),\tilde r(y))\Psi_{st}(y,x)\\ &-\sum\limits_{l = 1}^{k-2}\alpha_l\bar v^{k-l}\big(\frac{\sigma_{l-1}}{\sigma_{k-2}}\big)^{st}(\varphi_{\tilde x}(x),\tilde r(y))\Psi_{st}(y,x)\\ &+\sum\limits_{l = 1}^{k-2} \alpha_l\bar v^{k-l}\Big(\frac{\sigma_{l-1}}{\sigma_{k-2}}(\varphi_{\tilde x}(y),\tilde r(y))-\frac{\sigma_{l-1}}{\sigma_{k-2}}(\varphi_{\tilde x}(x),\tilde r(y))\Big)\\ &+[c\bar v^{-1}-\sum\limits_{l = 1}^{k-2}\alpha_l(k-l)\bar v^{k-l-2}\frac{\sigma_{l-1}}{\sigma_{k-2}}(\tilde p(y),\tilde r(y))]\\ &\cdot A+\frac{4\bar G}{\bar \rho^2}(|\tilde x|^2-|\tilde y|^2) \end{align*}$

where $\Psi_{st}(y, x) = \varphi_{(st)}(x)-\varphi_{(st)}(y)-u_n(y)(\omega_{(st)}(x)-\omega_{(st)}(y))$ and $A = [\varphi_i(y)-\varphi_i(x)-u_n(y)(\omega_i(y)-\omega_i(x))]u_i(y)$. We may take $\tilde \theta(\tilde x) = -\frac{h}{g}(y, x)$ if we can show that $g(y, x) > 0$. This holds because $|Du|$ is small, $-(\frac{\sigma_{l-1}}{\sigma_{k-2}})^{st}$ is positive semi-definite, and condition (4.3) is satisfied. This completes the proof of the boundary regularity.

Acknowledgments

The authors were supported by NSFC, grant nos. 12031017 and 11971424.

Conflict of interest

The authors declare no conflict of interest.

A. Appendix. Proof of Lemma 3.3

In this appendix, we sketch the proof of Lemma 3.3 for the reader's convenience. For the original proof, see ^[36].

Without loss of generality, we assume $\Omega = B_r(0)$. Let $\rho = r^2-|x|^2$, $M = {\rm{osc}}_{B_r}u$, $\tilde g(u) = \frac{1}{M}(M+u-\inf_{B_r} u)$ and $\phi(x, \xi) = \rho(x)\tilde g(u)\log(u_{\xi}(x))$. This auxiliary function $\phi$ comes from ^[34]. Suppose $\phi$ attains its maximum at $(x_0, e_1)$. Furthermore, by rotating $e_2, \cdots, e_n$, we can assume that $\{u_{ij}(x_0)\}_{2\leq i, j\leq n}$ is diagonal. Thus $\varphi(x) = \log\rho(x)+\log\tilde g(u(x))+\log\log u_1$ also attains a local maximum at $x_0\in B_r(0)$. At $x_0$, we have

$\begin{equation} 0 = \varphi_i = \frac{\rho_i}{\rho}+\frac{\tilde g_i}{\tilde g}+\frac{u_{1i}}{u_1\log u_1}, \end{equation}$

(A.1)

$\begin{equation} 0\geq \varphi_{ij} = \frac{\rho_{ij}}{\rho}-\frac{\rho_i\rho_j}{\rho^2}+\frac{\tilde g_{ij}}{\tilde g}-\frac{\tilde g_i\tilde g_j}{\tilde g^2}+\frac{u_{1ij}}{u_1\log u_1}-(1+\frac{1}{\log u_1})\frac{u_{1i}u_{1j}}{u_1^2\log u_1}. \end{equation}$

(A.2)

In this proof only, we write $F^{ij}: = \frac{\partial F}{\partial u_{ij}}$; the matrix $(F^{ij})$ is positive definite. Contracting $\varphi_{ij}$ with $F^{ij}$ and using (A.1), we have

$\begin{align} 0&\geq F^{ij}\varphi_{ij}\\ & = F^{ij}\Big(\frac{\rho_{ij}}{\rho}+2\frac{\rho_i}{\rho}\frac{\tilde g_j}{\tilde g}+\frac{\tilde g_{ij}}{\tilde g}\Big)+F^{ij}\Big(\frac{u_{1ij}}{u_1\log u_1}-(1+\frac{2}{\log u_1})\frac{u_{1i}u_{1j}}{u_1^2\log u_1}\Big)\\ &: = \mathcal{A}+\mathcal{B}. \end{align}$

(A.3)

It is well-known that the principal curvatures of graph $u$ are the eigenvalues of matrix $A = (a_{ij})_{n\times n}$ :

$a_{ij} = \frac{1}{W}\Big(u_{ij}-\frac{u_iu_lu_{lj}}{W(W+1)}-\frac{u_ju_lu_{li}}{W(W+1)}+\frac{u_iu_ju_pu_qu_{pq}}{W^2(W+1)^2}\Big)$

where $W = \sqrt{1+|Du|^2}$ . Next we compute $F^{ij}$ at $x_0$ .

$\begin{equation*} \frac{\partial a_{ij}}{\partial u_{ij}} = \left\{ \begin{array}{cl} \frac{1}{W^3} & i = j = 1,\\ \frac{1}{W^2} & i = 1,j\geq 2\,\rm{or}\,i\geq 2,j = 1,\\ \frac{1}{W} & i\geq 2,j\geq 2. \end{array} \right. \end{equation*}$

For two different sets $\{p, q\}\neq\{i, j\}$ , $\frac{\partial a_{pq}}{\partial u_{ij}} = 0$ . Therefore

$\begin{equation*} F^{ij} = \frac{\partial F}{\partial a_{ij}}\frac{\partial a_{ij}}{\partial u_{ij}} = \left\{ \begin{array}{cl} \frac{1}{W^3}\frac{\partial F}{\partial a_{11}} & i = j = 1,\\ \frac{1}{W^2}\frac{\partial F}{\partial a_{ij}} & i = 1,j\geq 2\,\rm{or}\,i\geq 2,j = 1,\\ \frac{1}{W}\frac{\partial F}{\partial a_{ij}} & i\geq 2,j\geq 2. \end{array} \right. \end{equation*}$
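The coefficients $\frac{1}{W^3}$, $\frac{1}{W^2}$ and $\frac{1}{W}$ follow from the formula for $a_{ij}$ once $Du = (u_1, 0, \cdots, 0)$ at $x_0$, because then $1-\frac{u_1^2}{W(W+1)} = \frac{1}{W}$. The sketch below checks this numerically; it is an illustration only, with `a_matrix` a hypothetical helper that transcribes the formula for $a_{ij}$ given above and with arbitrarily chosen sample values for $Du$ and $D^2u$.

```python
import math


def a_matrix(du, d2u):
    """Symmetric matrix A = (a_ij) from the formula in the text, W = sqrt(1+|Du|^2)."""
    n = len(du)
    w = math.sqrt(1.0 + sum(p * p for p in du))
    # contractions u_l u_{lj} and u_p u_q u_{pq}
    hu = [sum(du[l] * d2u[l][j] for l in range(n)) for j in range(n)]
    uhu = sum(du[j] * hu[j] for j in range(n))
    c = 1.0 / (w * (w + 1.0))
    return [[(d2u[i][j] - c * du[i] * hu[j] - c * du[j] * hu[i]
              + c * c * du[i] * du[j] * uhu) / w
             for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    du = [2.0, 0.0, 0.0]                      # Du points in the e_1 direction
    d2u = [[-1.0, 0.5, 0.25],
           [0.5, 3.0, 0.0],
           [0.25, 0.0, 4.0]]                  # lower-right block is diagonal
    w = math.sqrt(1.0 + du[0] ** 2)
    a = a_matrix(du, d2u)
    print(a[0][0], d2u[0][0] / w ** 3)        # a_11 = u_11 / W^3
    print(a[0][1], d2u[0][1] / w ** 2)        # a_1j = u_1j / W^2 for j >= 2
    print(a[1][1], d2u[1][1] / w)             # a_ij = u_ij / W for i, j >= 2
```

Since $a_{ij}$ is linear in the second derivatives, these ratios are exactly the partial derivatives $\frac{\partial a_{ij}}{\partial u_{ij}}$ listed above.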

Direct computation shows that

$\mathcal{A} = \frac{-2}{\rho}(\sum\limits_{i = 1}^n\frac{\partial F}{\partial u_{ii}})+\frac{1}{M\tilde g}(\sum\limits_{i,j = 1}^n\frac{\partial F}{\partial u_{ij}}\cdot u_{ij})+\frac{2u_1}{M\rho\tilde g}\sum\limits_{i = 1}^n\frac{\partial F}{\partial u_{1i}}\cdot \rho_i,$

$\sum\limits_{i = 1}^n\frac{\partial F}{\partial u_{ii}} = \frac{\partial F}{\partial a_{11}}\frac{1}{W^3}+\sum\limits_{i = 2}^n \frac{\partial F}{\partial a_{ii}}\frac{1}{W}\leq \frac{1}{W}\sum\limits_{i = 1}^n\frac{\partial F}{\partial a_{ii}},$

$\sum\limits_{i,j = 1}^n\frac{\partial F}{\partial u_{ij}}\cdot u_{ij} = \sum\limits_{i,j = 1}^n\frac{\partial F}{\partial a_{ij}}\cdot a_{ij} = \frac{\sigma_k}{\sigma_{k-1}}(A)-\sum\limits_{l = 0}^{k-2}\alpha_l(l-k+1)\frac{\sigma_l}{\sigma_{k-1}}(A).$

Suppose that $u_1\gg 1$. Then (A.1) gives $u_{11} < 0$, and

$\begin{align*} \frac{2u_1}{M\rho\tilde g}\sum\limits_{i = 1}^n\frac{\partial F}{\partial u_{1i}}\cdot \rho_i& = \frac{2u_1}{M\rho\tilde g}(\frac{\partial F}{\partial u_{11}}\rho_1+\sum\limits_{i = 2}^n\frac{\partial F}{\partial u_{1i}}\rho_i)\\ & = \frac{2u_1}{M\rho\tilde g}(\frac{\partial F}{\partial a_{11}}\frac{\rho_1}{W^3}-\sum\limits_{i = 2}^n\frac{\partial F}{\partial a_{1i}}\frac{u_{1i}\rho}{W^2u_1\log u_{1}})\\ &\geq -\frac{4ru_1}{MW^3\rho\tilde g}\frac{\partial F}{\partial a_{11}}-\frac{2}{M\tilde g\log u_1}\sum\limits_{i = 2}^n\frac{\partial F}{\partial a_{1i}}a_{1i}\\ &\geq -\frac{4ru_1}{MW^3\rho\tilde g}\sum\limits_{i = 1}^n\frac{\partial F}{\partial a_{ii}} \end{align*}$

where we have used (3.1). Therefore

$\begin{equation} \mathcal{A}\geq (-\frac{2}{W\rho}-C\frac{u_1}{W^3})(\sum\limits_{i = 1}^n\frac{\partial F}{\partial a_{ii}})+\frac{1}{M\tilde g}(-g+\sum\limits_{l = 0}^{k-2}\alpha_l(k-l)\frac{\sigma_l}{\sigma_{k-1}}(A)). \end{equation}$

(A.4)

We now turn to estimating $\mathcal{B}$. By the definition of $a_{ij}$, we have at $x_0$,

$\frac{\partial a_{11}}{\partial x_1} = \frac{1}{W^3}u_{111}-\frac{3u_1}{W^5}u_{11}^2-\frac{2u_1}{W^3(W+1)}\sum\limits_{k = 2}^nu_{k1}^2,$

for $i\geq 2$ ,

$\frac{\partial a_{1i}}{\partial x_1} = \frac{1}{W^2}u_{1i1}-\frac{2u_1}{W^4}u_{11}u_{1i}-\frac{u_1}{W^2(W+1)}u_{1i}u_{ii}-\frac{u_1}{W^3(W+1)}u_{11}u_{1i},$

$\frac{\partial a_{ii}}{\partial x_1} = \frac{1}{W}u_{ii1}-\frac{u_1}{W^3}u_{11}u_{ii}-\frac{2u_1}{W^2(W+1)}u_{1i}^2,$

for $i\geq 2, j\geq 2, i\neq j$ ,

$\frac{\partial a_{ij}}{\partial x_1} = \frac{1}{W}u_{ij1}-2\frac{u_{i1}u_{j1}u_1}{W^2(W+1)}.$

Taking derivatives with respect to $x_1$ on both sides of (1.4), we have

$\sum\limits_{i,j = 1}^n\frac{\partial F}{\partial a_{ij}}\frac{\partial a_{ij}}{\partial x_1}-\sum\limits_{l = 0}^{k-2}(\alpha_l)_{,1}\frac{\sigma_{l}}{\sigma_{k-1}}(A) = -g_{,1}.$

For the first term of $\mathcal{B}$, we compute

$\sum\limits_{i,j = 1}^n\frac{\partial F}{\partial u_{ij}}\frac{u_{ij1}}{u_1\log u_1} = \frac{1}{u_1\log u_1}\Big(\frac{\partial F}{\partial a_{11}}\frac{u_{111}}{W^3}+2\sum\limits_{i\geq 2}\frac{\partial F}{\partial a_{1i}}\frac{u_{1i1}}{W^2}+\sum\limits_{i,j\geq 2}\frac{\partial F}{\partial a_{ij}}\frac{u_{ij1}}{W}\Big),$

$\begin{align*} F^{ij}u_{ij1} = &\frac{\partial F}{\partial a_{11}}\Big(\frac{\partial a_{11}}{\partial x_1}+\frac{3u_1}{W^5}u_{11}^2+\frac{2u_1}{W^3(W+1)}\sum\limits_{k\geq 2} u_{k1}^2\Big)\\ &+2\sum\limits_{i\geq 2}\frac{\partial F}{\partial a_{1i}}\Big(\frac{\partial a_{1i}}{\partial x_1}+\frac{2u_1}{W^4}u_{11}u_{1i}\\ &+\frac{u_1u_{1i}u_{ii}}{W^2(W+1)}+\frac{u_1u_{11}u_{1i}}{W^3(W+1)}\Big)\\ &+\sum\limits_{i\neq j\geq 2}\frac{\partial F}{\partial a_{ij}}\Big(\frac{\partial a_{ij}}{\partial x_1}+2\frac{u_1u_{i1}u_{j1}}{W^2(W+1)}\Big)\\ &+\sum\limits_{i = 2}^n\frac{\partial F}{\partial a_{ii}}\Big(\frac{\partial a_{ii}}{\partial x_1}+\frac{u_1u_{11}u_{ii}}{W^3}+\frac{2u_1u_{1i}^2}{W^2(W+1)}\Big)\\ = &-g_{,1}+\sum\limits_{l = 0}^{k-2}(\alpha_l)_{,1}\frac{\sigma_{l}}{\sigma_{k-1}}(A)+\frac{u_1u_{11}}{W^2}\Big(-g+\sum\limits_{l = 0}^{k-2}(k-l)\alpha_l\frac{\sigma_l}{\sigma_{k-1}}(A)\Big)\\ &+\frac{\partial F}{\partial a_{11}}\Big(\frac{2u_1}{W^5}u_{11}^2+\frac{2u_1}{W^3(W+1)}\sum\limits_{k\geq 2} u_{k1}^2\Big)\\ &+2\sum\limits_{i\geq 2}\frac{\partial F}{\partial a_{1i}}\Big(\frac{u_1}{W^4}u_{11}u_{1i}+\frac{u_1u_{1i}u_{ii}}{W^2(W+1)}\\ &+\frac{u_1u_{11}u_{1i}}{W^3(W+1)}\Big)+2\sum\limits_{i\neq j\geq 2}\frac{\partial F}{\partial a_{ij}}\frac{u_1u_{i1}u_{j1}}{W^2(W+1)}\\ &+2\sum\limits_{i = 2}^n\frac{\partial F}{\partial a_{ii}}\frac{u_1u_{1i}^2}{W^2(W+1)}. \end{align*}$

For the second term of $\mathcal{B}$ , we calculate

$\sum\limits_{i,j = 1}^n\frac{\partial F}{\partial u_{ij}}u_{1i}u_{1j} = \frac{\partial F}{\partial a_{11}}\frac{u_{11}^2}{W^3}+2\sum\limits_{i = 2}^n\frac{\partial F}{\partial a_{1i}}\frac{u_{11}u_{1i}}{W^2}+\sum\limits_{2\leq i,j\leq n}\frac{\partial F}{\partial a_{ij}}\frac{u_{1i}u_{1j}}{W}.$

Therefore

$\begin{align*} \mathcal{B} = &\frac{1}{u_1\log u_1}\Big(-g_{,1}+\sum\limits_{l = 0}^{k-2}(\alpha_l)_{,1}\frac{\sigma_{l}}{\sigma_{k-1}}(A)\Big)\\ &+\frac{u_{11}}{W^2\log u_1}\Big(-g+\sum\limits_{l = 0}^{k-2}(k-l)\alpha_l\frac{\sigma_l}{\sigma_{k-1}}(A)\Big)\\ &+\Big(\frac{2}{W^5\log u_1}-(1+\frac{2}{\log u_1})\frac{1}{u_1^2\log u_1W^3}\Big)\\ &\frac{\partial F}{\partial a_{11}}u_{11}^2+\frac{2}{W^3(W+1)\log u_1}\sum\limits_{k\geq 2} \frac{\partial F}{\partial a_{11}}u_{k1}^2\\ &+\Big(\frac{2}{W^4\log u_1}+\frac{2}{W^3(W+1)\log u_1}\\ &-(1+\frac{2}{\log u_1})\frac{2}{W^2u_1^2\log u_1}\Big)\sum\limits_{i = 2}^n\frac{\partial F}{\partial a_{1i}}u_{11}u_{1i}\\ &+\frac{2}{W^2(W+1)\log u_1}\sum\limits_{i = 2}^n\frac{\partial F}{\partial a_{1i}}u_{1i}u_{ii}\\ &+\Big(\frac{2}{W^2(W+1)\log u_1}-\frac{1+2/\log u_1}{Wu_1^2\log u_1}\Big)\times\\ &\sum\limits_{2\leq i,j\leq n}\frac{\partial F}{\partial a_{ij}}u_{1i}u_{1j}. \end{align*}$

Since $\{\frac{\partial F}{\partial a_{ij}}\}_{1\leq i, j\leq n}$ is positive definite, so is $\{\frac{\partial F}{\partial a_{ij}}\}_{2\leq i, j\leq n}$. Moreover, $W = \sqrt{1+u_1^2}\approx u_1$ when $u_1\gg 1$. Therefore

$\begin{align} \mathcal{B}\geq&\frac{1}{u_1\log u_1}\Big(-g_{,1}+\sum\limits_{l = 0}^{k-2}(\alpha_l)_{,1}\frac{\sigma_{l}}{\sigma_{k-1}}(A)\Big)\\&+\frac{u_{11}}{W^2\log u_1}\Big(-g+\sum\limits_{l = 0}^{k-2}(k-l)\alpha_l\frac{\sigma_l}{\sigma_{k-1}}(A)\Big)\\ &+\frac{1-\delta}{W^5\log u_1}\frac{\partial F}{\partial a_{11}}u_{11}^2+\frac{2}{W^2(W+1)\log u_1}\big(\sum\limits_{i = 2}^n\frac{\partial F}{\partial a_{1i}}u_{1i}u_{ii}\big) \end{align}$

(A.5)

where $\delta > 0$ is a small constant, depending only on $u_1$ . By (A.3), (A.4), (A.5), we have

$\begin{align} 0\geq& (-\frac{2}{W\rho}-\frac{Cu_1}{W^3})(\sum\limits_{i = 1}^n\frac{\partial F}{\partial a_{ii}})+(\frac{1}{M\tilde g}+\frac{u_{11}}{W^2\log u_1})(-g+\sum\limits_{l = 0}^{k-2}\alpha_l(k-l)\frac{\sigma_l}{\sigma_{k-1}})\\ &+\frac{1}{u_1\log u_1}(-g_{,1}+\sum\limits_{l = 0}^{k-2}(\alpha_l)_{,1}\frac{\sigma_l}{\sigma_{k-1}})+\frac{1-\delta}{W^5\log u_1}\frac{\partial F}{\partial a_{11}}u_{11}^2\\ &+\frac{2}{W^2(W+1)\log u_1}\big(\sum\limits_{i = 2}^n\frac{\partial F}{\partial a_{1i}}u_{1i}u_{ii}\big). \end{align}$

(A.6)
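
To see where the factor $1-\delta$ in (A.5) comes from, one can make the coefficient of $\frac{\partial F}{\partial a_{11}}u_{11}^2$ explicit. The following is a sketch that uses only $W^2 = 1+u_1^2$. It accounts for the $u_{11}^2$ coefficient displayed in the expression for $\mathcal{B}$, and it assumes that any cross terms absorbed into $\delta$ are of the same or lower order.

$\begin{align*} \frac{2}{W^5\log u_1}-\Big(1+\frac{2}{\log u_1}\Big)\frac{1}{u_1^2\log u_1W^3} & = \frac{1}{W^5\log u_1}\Big(2-\Big(1+\frac{2}{\log u_1}\Big)\frac{W^2}{u_1^2}\Big)\\ & = \frac{1}{W^5\log u_1}\Big(1-\frac{2}{\log u_1}-\Big(1+\frac{2}{\log u_1}\Big)\frac{1}{u_1^2}\Big). \end{align*}$

This contribution alone corresponds to $\delta_0 = \frac{2}{\log u_1}+\big(1+\frac{2}{\log u_1}\big)\frac{1}{u_1^2}$, which tends to $0$ as $u_1\to\infty$.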

Since we require that $g_u\leq 0$ and $(\alpha_l)_u\geq 0$ ,

$-g_{,1}+\sum\limits_{l = 0}^{k-2}(\alpha_l)_{,1}\frac{\sigma_l}{\sigma_{k-1}}\geq-\frac{\partial g}{\partial x_1}+\sum\limits_{l = 0}^{k-2}\frac{\partial \alpha_l}{\partial x_1}\frac{\sigma_l}{\sigma_{k-1}}.$
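
This inequality can be read off from the chain rule. A brief sketch, under the assumption (not restated in this appendix) that $g = g(x,u)$ and $\alpha_l = \alpha_l(x,u)$, and that the subscript ${,1}$ denotes the total derivative with respect to $x_1$:

$\begin{align*} -g_{,1} & = -\frac{\partial g}{\partial x_1}-g_u u_1\geq -\frac{\partial g}{\partial x_1},\\ (\alpha_l)_{,1}\frac{\sigma_l}{\sigma_{k-1}} & = \Big(\frac{\partial \alpha_l}{\partial x_1}+(\alpha_l)_u u_1\Big)\frac{\sigma_l}{\sigma_{k-1}}\geq \frac{\partial \alpha_l}{\partial x_1}\frac{\sigma_l}{\sigma_{k-1}}. \end{align*}$

Both steps use $u_1 > 0$ at $x_0$. The second also uses $\frac{\sigma_l}{\sigma_{k-1}}(A) > 0$ for $0\leq l\leq k-2$, which holds because $\lambda(A)\in\Gamma_k$ implies $\sigma_l(A) > 0$ for all $l\leq k$.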

We claim that

$\begin{equation} \sum\limits_{i = 2}^n\frac{\partial F}{\partial a_{1i}}u_{1i}u_{ii}\geq -C\frac{u_1^2\log^2 u_1}{W}\frac{|D\rho|^2}{\rho^2}\frac{\partial F}{\partial a_{11}}. \end{equation}$

(A.7)

We defer the proof of (A.7). By (A.1), the leading term in (A.6) is $\frac{1-\delta}{W^5\log u_1}\frac{\partial F}{\partial a_{11}}u_{11}^2\approx\frac{\log u_1}{W}\frac{\partial F}{\partial a_{11}} > 0$. The other terms have order at most $O(W^{-1})$; therefore

$\log u_1\leq C.$

The interior gradient estimate follows once (A.7) is verified. Let $\Upsilon = \{2\leq j\leq n \mid a_{jj}\geq 0\}$. Note that $a_{11} < 0$ and $\lambda(A)\in\Gamma_k$. Then

$\begin{align*} \sum\limits_{i = 2}^{n} \frac{\partial F}{\partial a_{1 i}} u_{1 i} u_{i i} = &-\sum\limits_{i = 2}^{n}\left[\frac{\sigma_{k-2}(A|1i) \sigma_{k-1}(A)-\sigma_{k-3}(A|1 i) \sigma_{k}(A)}{\sigma_{k-1}^{2}(A)}\right.\\ &\left.+\sum\limits_{l = 0}^{k-2} \alpha_{l}\frac{\sigma_{k-3}(A|1 i) \sigma_{l}(A)-\sigma_{l-2}(A|1 i) \sigma_{k-1}(A)}{\sigma_{k-1}^{2}(A)}\right] a_{i 1} u_{1 i} u_{i i} \\ \geq &-\sum\limits_{i \in \Upsilon}\left[\frac{\sigma_{k-2}(A|1 i) \sigma_{k-1}(A)-\sigma_{k-3}(A|1 i) \sigma_{k}(A)}{\sigma_{k-1}^{2}(A)}\right.\\ &\left.+\sum\limits_{l = 0}^{k-2} \alpha_{l}\frac{\sigma_{k-3}(A|1 i) \sigma_{l}(A)-\sigma_{l-2}(A|1 i) \sigma_{k-1}(A)}{\sigma_{k-1}^{2}(A)}\right] a_{i i} \frac{u_{1 i}^{2}}{W} \\ \geq &-\sum\limits_{i \in \Upsilon}\left[\frac{a_{i i} \sigma_{k-2}(A|1 i) \sigma_{k-1}(A)}{\sigma_{k-1}^{2}(A)}+\sum\limits_{l = 0}^{k-2} \alpha_{l}\frac{a_{i i} \sigma_{k-3}(A|1 i) \sigma_{l}(A)}{\sigma_{k-1}^{2}(A)}\right] \frac{u_{1 i}^{2}}{W} \\ \geq &-\sum\limits_{i \in \Upsilon}\left[\frac{C_{n,k}\sigma_{k-1}(A|1) \sigma_{k-1}(A)}{\sigma_{k-1}^{2}(A)}+\sum\limits_{l = 0}^{k-2} \alpha_{l}\frac{C_{n,k}\sigma_{k-2}(A|1) \sigma_{l}(A)}{\sigma_{k-1}^{2}(A)}\right] \frac{u_{1 i}^{2}}{W} \\ \geq &-\left[\frac{C_{n,k}\sigma_{k-1}(A|1) \sigma_{k-1}(A)}{\sigma_{k-1}^{2}(A)}+\sum\limits_{l = 0}^{k-2} \alpha_{l}\frac{C_{n,k}\sigma_{k-2}(A|1) \sigma_{l}(A)}{\sigma_{k-1}^{2}(A)}\right] \sum\limits_{i = 2}^{n} \frac{u_{1 i}^{2}}{W} \\ \geq &-\left[C_{n,k}\frac{\sigma_{k-1}(A|1) \sigma_{k-1}(A)-\sigma_{k-2}(A|1) \sigma_{k}(A)}{\sigma_{k-1}^{2}(A)}\right.\\ &+\left.\sum\limits_{l = 0}^{k-2}\alpha_lC_{n,k}\frac{\sigma_{k-2}(A|1)\sigma_l(A)-\sigma_{l-1}(A|1)\sigma_{k-1}(A)}{\sigma_{k-1}^2(A)}\right]\sum\limits_{i = 2}^n\frac{u_{1i}^2}{W}\\ \geq &-C(n,k)\sum\limits_{i = 2}^n\frac{u_{1i}^2}{W}\frac{\partial F}{\partial a_{11}}\\ \geq &-C(n,k)\frac{u_1^2\log^2u_1}{W}\frac{|D\rho|^2}{\rho^2}\frac{\partial F}{\partial a_{11}}. \end{align*}$

Thus (A.7) holds and the gradient estimate is proved.

References

[1]	T. Zhang, A. P. Marand, J. Jiang, PlantDHS: A database for DNase I hypersensitive sites in plants, Nucleic. Acids. Res., 44 (2016), D1148–D1153. https://doi.org/10.1093/nar/gkv962 doi: 10.1093/nar/gkv962
[2]	D. S. Gross, W. T. Garrard, Nuclease hypersensitive sites in chromatin, Annu. Rev. Biochem., 57 (1988), 159–197. https://doi.org/10.1146/annurev.bi.57.070188.001111 doi: 10.1146/annurev.bi.57.070188.001111
[3]	G. E. Crawford, I. E. Holt, J. C. Mullikin, D. Tai, E. D. Green, T. G. Wolfsberg, et al., Identifying gene regulatory elements by genome-wide recovery of DNase hypersensitive sites, Proc. Natl. Acad. Sci., 101 (2004), 992–997. https://doi.org/10.1073/pnas.0307540100 doi: 10.1073/pnas.0307540100
[4]	M. M. Carrasquillo, M. Allen, J. D. Burgess, X. Wang, S. L. Strickland, S. Aryal, et al., A candidate regulatory variant at the TREM gene cluster associates with decreased Alzheimer's disease risk and increased TREML1 and TREM2 brain gene expression, Alzheimer's Dementia, 13 (2017), 663–673. https://doi.org/10.1016/j.jalz.2016.10.005 doi: 10.1016/j.jalz.2016.10.005
[5]	W. Meuleman, A. Muratov, E. Rynes, J. Halow, K. Lee, D. Bates, et al., Index and biological spectrum of human DNase I hypersensitive sites, Nature, 584 (2020), 244–251. https://doi.org/10.1038/s41586-020-2559-3 doi: 10.1038/s41586-020-2559-3
[6]	M. T. Maurano, R. Humbert, E. Rynes, R. E. Thurman, E. Haugen, H. Wang, et al., Systematic localization of common disease-associated variation in regulatory DNA, Science, 337 (2012), 1190–1195. https://doi.org/10.1126/science.1222794 doi: 10.1126/science.1222794
[7]	J. Ernst, P. Kheradpour, T. S. Mikkelsen, N. Shoresh, L. D. Ward, C. B. Epstein, et al., Mapping and analysis of chromatin state dynamics in nine human cell types, Nature, 473 (2011), 43–49. https://doi.org/10.1038/nature09906 doi: 10.1038/nature09906
[8]	M. Mokry, M. Harakalova, F. W. Asselbergs, P. I. de Bakker, E. E. Nieuwenhuis, Extensive association of common disease variants with regulatory sequence, PLoS One, 11 (2016), e0165893. https://doi.org/10.1371/journal.pone.0165893 doi: 10.1371/journal.pone.0165893
[9]	D. Weghorn, F. Coulet, K. M. Olson, C. DeBoever, F. Drees, A. Arias, et al., Identifying DNase I hypersensitive sites as driver distal regulatory elements in breast cancer, Nat. Commun., 8 (2017), 1–16. https://doi.org/10.1038/s41467-017-00100-x doi: 10.1038/s41467-017-00100-x
[10]	W. Jin, Q. Tang, M. Wan, K. Cui, Y. Zhang, G. Ren, et al., Genome-wide detection of DNase I hypersensitive sites in single cells and FFPE tissue samples, Nature, 528 (2015), 142–146. https://doi.org/10.1038/nature15740 doi: 10.1038/nature15740
[11]	G. E. Crawford, S. Davis, P. C. Scacheri, G. Renaud, M. J. Halawi, M. R. Erdos, et al., DNase-chip: A high-resolution method to identify DNase I hypersensitive sites using tiled microarrays, Nat. Methods, 3 (2006), 503–509. https://doi.org/10.1038/nmeth888 doi: 10.1038/nmeth888
[12]	J. Cooper, Y. Ding, J. Song, K. Zhao, Genome-wide mapping of DNase I hypersensitive sites in rare cell populations using single-cell DNase sequencing, Nat. Protoc., 12 (2017), 2342–2354. https://doi.org/10.1038/nprot.2017.099 doi: 10.1038/nprot.2017.099
[13]	G. E. Crawford, I. E. Holt, J. Whittle, B. D. Webb, D. Tai, S. Davis, et al., Genome-wide mapping of DNase hypersensitive sites using massively parallel signature sequencing (MPSS), Genome Res., 16 (2006), 123–131. https://doi.org/10.1101/gr.4074106 doi: 10.1101/gr.4074106
[14]	L. Song, G. E. Crawford, DNase-seq: A high-resolution technique for mapping active gene regulatory elements across the genome from mammalian cells, Cold Spring Harbor Protoc., 2010 (2010), pdb.prot5384. https://doi.org/10.1101/pdb.prot5384 doi: 10.1101/pdb.prot5384
[15]	W. Zhang, J. Jiang, Genome-wide mapping of DNase I hypersensitive sites in plants, in Plant Functional Genomics, Humana Press, 1284 (2015), 71–89. https://doi.org/10.1007/978-1-4939-2444-8_4
[16]	Y. Wang, K. Wang, Genome-wide identification of DNase I hypersensitive sites in plants, Curr. Protoc., 1 (2021), e148. https://doi.org/10.1002/cpz1.148 doi: 10.1002/cpz1.148
[17]	S. Wang, Q. Zhang, Z. Shen, Y. He, Z. Chen, J. Li, et al., Predicting transcription factor binding sites using DNA shape features based on shared hybrid deep learning architecture, Mol. Ther. Nucleic Acids, 24 (2021), 154–163. https://doi.org/10.1016/j.omtn.2021.02.014 doi: 10.1016/j.omtn.2021.02.014
[18]	Q. Zhang, Y. He, S. Wang, Z. Chen, Z. Guo, Z. Cui, et al., Base-resolution prediction of transcription factor binding signals by a deep learning framework, PLoS Comp. Biol., 18 (2022), e1009941. https://doi.org/10.1371/journal.pcbi.1009941 doi: 10.1371/journal.pcbi.1009941
[19]	S. Wang, Y. He, Z. Chen, Q. Zhang, FCNGRU: Locating transcription factor binding sites by combing fully convolutional neural network with gated recurrent unit, IEEE J. Biomed. Health. Inf., 26 (2021), 1883–1890. https://doi.org/10.1109/JBHI.2021.3117616 doi: 10.1109/JBHI.2021.3117616
[20]	Q. Zhang, Z. Shen, D. S. Huang, Predicting in-vitro transcription factor binding sites using DNA sequence+ shape, IEEE/ACM Trans. Comput. Biol. Bioinf., 18 (2019), 667–676. https://doi.org/10.1109/TCBB.2019.2947461 doi: 10.1109/TCBB.2019.2947461
[21]	Q. Zhang, S. Wang, Z. Chen, Y. He, Q. Liu, D. S. Huang, Locating transcription factor binding sites by fully convolutional neural network, Briefings Bioinf., 22 (2021), bbaa435. https://doi.org/10.1093/bib/bbaa435 doi: 10.1093/bib/bbaa435
[22]	Y. Zhang, Z. Wang, Y. Zeng, Y. Liu, S. Xiong, M. Wang, et al., A novel convolution attention model for predicting transcription factor binding sites by combination of sequence and shape, Briefings Bioinf., 23 (2022), bbab525. https://doi.org/10.1093/bib/bbab525 doi: 10.1093/bib/bbab525
[23]	Y. Zhang, Z. Wang, Y. Zeng, J. Zhou, Q. Zou, High-resolution transcription factor binding sites prediction improved performance and interpretability by deep learning method, Briefings Bioinf., 22 (2021), bbab273. https://doi.org/10.1093/bib/bbab273 doi: 10.1093/bib/bbab273
[24]	Y. He, Z. Shen, Q. Zhang, S. Wang, D. S. Huang, A survey on deep learning in DNA/RNA motif mining, Briefings Bioinf., 22 (2021), bbaa229. https://doi.org/10.1093/bib/bbaa229 doi: 10.1093/bib/bbaa229
[25]	W. S. Noble, S. Kuehn, R. Thurman, M. Yu, J. Stamatoyannopoulos, Predicting the in vivo signature of human gene regulatory sequences, Bioinformatics, 21 (2005), i338–i343. https://doi.org/10.1093/bioinformatics/bti1047 doi: 10.1093/bioinformatics/bti1047
[26]	B. Manavalan, T. H. Shin, G. Lee, DHSpred: Support-vector-machine-based human DNase I hypersensitive sites prediction using the optimal features selected by random forest, Oncotarget, 9 (2018), 1944. https://doi.org/10.18632/oncotarget.23099 doi: 10.18632/oncotarget.23099
[27]	S. Zhang, W. Zhuang, Z. Xu, Prediction of DNase I hypersensitive sites in plant genome using multiple modes of pseudo components, Anal. Biochem., 549 (2018), 149–156. https://doi.org/10.1016/j.ab.2018.03.025 doi: 10.1016/j.ab.2018.03.025
[28]	Y. Liang, S. Zhang, IDHS-DMCAC: Identifying DNase I hypersensitive sites with balanced dinucleotide-based detrending moving-average cross-correlation coefficient, SAR QSAR Environ. Res., 30 (2019), 429–445. https://doi.org/10.1080/1062936X.2019.1615546 doi: 10.1080/1062936X.2019.1615546
[29]	S. Zhang, Z. Duan, W. Yang, C. Qian, Y. You, IDHS-DASTS: Identifying DNase I hypersensitive sites based on LASSO and stacking learning, Mol. Omics, 17 (2021), 130–141. https://doi.org/10.1039/D0MO00115E doi: 10.1039/D0MO00115E
[30]	B. Liu, R. Long, K. C. Chou, IDHS-EL: Identifying DNase I hypersensitive sites by fusing three different modes of pseudo nucleotide composition into an ensemble learning framework, Bioinformatics, 32 (2016), 2411–2418. https://doi.org/10.1093/bioinformatics/btw186 doi: 10.1093/bioinformatics/btw186
[31]	S. Zhang, J. Lin, L. Su, Z. Zhou, PDHS-DSET: Prediction of DNase I hypersensitive sites in plant genome using DS evidence theory, Anal. Biochem., 564 (2019), 54–63. https://doi.org/10.1016/j.ab.2018.10.018 doi: 10.1016/j.ab.2018.10.018
[32]	Y. Zheng, H. Wang, Y. Ding, F. Guo, CEPZ: A novel predictor for identification of DNase I hypersensitive sites, IEEE/ACM Trans. Comput. Biol. Bioinf., 18 (2021), 2768–2774. https://doi.org/10.1109/TCBB.2021.3053661 doi: 10.1109/TCBB.2021.3053661
[33]	S. Zhang, Q. Yu, H. He, F. Zhu, P. Wu, L. Gu, et al., IDHS-DSAMS: Identifying DNase I hypersensitive sites based on the dinucleotide property matrix and ensemble bagged tree, Genomics, 112 (2020), 1282–1289. https://doi.org/10.1016/j.ygeno.2019.07.017 doi: 10.1016/j.ygeno.2019.07.017
[34]	S. Zhang, T. Xue, Use Chou's 5-steps rule to identify DNase I hypersensitive sites via dinucleotide property matrix and extreme gradient boosting, Mol. Genet. Genomics, 295 (2020), 1431–1442. https://doi.org/10.1007/s00438-020-01711-8 doi: 10.1007/s00438-020-01711-8
[35]	Z. C. Xu, S. Y. Jiang, W. R. Qiu, Y. C. Liu, X. Xiao, IDHSs-PseTNC: Identifying DNase I hypersensitive sites with pseuo trinucleotide component by deep sparse auto-encoder, Lett. Org. Chem., 14 (2017), 655–664. https://doi.org/10.2174/1570178614666170213102455 doi: 10.2174/1570178614666170213102455
[36]	C. Lyu, L. Wang, J. Zhang, Deep learning for DNase I hypersensitive sites identification, BMC genomics, 19 (2018), 155–165. https://doi.org/10.1186/s12864-018-5283-8 doi: 10.1186/s12864-018-5283-8
[37]	P. Feng, N. Jiang, N. Liu, Prediction of DNase I hypersensitive sites by using pseudo nucleotide compositions, Sci. World J., 2014 (2014), 740506. https://doi.org/10.1155/2014/740506 doi: 10.1155/2014/740506
[38]	W. Chen, T. Y. Lei, D. C. Jin, H. Lin, K. C. Chou, PseKNC: A flexible web server for generating pseudo K-tuple nucleotide composition, Anal. Biochem., 456 (2014), 53–60. https://doi.org/10.1016/j.ab.2014.04.001 doi: 10.1016/j.ab.2014.04.001
[39]	W. Chen, H. Lin, K. C. Chou, Pseudo nucleotide composition or PseKNC: An effective formulation for analyzing genomic sequences, Mol. Biosyst., 11 (2015), 2620–2634. https://doi.org/10.1039/C5MB00155B doi: 10.1039/C5MB00155B
[40]	B. Liu, F. Liu, X. Wang, J. Chen, L. Fang, K. C. Chou, Pse-in-One: A web server for generating various modes of pseudo components of DNA, RNA, and protein sequences, Nucleic Acids Res., 43 (2015), W65–W71. https://doi.org/10.1093/nar/gkv458 doi: 10.1093/nar/gkv458
[41]	S. Zhang, Z. Zhou, X. Chen, Y. Hu, L. Yang, PDHS-SVM: A prediction method for plant DNase I hypersensitive sites based on support vector machine, J. Theor. Biol., 426 (2017), 126–133. https://doi.org/10.1016/j.jtbi.2017.05.030 doi: 10.1016/j.jtbi.2017.05.030
[42]	K. He, X. Zhang, S. Ren, J. Sun, Spatial pyramid pooling in deep convolutional networks for visual recognition, IEEE Trans. Pattern Anal. Mach. Intell., 37 (2015), 1904–1916. https://doi.org/10.1109/TPAMI.2015.2389824 doi: 10.1109/TPAMI.2015.2389824
[43]	F. Y. Dao, H. Lv, W. Su, Z. J. Sun, Q. L. Huang, H. Lin, IDHS-deep: an integrated tool for predicting DNase I hypersensitive sites by deep neural network, Briefings Bioinf., 22 (2021), bbab047. https://doi.org/10.1093/bib/bbab047 doi: 10.1093/bib/bbab047
[44]	C. E. Breeze, J. Lazar, T. Mercer, J. Halow, I. Washington, K. Lee, et al., Atlas and developmental dynamics of mouse DNase I hypersensitive sites, bioRxiv, 2020 (2020). https://doi.org/10.1101/2020.06.26.172718 doi: 10.1101/2020.06.26.172718
[45]	W. Li, A. Godzik, Cd-hit: A fast program for clustering and comparing large sets of protein or nucleotide sequences, Bioinformatics, 22 (2006), 1658–1659. https://doi.org/10.1093/bioinformatics/btl158 doi: 10.1093/bioinformatics/btl158
[46]	L. Fu, B. Niu, Z. Zhu, S. Wu, W. Li, CD-HIT: Accelerated for clustering the next-generation sequencing data, Bioinformatics, 28 (2012), 3150–3152. https://doi.org/10.1093/bioinformatics/bts565 doi: 10.1093/bioinformatics/bts565
[47]	X. Tang, P. Zheng, X. Li, H. Wu, D. Q. Wei, Y. Liu, et al., Deep6mAPred: A CNN and Bi-LSTM-based deep learning method for predicting DNA N6-methyladenosine sites across plant species, Methods, 204 (2022), 142–150. https://doi.org/10.1016/j.ymeth.2022.04.011 doi: 10.1016/j.ymeth.2022.04.011
[48]	T. Mikolov, K. Chen, G. Corrado, J. Dean, Efficient estimation of word representations in vector space, preprint, arXiv: 1301.3781.
[49]	T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, J. Dean, Distributed representations of words and phrases and their compositionality, in Advances in neural information processing systems, 26 (2013), 3111–3119.
[50]	K. Fukushima, S. Miyake, Neocognitron: A new algorithm for pattern recognition tolerant of deformations and shifts in position, Pattern Recognt., 15 (1982), 455–469. https://doi.org/10.1016/0031-3203(82)90024-3 doi: 10.1016/0031-3203(82)90024-3
[51]	D. H. Hubel, T. N. Wiesel, Receptive fields, binocular interaction and functional architecture in the cat's visual cortex, J. Physiol., 160 (1962), 106. https://doi.org/10.1113/jphysiol.1962.sp006837 doi: 10.1113/jphysiol.1962.sp006837
[52]	Y. LeCun, B. Boser, J. Denker, D. Henderson, R. Howard, W. Hubbard, et al., Handwritten digit recognition with a back-propagation network, in Advances in neural information processing systems, Morgan Kaufmann, 2 (1989), 396–404.
[53]	S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Comput., 9 (1997), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735 doi: 10.1162/neco.1997.9.8.1735
[54]	A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, et al., Attention is all you need, in Advances in neural information processing systems, 30 (2017), 6000–6010.
[55]	C. Raffel, D. P. Ellis, Feed-forward networks with attention can solve some long-term memory problems, preprint, arXiv: 1512.08756.

© 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)