
Deep multi-input and multi-output operator networks method for optimal control of PDEs

  • Deep operator networks are a popular machine learning approach. Some problems require multiple inputs and outputs. In this work, a multi-input and multi-output operator neural network (MIMOONet) for solving optimal control problems was proposed. To improve the accuracy of the numerical solution, a physics-informed MIMOONet was also proposed. To test the performance of the MIMOONet and the physics-informed MIMOONet, three examples, including elliptic (linear and semi-linear) and parabolic problems, were presented. The numerical results show that both methods are effective in solving these types of problems, and the physics-informed MIMOONet achieves higher accuracy due to its incorporation of physical laws.

    Citation: Jinjun Yong, Xianbing Luo, Shuyu Sun. Deep multi-input and multi-output operator networks method for optimal control of PDEs[J]. Electronic Research Archive, 2024, 32(7): 4291-4320. doi: 10.3934/era.2024193




    The optimal control problem has been successfully applied in various fields, such as heat transfer phenomena [1], finance [2], image processing [3], shape optimization [4,5], aerodynamics [6,7], crystal growth [8], and drug delivery [9]. To solve partial differential equation constrained (PDE-constrained) optimal control problems, numerous numerical methods have been developed, including finite element methods, finite difference methods, finite volume methods, spectral methods, and mesh-less methods (see, e.g., [10,11,12,13]). Despite their effectiveness, optimal control problems remain challenging to solve, particularly when the problem is nonlinear. Recently, deep learning has emerged as a popular method for solving partial differential equations, especially nonlinear ones.

    Deep learning methods for solving PDEs have received significant attention, including physics-informed neural networks (PINNs) [14,15], the deep Galerkin method [16], the deep Ritz method [17], the deep Nitsche method [18], and deep operator networks (DeepONets) [19,20]. PINNs can solve a specific PDE with given boundary conditions and loading terms, but each new problem instance requires an expensive optimization at inference time. Consequently, PINNs are ill-suited to PDEs with varying operating conditions or real-time inference requirements; neural operators fill this gap well. Various versions of operator networks have been published, e.g., graph neural operator networks [21], the Fourier neural operator (FNO) [22], physics-informed neural operators (PINO) [23], and the deep multiple-input operator network (DeepMIONet) [24].

    Neural networks have several advantages over traditional numerical solvers: they are mesh-free and handle complex geometric regions more easily. Simulating control problems on such complex geometric regions with traditional numerical methods often requires high-quality grids and extensive preprocessing. This has motivated researchers to use neural networks in place of traditional numerical methods. To solve optimal control problems with PINNs, methodologies and guidelines have been proposed in previous works [25,26]. Barry-Straume et al. used a two-stage framework to solve PDE-constrained optimization problems [27]. Wang et al. used physics-informed deep operator networks (DeepONets) to learn the solution operator of parametric PDEs, building a surrogate for solving PDE-constrained optimization problems [28]. In summary, deep learning approaches, such as PINNs and DeepONets, have shown promise for solving PDE-constrained optimal control problems, providing efficient solutions without extensive preprocessing or expensive optimization during inference. Future work could explore further improvements to deep learning methods and investigate their application in new fields.

    In this paper, to solve PDE-constrained optimal control problems with available data, we introduce a MIMOONet. In the context of the PDE-constrained optimal control problem, the governing PDE is fully known, and the objective is to determine a control variable that minimizes the cost function. Initially, the PDE-constrained optimal control problem is reformulated into a PDE system using the adjoint method. Subsequently, the PDE system is tackled by using MIMOONets. Additionally, we examine a physical system described by PDEs and propose physics-informed MIMOONets for addressing the PDE-constrained optimal control problem. Overall, this method (MIMOONet) has the following advantages:

    ● MIMOONet can solve optimal control problems constrained by different types of PDEs.

    ● MIMOONet can easily approximate nonlinear optimal control problems.

    ● Compared with traditional numerical methods, its prediction speed is faster.

    The remainder of this paper is organized as follows. In Section 2, we introduce the adjoint state method for transforming a PDE-constrained optimization problem into an optimality system, and provide the framework of MIMOONets and physics-informed MIMOONets, the main technical contribution of this work. In Section 3, we give the deep learning framework for elliptic and parabolic constrained optimal control problems, and present numerical results to assess the performance of the proposed MIMOONets and physics-informed MIMOONets. Finally, Section 4 summarizes the results, potential pitfalls, and shortcomings, and details the groundwork for future directions.

    Let $U$, $S$, and $V$ be Banach spaces. We consider the following PDE-constrained optimization problem:

    $$\begin{cases} \{u^*, v^*\} = \mathop{\arg\min}\limits_{u \in S,\, v \in U} J(u,v),\\ \text{subject to } F(u,v) = 0, \end{cases} \tag{2.1}$$

    where $J: S \times U \to \mathbb{R}$ is a cost function, $F: S \times U \to V$ is a system of PDEs subject to initial and boundary conditions, and $u$ and $v$ are the state variable and control variable, respectively. Assume that problem (2.1) has unique solutions $u^*$ and $v^*$. In the subsequent sections, the MIMOONets method is presented under this assumption.

    In this subsection, we transform the PDE-constrained optimization problem into an optimality system.

    The PDE-constrained optimization problem, which involves optimizing an objective function subject to a set of partial differential equations (PDEs), can be transformed into an optimality system. The resulting optimality system consists of two sets of equations: the state equations and the adjoint equations. These equations are coupled and must be solved simultaneously to obtain the optimal solution to the original PDE-constrained optimization problem.

    Consider the following problem [29],

    $$\min_{u \in S,\, v \in U} J(u,v), \tag{2.2}$$
    $$\text{subject to}\quad \begin{cases} F[u(x,t); v(x,t)] = 0, & x \in \Omega,\ t \in [0,T],\\ B[u(x,t)] = 0, & x \in \partial\Omega,\ t \in [0,T],\\ I[u(x,0)] = 0, & x \in \Omega, \end{cases} \tag{2.3}$$

    where $x$ and $t$ denote the space and time variables, respectively, the domain $\Omega \subset \mathbb{R}^d$, $\partial\Omega$ is the boundary of the domain $\Omega$, and $B$ and $I$ are the boundary conditions and initial condition, respectively. We construct the Lagrangian function for problems (2.2) and (2.3) as follows:

    $$\mathcal{L}(u,v,p_1,p_2,p_3) = J(u,v) - \int_0^T\!\!\int_\Omega p_1 F(u,v)\,dx\,dt - \int_0^T\!\!\int_{\partial\Omega} p_2 B(u)\,ds\,dt - \int_\Omega p_3 I(u)\,dx. \tag{2.4}$$

    Here, $p_1$, $p_2$, and $p_3$ are Lagrange multiplier functions defined on $\Omega \times [0,T]$, $\partial\Omega \times [0,T]$, and $\Omega \times \{0\}$, respectively. According to the Lagrange principle, we seek the pair $(u,v)$ and the Lagrange multipliers (or adjoint field) $p = (p_1, p_2, p_3)$ satisfying the optimality conditions. Therefore, problems (2.2) and (2.3) are equivalent to the following unconstrained problem

    $$(u^*, v^*, p^*) = \mathop{\arg\min}\limits_{u \in S,\, v \in U,\, p} \mathcal{L}(u,v,p). \tag{2.5}$$

    Then, the directional derivative of $\mathcal{L}$ with respect to $u$ vanishes at the optimal point, that is,

    $$D_u \mathcal{L}(u,v,p)\,\delta u = \lim_{\varepsilon \to 0} \frac{\mathcal{L}(u + \varepsilon \delta u, v, p) - \mathcal{L}(u,v,p)}{\varepsilon} = 0, \quad \forall\, \delta u \in S. \tag{2.6}$$

    For the control variable $v$ and the Lagrange multipliers $p$, we have

    $$D_v \mathcal{L}(u,v,p)\,\delta v = 0, \quad \forall\, \delta v \in U, \tag{2.7}$$

    and

    $$D_p \mathcal{L}(u,v,p)\,\delta p = 0, \quad \forall\, \delta p. \tag{2.8}$$

    Therefore, the problems (2.2) and (2.3) can be written in the following way:

    $$\begin{cases} D_u \mathcal{L}(u,v,p)\,\delta u = 0,\\ D_v \mathcal{L}(u,v,p)\,\delta v = 0,\\ D_p \mathcal{L}(u,v,p)\,\delta p = 0. \end{cases} \tag{2.9}$$

    Once the optimality system has been derived, it can be solved by using neural networks. Next, we introduce the MIMOONet and the physics-informed MIMOONet methods for solving the optimality system.

    DeepONet is a learning framework proposed by Lu et al. [19], which enables the learning of abstract nonlinear operators in infinite-dimensional function spaces. It is inspired by the universal approximation theorem for operators [30]. The DeepONet network comprises two main components: the trunk network and the branch network.

    The trunk network provides the basis functions for the output function by encoding information related to the space-time coordinates. It takes as input the space-time coordinates and any other relevant physical parameters and produces a set of basis functions for the function space. These basis functions serve as a representation of the output function and are used in the computation of the final output.

    The branch network encodes the input function to provide the coefficients at fixed sensor points. Given an input function, which may be a solution to a PDE or a data-driven function, the branch network maps it to a set of coefficients that correspond to specific sensor points in the domain. These coefficients are then combined with the basis functions from the trunk network to produce the final output function.

    Theorem 2.1. Suppose that $X$ is a Banach space, and $K_1 \subset X$, $K_2 \subset \mathbb{R}^d$ are two compact sets in $X$ and $\mathbb{R}^d$, respectively. Let $V$ be a compact set in $C(K_1)$, $G: V \to C(K_2)$ a nonlinear continuous operator, and $\sigma$ a continuous non-polynomial function. Then, for any $\varepsilon > 0$, there exist $n, p, m \in \mathbb{N}$, constants $c_k^i, a_k^{ij}, \theta_k^i, \zeta_k \in \mathbb{R}$, $w_k \in \mathbb{R}^d$, $x_j \in K_1$, $i = 1, 2, \dots, n$, $k = 1, \dots, p$, $j = 1, \dots, m$, such that

    $$\left| G(u)(y) - \sum_{k=1}^{p} \sum_{i=1}^{n} c_k^i \underbrace{\sigma\Big( \sum_{j=1}^{m} a_k^{ij} u(x_j) + \theta_k^i \Big)}_{\text{branch}} \underbrace{\sigma(w_k \cdot y + \zeta_k)}_{\text{trunk}} \right| < \varepsilon, \tag{2.10}$$

    for any $u \in V$ and $y \in K_2$.
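    The structure in (2.10) maps directly to code: a branch network turns the $m$ sensor values $u(x_1), \dots, u(x_m)$ into $p$ coefficients, a trunk network turns the query point $y$ into $p$ basis values, and the output is their inner product. Below is a minimal PyTorch sketch of such an unstacked DeepONet; the widths, depths, and names (`m`, `p`, `width`) are illustrative assumptions rather than the exact configuration used in this paper.

```python
import torch
import torch.nn as nn

class DeepONet(nn.Module):
    """Minimal unstacked DeepONet: G(u)(y) ~ <branch(u), trunk(y)>."""
    def __init__(self, m=100, p=150, width=300):
        super().__init__()
        # Branch net: sensor values u(x_1), ..., u(x_m) -> p coefficients.
        self.branch = nn.Sequential(
            nn.Linear(m, width), nn.Tanh(),
            nn.Linear(width, width), nn.Tanh(),
            nn.Linear(width, p))
        # Trunk net: query coordinate y (here 2-D) -> p basis values.
        self.trunk = nn.Sequential(
            nn.Linear(2, width), nn.Tanh(),
            nn.Linear(width, width), nn.Tanh(),
            nn.Linear(width, p))

    def forward(self, u_sensors, y):
        # u_sensors: (batch, m); y: (batch, 2); one query point per row.
        b = self.branch(u_sensors)      # (batch, p) coefficients
        t = self.trunk(y)               # (batch, p) basis values
        return (b * t).sum(dim=-1)      # (batch,) values of G(u)(y)
```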

    DeepONets are designed to learn abstract nonlinear operators in infinite-dimensional function spaces from a single input function defined on a Banach space, while real-world problems often involve multiple input functions. To address this issue, the DeepMIONet was proposed in [24], which is defined through the tensor product of Banach spaces.

    In DeepMIONets, only the case of two input functions $u$ and $v$ is considered; each is represented by its own branch network. The trunk network provides basis functions that span the function space, and the outputs of the branch networks are combined with the basis functions via tensor products to produce the final output function. Specifically, the output of DeepMIONet $G_\theta(u,v)$ can be written as:

    $$G_\theta(u,v) = \sum_{i=1}^{p} b_i^{u}\, b_i^{v}\, tr_i, \tag{2.11}$$

    where $b_i^{u}$ and $b_i^{v}$ denote the $i$-th outputs of the branch networks corresponding to the input functions $u$ and $v$, respectively, and $tr_i$ is the $i$-th output of the trunk network.

    The architecture of DeepMIONets includes two separate branch networks and a shared trunk network (as shown in Figure 1). The branch networks encode the input functions and provide their coefficients at fixed sensor points, while the trunk network provides a set of basis functions spanning the function space. The outputs of the branch networks are combined with these basis functions via the tensor product (2.11) to produce the final output function.

    Figure 1.  Architecture of MIONet for $G_\theta(u,v)(x,t)$: branch network 1 takes $u$ as input function [a fully connected neural network (FNN) takes as input its values at $m$ sensors], branch network 2 takes $v$ as input function (an FNN takes as input its values at $n$ sensors), and they compute the coefficients of the solution at the coordinates, which are the inputs of the trunk network (an FNN).

    Here, we focus on using neural operator networks to solve systems of PDEs.

    Although DeepONets and DeepMIONets can be used to solve a single PDE, they cannot handle a system of PDEs, for which at least two solution operators must be learned and the network therefore needs two (or more) outputs. To address this, we propose MIMOONets, which can also be used to solve systems of PDEs. The MIMOONets framework is composed of a trunk network and multiple branch networks. The trunk network provides the basis functions of the solution operators, while the branch networks provide additional groups of coefficients of the solution operators at fixed sensor points.

    We consider the following system of PDEs in a domain $D \subset \mathbb{R}^d$ ($d$ is the space dimension):

    $$\begin{cases} L_i[u_1(x), u_2(x), \dots, u_n(x)] = f_i(x), & x \in D,\ i = 1, 2, \dots, n,\\ B_i[u_i(x)] = \varphi_i(x), & x \in \partial D,\ i = 1, 2, \dots, n, \end{cases}$$

    where $u_i$ and $f_i$ are functions, $L_i$ is a differential operator, and $\varphi_i$ is the boundary condition of $u_i$. Let $G_i$ be the solution operator with input functions $f_i$ and $\varphi_i$ such that

    $$L_i[G_1, G_2, \dots, G_n] = f_i, \quad i = 1, 2, \dots, n,$$

    and

    $$B_i G_i = \varphi_i, \quad \text{on } \partial D,\ i = 1, 2, \dots, n.$$

    This means that $G_i(f_1, f_2, \dots, f_n, \varphi_1, \varphi_2, \dots, \varphi_n)(y)$ is the corresponding output function. According to the results in [24] and [30], the solution $u_i$ can be expressed as

    $$u_i = G_i(f_1, f_2, \dots, f_n, \varphi_1, \varphi_2, \dots, \varphi_n)(y). \tag{2.12}$$

    Then, the solution operators in (2.12) can be learned by using MIMOONets, which are defined through the tensor product of Banach spaces.

    Here, we only give the framework of two-input and two-output operator networks (see Figure 2). The solution operator networks can be described by

    $$\begin{cases} G_\theta^1(f,g) = \sum\limits_{k=1}^{p} b_k^{1f}\, b_k^{1g}\, tr_k,\\[2pt] G_\theta^2(f,g) = \sum\limits_{k=1}^{p} b_k^{2f}\, b_k^{2g}\, tr_k, \end{cases} \tag{2.13}$$
    Figure 2.  Architecture of MIMOONets for $G_\theta(f,g)(x)$: branch network 1 takes $f$ as input function (an FNN takes as input its values at $m$ sensors), branch network 2 takes $g$ as input function (an FNN takes as input its values at $n$ sensors), and they compute two groups of coefficients of the solutions at the coordinates, which are the inputs of the trunk network (an FNN).

    where the definitions of $b_k^{1f}$, $b_k^{2f}$, $b_k^{1g}$, $b_k^{2g}$, and $tr_k$ are similar to those in (2.11). To reduce the generalization error, we may add biases $b_0^1, b_0^2 \in \mathbb{R}$ in the last stage:

    $$\begin{cases} G_\theta^1(f,g) = \sum\limits_{k=1}^{p} b_k^{1f}\, b_k^{1g}\, tr_k + b_0^1,\\[2pt] G_\theta^2(f,g) = \sum\limits_{k=1}^{p} b_k^{2f}\, b_k^{2g}\, tr_k + b_0^2. \end{cases} \tag{2.14}$$
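    A minimal sketch of the two-input two-output network in (2.14): each branch produces two groups of $p$ coefficients (one group per output operator), the shared trunk supplies $p$ basis values, and each output is the three-way product summed over $k$ plus a trainable bias. All sizes and names here are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

def mlp(d_in, d_out, width=300, depth=4):
    layers, d = [], d_in
    for _ in range(depth):
        layers += [nn.Linear(d, width), nn.Tanh()]
        d = width
    layers.append(nn.Linear(d, d_out))
    return nn.Sequential(*layers)

class MIMOONet(nn.Module):
    """Two-input, two-output operator network in the form of Eq. (2.14)."""
    def __init__(self, m=100, n=100, p=150):
        super().__init__()
        self.p = p
        # Each branch emits 2*p coefficients: p for G^1 and p for G^2.
        self.branch_f = mlp(m, 2 * p)
        self.branch_g = mlp(n, 2 * p)
        self.trunk = mlp(2, p)                    # coordinates (x, y) or (x, t)
        self.bias = nn.Parameter(torch.zeros(2))  # b_0^1, b_0^2

    def forward(self, f_sensors, g_sensors, coords):
        bf = self.branch_f(f_sensors).view(-1, 2, self.p)
        bg = self.branch_g(g_sensors).view(-1, 2, self.p)
        tr = self.trunk(coords).unsqueeze(1)      # (batch, 1, p)
        out = (bf * bg * tr).sum(-1) + self.bias  # (batch, 2)
        return out[:, 0], out[:, 1]               # G^1_theta, G^2_theta
```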

    Let $G_i: C(D) \to L^2(D)$ be a Borel measurable mapping with $G_i \in L^2(\mu)$. Then, for any $\varepsilon > 0$, there exists an operator network $G_\theta^i: C(D) \to L^2(D)$ such that

    $$\| G_i - G_\theta^i \|_{L^2(\mu)} = \left( \int_{C(D)} \| G_i - G_\theta^i \|_{L^2(D)}^2 \, d\mu(f,g) \right)^{1/2} < \varepsilon,$$

    where $\mu$ is a probability measure on $C(D)$.

    When a dataset of paired solutions $\{u(x_k), v(x_k)\}_{k=1}^{N}$ is available, the solution can be learned by MIMOONets. The corresponding loss function can be formulated as follows:

    $$\mathcal{L}(\theta) = \frac{1}{N} \sum_{k=1}^{N} \left( \left| u(x_k) - G_\theta^1(f,g)(x_k) \right|^2 + \left| v(x_k) - G_\theta^2(f,g)(x_k) \right|^2 \right). \tag{2.15}$$

    We can now learn the parameters $\theta$ by minimizing the loss function (2.15) with a stochastic gradient descent method.
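    Minimizing (2.15) is then a standard regression loop. A sketch is given below, reusing the `MIMOONet` class from the previous snippet; the stand-in tensors and the optimizer settings are assumptions for illustration (the experiments in Section 3 use Adam with comparable settings).

```python
import torch

model = MIMOONet()                      # from the previous sketch
# Illustrative stand-ins: sensor values, coordinates, and paired solution data.
f_s = torch.rand(1024, 100); g_s = torch.rand(1024, 100)
xs = torch.rand(1024, 2)
u_data = torch.rand(1024); v_data = torch.rand(1024)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(1000):
    G1, G2 = model(f_s, g_s, xs)
    loss = ((u_data - G1) ** 2).mean() + ((v_data - G2) ** 2).mean()  # Eq. (2.15)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```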

    A large amount of paired data is required to solve PDE systems with MIMOONets. However, data acquisition is expensive in many engineering applications and physical systems. Under sparse-data conditions, it becomes important to train MIMOONets in a physics-informed fashion by integrating the known differential equations with the labeled data in the loss function. We apply automatic differentiation to the outputs of MIMOONets with respect to their input coordinates and adopt an appropriate regularization mechanism to ensure that the target output functions satisfy the PDE constraints.

    For simplicity, we consider the following problem (without risk of confusion, we reuse the preceding symbols):

    $$\begin{cases} L_1[u(x); v(x)] = f(x), & x \in D,\\ L_2[u(x); v(x)] = g(x), & x \in D,\\ B_1[u(x)] = \varphi(x), & x \in \partial D,\\ B_2[v(x)] = \psi(x), & x \in \partial D. \end{cases} \tag{2.16}$$

    The solutions $u$ and $v$ can be expressed as

    $$\begin{cases} u = G_1(f, g, \varphi, \psi)(y),\\ v = G_2(f, g, \varphi, \psi)(y). \end{cases} \tag{2.17}$$

    For the problem (2.16), the loss function of the physics-informed MIMOONets is defined as follows

    $$\mathcal{L}(\theta) = \mathcal{L}_{data}(\theta) + \mathcal{L}_{physics}(\theta), \tag{2.18}$$

    where

    $$\begin{cases} \mathcal{L}_{data}(\theta) = \frac{1}{N} \sum\limits_{k=1}^{N} \left( \left| u(x_k) - G_\theta^1(f,g,\varphi,\psi)(x_k) \right|^2 + \left| v(x_k) - G_\theta^2(f,g,\varphi,\psi)(x_k) \right|^2 \right),\\ \mathcal{L}_{physics}(\theta) = \mathcal{L}_{pde1}(\theta) + \mathcal{L}_{pde2}(\theta) + \mathcal{L}_{BC1}(\theta) + \mathcal{L}_{BC2}(\theta), \end{cases} \tag{2.19}$$

    and

    $$\begin{cases} \mathcal{L}_{pde1}(\theta) = \frac{1}{N_f} \sum\limits_{k=1}^{N_f} \left| L_1[G_\theta^1(f,g,\varphi,\psi)(x_k); G_\theta^2(f,g,\varphi,\psi)(x_k)] - f(x_k) \right|^2,\\ \mathcal{L}_{pde2}(\theta) = \frac{1}{N_g} \sum\limits_{k=1}^{N_g} \left| L_2[G_\theta^1(f,g,\varphi,\psi)(x_k); G_\theta^2(f,g,\varphi,\psi)(x_k)] - g(x_k) \right|^2,\\ \mathcal{L}_{BC1}(\theta) = \frac{1}{N_\varphi} \sum\limits_{k=1}^{N_\varphi} \left| B_1[G_\theta^1(f,g,\varphi,\psi)(x_k)] - \varphi(x_k) \right|^2,\\ \mathcal{L}_{BC2}(\theta) = \frac{1}{N_\psi} \sum\limits_{k=1}^{N_\psi} \left| B_2[G_\theta^2(f,g,\varphi,\psi)(x_k)] - \psi(x_k) \right|^2. \end{cases} \tag{2.20}$$

    Here, $N$ is the number of initial data points, $N_f$ and $N_g$ are the numbers of points sampled from the computational domain $D$ for the PDEs, and $N_\varphi$ and $N_\psi$ are the numbers of boundary points for $u$ and $v$, respectively.

    The loss function (2.18) is minimized by learning the parameters $\theta$ of the deep neural network. To improve the accuracy of the numerical solution or to increase the convergence rate, penalty (weighting) parameters can be applied to the individual loss terms.
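    In code, the residuals in (2.20) are obtained by automatic differentiation of the network outputs with respect to the input coordinates. The sketch below shows this for a Laplacian-type operator; the concrete residuals `res1` and `res2` mirror an elliptic optimality system of the kind solved in Section 3, and all tensor names are illustrative assumptions.

```python
import torch

def laplacian(out, coords):
    """Sum of second derivatives of the scalar output w.r.t. each coordinate."""
    grads = torch.autograd.grad(out.sum(), coords, create_graph=True)[0]
    lap = torch.zeros_like(out)
    for i in range(coords.shape[1]):
        g2 = torch.autograd.grad(grads[:, i].sum(), coords, create_graph=True)[0]
        lap = lap + g2[:, i]
    return lap

coords = torch.rand(1000, 2, requires_grad=True)   # interior collocation points
f_s = torch.rand(1000, 100); g_s = torch.rand(1000, 100)   # sensor values
f_vals = torch.rand(1000); ud_vals = torch.rand(1000)      # right-hand-side data
G1, G2 = model(f_s, g_s, coords)                   # state and adjoint outputs

# Example residuals for a Laplacian-type optimality system; the operators
# L1, L2 and the data are problem-specific.
res1 = -laplacian(G1, coords) + G2 - f_vals
res2 = laplacian(G2, coords) + G1 - ud_vals
loss_physics = (res1 ** 2).mean() + (res2 ** 2).mean()
# Penalty weights enter as: loss = lam0 * loss_data + lam1 * loss_physics + ...
```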

    For a given PDE-constrained optimal control problem such as (2.2), we use physics-informed MIMOONets to solve the optimization problem. The corresponding steps are as follows. First, we transform the PDE-constrained optimization problem (2.2) into the optimality system (2.9), which consists of the state equation, the adjoint equation, and the optimality condition; second, we solve the optimality system using physics-informed MIMOONets. The detailed computing framework is given in Algorithm 1.

    Algorithm 1 The steps of physics-informed MIMOONets for approximating optimal control problems
    Input: $N$ (the number of initial data $\{(x_i, u(x_i), v(x_i))\}$), $N_f = N_g$ (the number of interior points), $N_\varphi = N_\psi$ (the number of boundary points), $M$ (the maximum number of iterations), $\lambda_i$, $f$, $g$, $\varphi$, $\psi$, $\Omega$.
    Output: $G_\theta^f$ (state variable function), $G_\theta^g$ (adjoint state variable function).
    1. Take $N_f$ sample points $\{x_i\}$ in $\Omega$ and $N_\varphi = N_\psi$ sample points $\{x_j\}$ on $\partial\Omega$.
    2. Generate $G_\theta^f$, $G_\theta^g$ using the multi-input multi-output DeepONet (MIMOONet).
    3. For $k = 1$ to $M$:
    4.      Calculate $\mathcal{L}(\theta)$ according to (2.18).
    5.      Update the neural network parameters $\theta$.
    6. End for
    7. Output $G_\theta^f$, $G_\theta^g$.

    Remark 2.1. For MIMOONets (without physics information), the algorithm only requires changing the input line to "Input: $N$ (the number of initial data $\{(x_i, u(x_i), v(x_i))\}$)".

    In the following demonstrations, to showcase the effectiveness of MIMOONets, numerical examples of elliptic (linear and semi-linear), parabolic, and second-order hyperbolic optimal control problems are provided. Data-driven MIMOONets or physics-informed MIMOONets are employed, with uniformly distributed random sampling on the solution domain during training. In the subsequent examples, the relative error of the numerical solution $u$ is calculated as $\|u - u^*\| / \|u^*\|$, where the reference solution $u^*$ is an analytical solution or a finite element approximation on a $100 \times 100$ spatial grid.
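    The relative error itself is a one-line computation; a sketch (assuming `u_pred` and `u_ref` are tensors of solution values on the same evaluation grid):

```python
import torch

def relative_l2(u_pred, u_ref):
    """Relative L2 error ||u - u*|| / ||u*|| over the evaluation grid."""
    return (torch.linalg.norm(u_pred - u_ref) / torch.linalg.norm(u_ref)).item()
```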

    We start with an example involving finding an optimal heat source under homogeneous Dirichlet boundary conditions. The model can be represented as follows,

    $$\min J(u,v) = \frac{1}{2} \int_\Omega (u - u_d)^2 \, dx + \frac{\alpha}{2} \int_\Omega v^2 \, dx, \tag{3.1}$$
    $$\text{subject to}\quad \begin{cases} -\Delta u - v = f, & \text{in } \Omega,\\ u = 0, & \text{on } \partial\Omega, \end{cases} \tag{3.2}$$

    where $\Omega$ is a bounded domain, $u: \Omega \to \mathbb{R}$ is the unknown temperature satisfying (3.2), $u_d: \Omega \to \mathbb{R}$ is the given desired temperature, $v$ is the unknown control function, $f$ is the source term in $\Omega$, and $\alpha > 0$ is a regularization parameter. Here, we set $\Omega = (0,1) \times (0,1)$, $u_d(x,y) = (1 - 10\pi^2)\sin(\pi x)\sin(\pi y)$, $\alpha = 1$, and $f(x,y) = (5 + 2\pi^2)\sin(\pi x)\sin(\pi y)$. When $u(x,y) = \sin(\pi x)\sin(\pi y)$ and $v(x,y) = -5\sin(\pi x)\sin(\pi y)$, $J(u,v)$ attains its global minimum. The optimal control problem (3.1) and (3.2) can be transformed into the following optimality system:

    $$\begin{cases} -\Delta u - v = f, & \text{in } \Omega,\\ \Delta p + u = u_d, & \text{in } \Omega,\\ \alpha v + p = 0, & \text{in } \Omega,\\ u = 0, & \text{on } \partial\Omega,\\ p = 0, & \text{on } \partial\Omega. \end{cases} \tag{3.3}$$

    We use MIMOONets to solve the PDE system (3.3). The solution operators $G_1$ and $G_2$, with input functions $f$ and $u_d$, can be represented as follows:

    $$\begin{cases} G_\theta^1(f, u_d) = \sum\limits_{k=1}^{p} b_k^{1f}\, b_k^{1u_d}\, tr_k,\\[2pt] G_\theta^2(f, u_d) = \sum\limits_{k=1}^{p} b_k^{2f}\, b_k^{2u_d}\, tr_k, \end{cases} \tag{3.4}$$

    where branch network 1 and branch network 2 are two separate 5-layer fully connected neural networks (FNNs). Every hidden layer and the output layer have 300 neurons, and the input layer of each network has 100 neurons. The trunk network is a 5-layer FNN with 300 neurons per hidden layer and 150 neurons in the output layer. Relu or Tanh is used as the activation function. The loss function of the deep MIMOONets is denoted by

    $$\mathcal{L}_{data}(\theta) = \frac{1}{N} \sum_{k=1}^{N} \left( \left| u(x_k,y_k) - G_\theta^1(f,u_d)(x_k,y_k) \right|^2 + \left| p(x_k,y_k) - G_\theta^2(f,u_d)(x_k,y_k) \right|^2 \right). \tag{3.5}$$

    For the physics-informed MIMOONets, we use the same network structure as the MIMOONets, with Tanh as the activation function. The corresponding loss function can be expressed as follows:

    $$\mathcal{L}(\theta) = \mathcal{L}_{data}(\theta) + \mathcal{L}_{physics}(\theta), \tag{3.6}$$

    where

    $$\mathcal{L}_{physics}(\theta) = \mathcal{L}_{pde1}(\theta) + \mathcal{L}_{pde2}(\theta) + \mathcal{L}_{BC1}(\theta) + \mathcal{L}_{BC2}(\theta), \tag{3.7}$$

    and

    $$\begin{cases} \mathcal{L}_{pde1}(\theta) = \frac{1}{N_f} \sum\limits_{k=1}^{N_f} \left| -\frac{\partial^2 G_\theta^1(f,u_d)(x_k,y_k)}{\partial x^2} - \frac{\partial^2 G_\theta^1(f,u_d)(x_k,y_k)}{\partial y^2} + G_\theta^2(f,u_d)(x_k,y_k) - f(x_k,y_k) \right|^2,\\ \mathcal{L}_{pde2}(\theta) = \frac{1}{N_{u_d}} \sum\limits_{k=1}^{N_{u_d}} \left| \frac{\partial^2 G_\theta^2(f,u_d)(x_k,y_k)}{\partial x^2} + \frac{\partial^2 G_\theta^2(f,u_d)(x_k,y_k)}{\partial y^2} + G_\theta^1(f,u_d)(x_k,y_k) - u_d(x_k,y_k) \right|^2,\\ \mathcal{L}_{BC1}(\theta) = \frac{1}{N_{BC}} \sum\limits_{i=1}^{N_{BC}} \left| G_\theta^1(f,u_d)(x_i,y_i) \right|^2,\\ \mathcal{L}_{BC2}(\theta) = \frac{1}{N_{BC}} \sum\limits_{i=1}^{N_{BC}} \left| G_\theta^2(f,u_d)(x_i,y_i) \right|^2, \end{cases} \tag{3.8}$$

    where $N$ is the number of initial data points, $N_f$ and $N_{u_d}$ denote the numbers of collocation points for $f$ and $u_d$ in the computational domain $\Omega$, respectively, and $N_{BC}$ is the number of boundary points on $\partial\Omega$.

    To evaluate the loss, we randomly sample $N = 10{,}000$ training points $(x_k, y_k) \in \Omega$ and $N_f = N_{u_d} = 10{,}000$ residual training points $(x_k, y_k) \in \Omega$, and we select $N_{BC} = 400$ equidistant boundary training points $(x_i, y_i) \in \partial\Omega$.
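    A sketch of this sampling on the unit square (uniform random interior points plus equidistant points on the four edges; the variable names are illustrative):

```python
import torch

N, n_side = 10_000, 100
interior = torch.rand(N, 2)                     # uniform samples in (0,1)^2
s = torch.linspace(0.0, 1.0, n_side)
zeros, ones = torch.zeros_like(s), torch.ones_like(s)
boundary = torch.cat([
    torch.stack([s, zeros], dim=1),             # bottom edge, y = 0
    torch.stack([s, ones], dim=1),              # top edge,    y = 1
    torch.stack([zeros, s], dim=1),             # left edge,   x = 0
    torch.stack([ones, s], dim=1),              # right edge,  x = 1
])                                              # 4 * 100 = 400 boundary points
```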

    Table 1.  MIMOONets for an optimal control of elliptic problem.
    Iterations Activation function Relative L2 error of u Relative L2 error of p
    10,000 Relu (0.62 ± 0.08)% (0.39 ± 0.05)%
    10,000 Tanh (1.00 ± 0.15)% (0.98 ± 0.12)%
    40,000 Relu (0.14 ± 0.03)% (0.11 ± 0.03)%
    40,000 Tanh (0.45 ± 0.05)% (0.45 ± 0.06)%


    We utilize the Adam optimizer from PyTorch to train the networks, with a learning rate of 0.001. Meanwhile, we can choose different weighting parameters to improve the accuracy of the deep physics-informed MIMOONets method, using the following form:

    $$\mathcal{L}(\theta) = \lambda_0 \mathcal{L}_{data}(\theta) + \lambda_1 \mathcal{L}_{pde1}(\theta) + \lambda_2 \mathcal{L}_{pde2}(\theta) + \lambda_3 \mathcal{L}_{BC1}(\theta) + \lambda_4 \mathcal{L}_{BC2}(\theta).$$

    We take $\lambda_0 = \lambda_1 = \lambda_2 = 1$ and $\lambda_3 = \lambda_4 = 100$. The experimental results are shown in Table 2 and Figure 5.

    Table 2.  Physics-informed MIMOONets for an optimal control of elliptic problem.
    Iterations Activation function Relative L2 error of u Relative L2 error of p
    10,000 Tanh (0.26 ± 0.04)% (0.28 ± 0.05)%
    40,000 Tanh (0.13 ± 0.04)% (0.14 ± 0.05)%


    Figures 3–5 show that both the deep MIMOONets method and the deep physics-informed MIMOONets method are effective for optimal control problems with elliptic constraints, and that the activation function Relu converges faster than Tanh with the MIMOONets method. Moreover, the deep physics-informed MIMOONets method achieves the accuracy of the deep MIMOONets method with only a small amount of data.

    Figure 3.  MIMOONet, 10,000 iterations, with Relu. (a) Train loss. (b) The absolute value of the error of u. (c) The absolute value of the error of p.
    Figure 4.  MIMOONet, 10,000 iterations, with Tanh. (a) Train loss. (b) The absolute value of the error of u. (c) The absolute value of the error of p.
    Figure 5.  Physics-informed MIMOONet, 10,000 iterations, $\lambda_0 = \lambda_1 = \lambda_2 = 1$, $\lambda_3 = \lambda_4 = 100$. (a) Train loss. (b) The absolute value of the error of u. (c) The absolute value of the error of p.

    A semi-linear elliptic constrained optimal control problem is considered here. The model is described as follows

    $$\min J(u,v) = \frac{1}{2} \int_\Omega (u - u_d)^2 \, dx + \frac{\alpha}{2} \int_\Omega v^2 \, dx, \tag{3.9}$$
    $$\text{subject to}\quad \begin{cases} -\Delta u + u^3 = v, & \text{in } \Omega,\\ u = 0, & \text{on } \partial\Omega. \end{cases} \tag{3.10}$$

    The problems (3.9) and (3.10) lead to the following optimality system

    $$\begin{cases} -\Delta u + u^3 - v = 0, & \text{in } \Omega,\\ \Delta p - 3u^2 p + u = u_d, & \text{in } \Omega,\\ \alpha v + p = 0, & \text{in } \Omega,\\ u = 0, & \text{on } \partial\Omega,\\ p = 0, & \text{on } \partial\Omega. \end{cases} \tag{3.11}$$
    Table 3.  MOONet for an optimal control of semi-linear elliptic problem.
    Iterations Activation function Relative L2 error of u Relative L2 error of p
    10,000 Relu (0.62 ± 0.08)% (0.076 ± 0.011)%
    10,000 Tanh (6.35 ± 0.35)% (5.58 ± 0.22)%

    Table 4.  Physics-informed MOONet for an optimal control of semi-linear elliptic problem.
    Iterations Activation function Relative L2 error of u Relative L2 error of p
    10,000 Tanh (2.40 ± 0.15)% (1.60 ± 0.12)%


    We use a multi-output operator network (MOONet) to solve the PDE system (3.11). The corresponding solution operators $G_1$ and $G_2$ are shown below:

    $$\begin{cases} G_\theta^1(u_d) = \sum\limits_{k=1}^{p} b_k^{1u_d}\, tr_k,\\[2pt] G_\theta^2(u_d) = \sum\limits_{k=1}^{p} b_k^{2u_d}\, tr_k, \end{cases} \tag{3.12}$$

    where the branch network is a 5-layer FNN with 300 neurons per hidden layer and in the output layer, and the input layer contains 100 neurons. The trunk network is composed of five hidden layers with 300 neurons each and an output layer of 150 neurons. The loss function for known data is similar to (3.5). Taking into account the physics-informed MOONet and $\alpha = 1$, the corresponding loss function is expressed as follows:

    $$\mathcal{L}(\theta) = \lambda_0 \mathcal{L}_{data}(\theta) + \lambda_1 \mathcal{L}_{physics}(\theta), \tag{3.13}$$

    where

    $$\mathcal{L}_{physics}(\theta) = \lambda_2 \mathcal{L}_{pde1}(\theta) + \lambda_3 \mathcal{L}_{pde2}(\theta) + \lambda_4 \mathcal{L}_{BC1}(\theta) + \lambda_5 \mathcal{L}_{BC2}(\theta), \tag{3.14}$$

    and

    $$\begin{cases} \mathcal{L}_{pde1}(\theta) = \frac{1}{N_1} \sum\limits_{k=1}^{N_1} \left| -\frac{\partial^2 G_\theta^1(u_d)(x_k,y_k)}{\partial x^2} - \frac{\partial^2 G_\theta^1(u_d)(x_k,y_k)}{\partial y^2} + \left( G_\theta^1(u_d)(x_k,y_k) \right)^3 + G_\theta^2(u_d)(x_k,y_k) \right|^2,\\ \mathcal{L}_{pde2}(\theta) = \frac{1}{N_2} \sum\limits_{k=1}^{N_2} \left| \frac{\partial^2 G_\theta^2(u_d)(x_k,y_k)}{\partial x^2} + \frac{\partial^2 G_\theta^2(u_d)(x_k,y_k)}{\partial y^2} - 3 \left( G_\theta^1(u_d)(x_k,y_k) \right)^2 G_\theta^2(u_d)(x_k,y_k) + G_\theta^1(u_d)(x_k,y_k) - u_d(x_k,y_k) \right|^2,\\ \mathcal{L}_{BC1}(\theta) = \frac{1}{N_{BC}} \sum\limits_{i=1}^{N_{BC}} \left| G_\theta^1(u_d)(x_i,y_i) \right|^2,\\ \mathcal{L}_{BC2}(\theta) = \frac{1}{N_{BC}} \sum\limits_{i=1}^{N_{BC}} \left| G_\theta^2(u_d)(x_i,y_i) \right|^2. \end{cases} \tag{3.15}$$

    When using the physics-informed MOONet, we choose $N = N_1 = N_2 = 10{,}000$, $N_{BC} = 400$, $\lambda_0 = \lambda_1 = \lambda_2 = 1$, and $\lambda_3 = \lambda_4 = 300$. Without physics information, we take $N = 10{,}000$ and $\lambda_0 = 1$, with the other weights set to 0. The sample data are collected by finite differences and sequential quadratic programming. The experimental results are shown in Tables 3 and 4.

    We find that the activation function Relu converges faster than Tanh with the MOONet method, similar to the linear elliptic optimal control problem. Moreover, the deep physics-informed MOONet method requires very little data to achieve the same accuracy as the deep MOONet method.

    We consider the following parabolic optimal control problem,

    $$\min J(u,v) = \frac{1}{2} \int_0^T\!\!\int_\Omega (u - u_d)^2 \, dx \, dt + \frac{\alpha}{2} \int_0^T\!\!\int_\Omega v^2 \, dx \, dt, \tag{3.16}$$
    $$\text{subject to}\quad \begin{cases} \partial_t u - \Delta u - v = f, & \text{in } D,\\ u = 0, & \text{on } \partial\Omega,\\ u(x,0) = \sin(\pi x), & \text{in } \Omega, \end{cases} \tag{3.17}$$

    where $\Omega = (0,1)$, $D = \Omega \times (0,T]$, $u: D \to \mathbb{R}$ is the unknown state satisfying (3.17), $u_d: D \to \mathbb{R}$ is the given desired temperature distribution, $v$ is the unknown control function, $f$ is the source term in $D$, and $\alpha > 0$ is a regularization parameter. Here, we set $u_d(x,t) = (t+1)\sin(\pi x) + \frac{1}{2} e^t \sin(\pi x) - \frac{\pi^2}{2}(e^t - e)\sin(\pi x)$, $\alpha = 1$, $f(x,t) = \sin(\pi x) + \pi^2 (t+1)\sin(\pi x) + \frac{1}{2}(e^t - e)\sin(\pi x)$, and $\varphi(x) = \sin(\pi x)$. When $v(x,t) = -\frac{1}{2}(e^t - e)\sin(\pi x)$ and $u(x,t) = (t+1)\sin(\pi x)$, $J(u,v)$ attains its global minimum.

    The optimal control problems (3.16) and (3.17) can be transformed into the optimality system,

    $$\begin{cases} \partial_t u - \Delta u - v = f, & \text{in } D,\\ \partial_t p + \Delta p + u = u_d, & \text{in } D,\\ \alpha v + p = 0, & \text{in } D,\\ u(x,0) = \sin(\pi x), & \text{in } \Omega,\\ p(x,T) = 0, & \text{in } \Omega,\\ u(x,t) = 0, & \text{on } \partial\Omega \times (0,T],\\ p(x,t) = 0, & \text{on } \partial\Omega \times (0,T]. \end{cases} \tag{3.18}$$

    We use MIMOONets to solve the PDE system (3.18). The operators $G_1$ and $G_2$ can be learned from the source term $f$, the desired state $u_d$, and the initial condition $\varphi$. Their representations are as follows:

    $$\begin{cases} G_\theta^1(f, u_d, \varphi) = \sum\limits_{k=1}^{p} b_k^{1f}\, b_k^{1u_d}\, b_k^{1\varphi}\, tr_k,\\[2pt] G_\theta^2(f, u_d, \varphi) = \sum\limits_{k=1}^{p} b_k^{2f}\, b_k^{2u_d}\, b_k^{2\varphi}\, tr_k. \end{cases} \tag{3.19}$$

    Here, we select 100 sensor points for each of the input functions $f$, $u_d$, and $\varphi$. The three branch networks are separate 5-layer FNNs with 300 neurons per hidden layer and in the output layer. The trunk network is a 5-layer FNN with 300 neurons per hidden layer and 150 neurons in the output layer. Relu or Tanh is used as the activation function.

    The physics-informed MIMOONets use the same network structure as the MIMOONets, with Tanh as the activation function. The corresponding loss function for the deep MIMOONets is expressed by

    $$\mathcal{L}_{data}(\theta) = \frac{1}{N} \sum_{k=1}^{N} \left( \left| u(x_k,t_k) - G_\theta^1(f,u_d,\varphi)(x_k,t_k) \right|^2 + \left| p(x_k,t_k) - G_\theta^2(f,u_d,\varphi)(x_k,t_k) \right|^2 \right). \tag{3.20}$$

    The deep physics-informed MIMOONets loss function takes the following form

    $$\mathcal{L}(\theta) = \lambda_0 \mathcal{L}_{data}(\theta) + \lambda_1 \mathcal{L}_{physics}(\theta), \tag{3.21}$$

    where

    $$\mathcal{L}_{physics}(\theta) = \lambda_2 \mathcal{L}_{pde1}(\theta) + \lambda_3 \mathcal{L}_{pde2}(\theta) + \lambda_4 \mathcal{L}_{IC}(\theta) + \lambda_5 \mathcal{L}_{TC}(\theta) + \lambda_6 \mathcal{L}_{BC}(\theta), \tag{3.22}$$

    and

    $$\begin{cases} \mathcal{L}_{pde1}(\theta) = \frac{1}{N_f} \sum\limits_{k=1}^{N_f} \left| \frac{\partial G_\theta^1(f,u_d,\varphi)(x_k,t_k)}{\partial t} - \frac{\partial^2 G_\theta^1(f,u_d,\varphi)(x_k,t_k)}{\partial x^2} + G_\theta^2(f,u_d,\varphi)(x_k,t_k) - f(x_k,t_k) \right|^2,\\ \mathcal{L}_{pde2}(\theta) = \frac{1}{N_{u_d}} \sum\limits_{k=1}^{N_{u_d}} \left| \frac{\partial G_\theta^2(f,u_d,\varphi)(x_k,t_k)}{\partial t} + \frac{\partial^2 G_\theta^2(f,u_d,\varphi)(x_k,t_k)}{\partial x^2} + G_\theta^1(f,u_d,\varphi)(x_k,t_k) - u_d(x_k,t_k) \right|^2,\\ \mathcal{L}_{IC}(\theta) = \frac{1}{N_{IC}} \sum\limits_{i=1}^{N_{IC}} \left| G_\theta^1(f,u_d,\varphi)(x_i, 0) - \varphi(x_i) \right|^2,\\ \mathcal{L}_{TC}(\theta) = \frac{1}{N_{TC}} \sum\limits_{i=1}^{N_{TC}} \left| G_\theta^2(f,u_d,\varphi)(x_i, T) \right|^2,\\ \mathcal{L}_{BC}(\theta) = \frac{1}{N_{BC}} \sum\limits_{i=1}^{N_{BC}} \left( \left| G_\theta^1(f,u_d,\varphi)(x_i,t_i) \right|^2 + \left| G_\theta^2(f,u_d,\varphi)(x_i,t_i) \right|^2 \right). \end{cases}$$

    To calculate the loss, we randomly sample $N = 10{,}000$ initial data points and $N_f = N_{u_d} = 10{,}000$ residual training points; the numbers $N_{IC}$ and $N_{TC}$ of initial and terminal condition points $x \in \Omega$ are 100, and the number $N_{BC}$ of boundary condition points is 100. Then, we use the Adam optimizer to train the deep MIMOONets and the physics-informed MIMOONets ($\lambda_i = 1$, $i = 0,1,2,3$; $\lambda_j = 100$, $j = 4,5,6$) by minimizing the losses (3.20) and (3.21). The learning rate is 0.002. The experimental results are shown in Tables 5 and 6 and Figures 6–8.

    Table 5.  MIMOONets for parabolic problem.
    Iterations Activation function Relative L2 error of u Relative L2 error of p
    10,000 Relu (0.11 ± 0.04)% (0.24 ± 0.05)%
    10,000 Tanh (0.60 ± 0.10)% (1.60 ± 0.13)%
    40,000 Relu (0.054 ± 0.006)% (0.10 ± 0.04)%
    40,000 Tanh (0.40 ± 0.05)% (0.88 ± 0.10)%

    Table 6.  Physics-informed MIMOONets for parabolic problem.
    Iterations Activation function Relative L2 error of u Relative L2 error of p
    10,000 Tanh (0.52 ± 0.12)% (4.9 ± 0.27)%
    40,000 Tanh (0.32 ± 0.08)% (4.7 ± 0.21)%

    Figure 6.  MIMOONet, 10,000 iterations, with Relu, for the parabolic-constraint optimal control problem. (a) Train loss. (b) The absolute value of the error of u. (c) The absolute value of the error of p.
    Figure 7.  MIMOONet, 10,000 iterations, with Tanh, for the parabolic-constraint optimal control problem. (a) Train loss. (b) The absolute value of the error of u. (c) The absolute value of the error of p.
    Figure 8.  Physics-informed MIMOONet, 10,000 iterations, for the parabolic-constraint optimal control problem ($\lambda_i = 1$, $i = 0,1,2,3$; $\lambda_j = 100$, $j = 4,5,6$). (a) Train loss. (b) The absolute value of the error of u. (c) The absolute value of the error of p.

    We found that both the deep MIMOONets method and the deep physics-informed MIMOONets method are effective for the parabolic optimal control problem, and that the activation function Relu converges faster than Tanh with the MIMOONets method. We also found that the physics-informed MIMOONets not only attain accuracy comparable to the original MIMOONets but also satisfy the underlying PDE constraints.

    Consider the following hyperbolic optimal control problem [13]:

    $$\min J(u,v) = \frac{1}{2} \int_0^T\!\!\int_\Omega (u - u_d)^2 \, dx \, dt + \frac{\alpha}{2} \int_0^T\!\!\int_\Omega v^2 \, dx \, dt, \tag{3.23}$$
    $$\text{subject to}\quad \begin{cases} u_{tt} - \Delta u - v = f, & \text{in } D,\\ u = 0, & \text{on } \partial\Omega \times [0,T],\\ u(x,0) = \varphi(x),\ u_t(x,0) = \psi(x), & \text{in } \Omega,\\ a \le v \le b, & \text{in } D,\ a, b \in \mathbb{R}, \end{cases} \tag{3.24}$$

    where $\Omega = [0,1]^2$ and $D = \Omega \times (0,T]$. In the experiment, we take $u_d(x,t) = \sin(\pi x_1)\sin(\pi x_2)\left( e^t + 2 + 2\pi^2 (t-T)^2 \right)$, $T = 1$, $\alpha = 1$, $a = 0.2$, $b = 0.8$, $f(x,t) = (1 + 2\pi^2) e^t \sin(\pi x_1)\sin(\pi x_2) - \max\{a, \min\{b, (t-T)^2 \sin(\pi x_1)\sin(\pi x_2)\}\}$, $\varphi(x) = \sin(\pi x_1)\sin(\pi x_2)$, and $\psi(x) = \sin(\pi x_1)\sin(\pi x_2)$. The exact solutions are $u(x,t) = e^t \sin(\pi x_1)\sin(\pi x_2)$ and $v(x,t) = \max\{a, \min\{b, (t-T)^2 \sin(\pi x_1)\sin(\pi x_2)\}\}$.

    Based on (3.23) and (3.24), the following optimality system can be obtained,

    $$\begin{cases} u_{tt} - \Delta u - v = f, & \text{in } D,\\ p_{tt} - \Delta p + u = u_d, & \text{in } D,\\ v = \max\{a, \min\{b, p/\alpha\}\}, & \text{in } D,\\ u(x,0) = \varphi(x),\ u_t(x,0) = \psi(x), & \text{in } \Omega,\\ u(x,t) = 0, & \text{on } \partial\Omega \times [0,T],\\ p(x,T) = 0,\ p_t(x,T) = 0, & \text{in } \Omega,\\ p(x,t) = 0, & \text{on } \partial\Omega \times [0,T]. \end{cases} \tag{3.25}$$
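    Computationally, the pointwise projection $v = \max\{a, \min\{b, p/\alpha\}\}$ in (3.25) is simply a clamp of the adjoint network output; a sketch (with `p_pred` standing in for $G_\theta^2$ evaluated at the collocation points):

```python
import torch

a, b, alpha = 0.2, 0.8, 1.0
p_pred = torch.randn(1000)     # placeholder for the adjoint output G^2_theta
v_pred = torch.clamp(p_pred / alpha, min=a, max=b)
```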

    The solution operators of system (3.25) can be represented by $G_1$ and $G_2$ as follows:

    $$\begin{cases} G_\theta^1(f, u_d, \varphi, \psi) = \sum\limits_{k=1}^{p} b_k^{1f}\, b_k^{1u_d}\, b_k^{1\varphi}\, b_k^{1\psi}\, tr_k,\\[2pt] G_\theta^2(f, u_d, \varphi, \psi) = \sum\limits_{k=1}^{p} b_k^{2f}\, b_k^{2u_d}\, b_k^{2\varphi}\, b_k^{2\psi}\, tr_k. \end{cases} \tag{3.26}$$

    In the experiment, we use 100 sensor points for each of the input functions $f$, $u_d$, $\varphi$, and $\psi$. Each of the four branch networks is an independent 5-layer FNN with 300 neurons in each hidden layer and in the output layer. The trunk network is a 5-layer FNN with 300 neurons per hidden layer and 150 neurons in the output layer. The activation function used is either Relu or Tanh.

    For the physics-informed MIMOONets, we employ the same network structure, using Tanh as the activation function. The corresponding loss function for the deep MIMOONets is expressed as follows:

    $$\mathcal{L}_{data}(\theta) = \frac{1}{N} \sum_{k=1}^{N} \left( \left| u(x_k^1,x_k^2,t_k) - G_\theta^1(f,u_d,\varphi,\psi)(x_k^1,x_k^2,t_k) \right|^2 + \left| p(x_k^1,x_k^2,t_k) - G_\theta^2(f,u_d,\varphi,\psi)(x_k^1,x_k^2,t_k) \right|^2 \right). \tag{3.27}$$

    The deep physics-informed MIMOONets loss function takes the following form

    $$\mathcal{L}(\theta) = \lambda_0 \mathcal{L}_{data}(\theta) + \lambda_1 \mathcal{L}_{physics}(\theta), \tag{3.28}$$

    where

    $$\mathcal{L}_{physics}(\theta) = \lambda_2 \mathcal{L}_{pde1}(\theta) + \lambda_3 \mathcal{L}_{pde2}(\theta) + \lambda_4 \mathcal{L}_{IC}(\theta) + \lambda_5 \mathcal{L}_{TC}(\theta) + \lambda_6 \mathcal{L}_{BC}(\theta), \tag{3.29}$$

    and

    where, for brevity, we write $G_\theta^i = G_\theta^i(f, u_d, \varphi, \psi)$ and $z_k = (x_k^1, x_k^2, t_k)$:

    $$\begin{cases} \mathcal{L}_{pde1}(\theta) = \frac{1}{N_f} \sum\limits_{k=1}^{N_f} \left| \frac{\partial^2 G_\theta^1(z_k)}{\partial t^2} - \frac{\partial^2 G_\theta^1(z_k)}{\partial x_1^2} - \frac{\partial^2 G_\theta^1(z_k)}{\partial x_2^2} - \max\{a, \min\{b, G_\theta^2(z_k)/\alpha\}\} - f(z_k) \right|^2,\\ \mathcal{L}_{pde2}(\theta) = \frac{1}{N_{u_d}} \sum\limits_{k=1}^{N_{u_d}} \left| \frac{\partial^2 G_\theta^2(z_k)}{\partial t^2} - \frac{\partial^2 G_\theta^2(z_k)}{\partial x_1^2} - \frac{\partial^2 G_\theta^2(z_k)}{\partial x_2^2} + G_\theta^1(z_k) - u_d(z_k) \right|^2,\\ \mathcal{L}_{IC}(\theta) = \frac{1}{N_{IC}} \sum\limits_{i=1}^{N_{IC}} \left( \left| G_\theta^1(x_i^1,x_i^2,0) - \varphi(x_i^1,x_i^2) \right|^2 + \left| \frac{\partial G_\theta^1(x_i^1,x_i^2,0)}{\partial t} - \psi(x_i^1,x_i^2) \right|^2 \right),\\ \mathcal{L}_{TC}(\theta) = \frac{1}{N_{TC}} \sum\limits_{i=1}^{N_{TC}} \left( \left| G_\theta^2(x_i^1,x_i^2,T) \right|^2 + \left| \frac{\partial G_\theta^2(x_i^1,x_i^2,T)}{\partial t} \right|^2 \right),\\ \mathcal{L}_{BC}(\theta) = \frac{1}{N_{BC}} \sum\limits_{i=1}^{N_{BC}} \left( \left| G_\theta^1(x_i^1,x_i^2,t_i) \right|^2 + \left| G_\theta^2(x_i^1,x_i^2,t_i) \right|^2 \right). \end{cases} \tag{3.30}$$

    In the experiment, $N = 10{,}000$ initial data points are sampled uniformly at random, and the number of residual training points is $N_f = N_{u_d} = 125{,}000$. The sample size for initial condition points $N_{IC}$ and terminal condition points $N_{TC}$ in $\Omega$ is 10,000, and the number of points on the spatio-temporal boundary $\partial\Omega \times [0,T]$ is $N_{BC} = 400 \times 100$. For the sensor points of the input functions other than the source $f(x_1,x_2,t)$, 100 sensor points are sampled uniformly at random; for $f(x_1,x_2,t)$ in the space-time domain $D$, 100 sensor points are drawn by Latin hypercube sampling. Subsequently, we use the Adam optimizer to train the deep MIMOONets and the physics-informed MIMOONets ($\lambda_0 = 0$, $\lambda_i = 1$, $i = 1,2,3$; $\lambda_j = 100$, $j = 4,5,6$) by minimizing the losses (3.27) and (3.28). The learning rate is set to 0.0001. The experimental results are shown in Tables 7 and 8 and Figures 9–11.

    Table 7.  MIMOONets for hyperbolic control problem at t = 0.5.
    Iterations Activation function Relative L2 error of u Relative L2 error of p
    10,000 Relu (0.48 ± 0.08)% (1.89 ± 0.09)%
    10,000 Tanh (3.85 ± 0.30)% (21.20 ± 1.10)%
    40,000 Relu (0.32 ± 0.06)% (1.09 ± 0.30)%
    40,000 Tanh (0.72 ± 0.08)% (1.90 ± 0.50)%

    Table 8.  Physics-informed MIMOONets for hyperbolic control problem at t = 0.5 without data.
    Iterations Activation function Relative L2 error of u Relative L2 error of p
    10,000 Tanh (6.25 ± 0.80)% (13.57 ± 1.20)%
    40,000 Tanh (1.72 ± 0.30)% (5.2 ± 0.31)%

    Figure 9.  MIMOONet, 40,000 iterations, with Relu, for the hyperbolic problem ($t = 0.5$). (a) Train loss. (b) The absolute value of the error of u. (c) The absolute value of the error of p.
    Figure 10.  MIMOONet, 40,000 iterations, with Tanh, for the hyperbolic problem ($t = 0.5$). (a) Train loss. (b) The absolute value of the error of u. (c) The absolute value of the error of p.
    Figure 11.  Physics-informed MIMOONet, 40,000 iterations, for the hyperbolic problem ($\lambda_0 = 0$, $\lambda_i = 1$, $i = 1,2,3$; $\lambda_j = 100$, $j = 4,5,6$). (a) Train loss. (b) The absolute value of the error of u. (c) The absolute value of the error of p.

    For hyperbolic optimal control problems, both the deep MIMOONets method and the deep physics-informed MIMOONets method are effective. In the experiments, the MIMOONets method adopts a data-driven approach, and the activation function Relu converges faster than Tanh. For the physics-informed MIMOONets method, we adopt a data-free mode and conduct experiments with the PDEs as constraints. The experimental results are excellent.

    In this work, a novel deep learning framework is presented, which enables the construction of fast surrogates for solving PDE-constrained optimization problems using MIMOONets and physics-informed MIMOONets. The MIMOONets frameworks (MIMOONets and physics-informed MIMOONets) offer flexibility and faster implementation compared to traditional methods. Compared with MIMOONets, the physics-informed MIMOONets require little paired input-output data, and are more efficient and cost-effective.

    Although our methods (MIMOONets and physics-informed MIMOONets) were initially designed for solving PDE-constrained optimization problems, they can also be extended to multi-equation-coupled problems, including hyperbolic and parabolic optimal control problems [31], Cahn-Hilliard-Navier-Stokes equation [32], and other scenarios. Moreover, they can be effectively employed for real-time prediction in optimal systems governed by PDEs.

    The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

    This work is supported by the National Natural Science Foundation of China (Grant No. 11961008) and by the Scientific Research Fund Project of Guizhou Education University (Grant No. 2024ZD007).

    The authors declare there is no conflict of interest.

    We present some numerical images for solving the elliptic optimal control problem (3.1) and (3.2) using the deep MIMOONets and physics-informed MIMOONets frameworks. The data-driven results are shown in Figures A.1–A.6. When there are no data, we can still use the deep physics-informed MIMOONets to solve the problem; the experimental results are shown in Figure A.7.

    Figure A.1.  MIMOONet, 10,000 iterations, with Relu, for the elliptic-constraint optimal control problem. (a) The error of u: $u - G_\theta^1(f, u_d)$. (b) The error of p: $p - G_\theta^2(f, u_d)$.
    Figure A.2.  MIMOONet, 10,000 iterations, with Tanh, for the elliptic-constraint optimal control problem. (a) The error of u: $u - G_\theta^1(f, u_d)$. (b) The error of p: $p - G_\theta^2(f, u_d)$.
    Figure A.3.  MIMOONet, 40,000 iterations, with Relu, for the elliptic-constraint optimal control problem. (a) Train loss. (b) The absolute value of the error of u. (c) The absolute value of the error of p. (d) The error of u: $u - G_\theta^1(f, u_d)$. (e) The error of p: $p - G_\theta^2(f, u_d)$.
    Figure A.4.  Physics-informed MIMOONet, 10,000 iterations, for the elliptic-constraint optimal control problem ($\lambda_0 = \lambda_1 = \lambda_2 = 1$, $\lambda_3 = \lambda_4 = 100$). (a) The error of u: $u - G_\theta^1(f, u_d)$. (b) The error of p: $p - G_\theta^2(f, u_d)$.
    Figure A.5.  MIMOONet, 40,000 iterations, with Tanh, for the elliptic-constraint optimal control problem. (a) Train loss. (b) The absolute value of the error of u. (c) The absolute value of the error of p. (d) The error of u: $u - G_\theta^1(f, u_d)$. (e) The error of p: $p - G_\theta^2(f, u_d)$.
    Figure A.6.  Physics-informed MIMOONet, 40,000 iterations, with Tanh, for the elliptic-constraint optimal control problem ($\lambda_0 = \lambda_1 = \lambda_2 = 1$, $\lambda_3 = \lambda_4 = 100$). (a) Train loss. (b) The absolute value of the error of u. (c) The absolute value of the error of p. (d) The error of u: $u - G_\theta^1(f, u_d)$. (e) The error of p: $p - G_\theta^2(f, u_d)$.
    Figure A.7.  Physics-informed MIMOONet, 40,000 iterations, for the elliptic-constraint optimal control problem ($\lambda_0 = 0$, $\lambda_1 = \lambda_2 = 1$, $\lambda_3 = \lambda_4 = 100$, no initial data). (a) Train loss. (b) The absolute value of the error of u. (c) The absolute value of the error of p. (d) The error of u: $u - G_\theta^1(f, u_d)$. (e) The error of p: $p - G_\theta^2(f, u_d)$.

    We present some numerical images for solving the parabolic optimal control problem using the deep MIMOONets and physics-informed MIMOONets frameworks. The data-driven results are shown in Figures B.1–B.6. When there are no initial data, we can still use the deep physics-informed MIMOONets to solve the problem; the experimental results are shown in Figure B.7.

    Figure B.1.  MIMOONet, 10,000 iterations, with Relu, for the parabolic-constraint optimal control problem. (a) The error of u: $u - G_\theta^1(f, u_d, \varphi)$. (b) The error of p: $p - G_\theta^2(f, u_d, \varphi)$.
    Figure B.2.  MIMOONet, 10,000 iterations, with Tanh, for the parabolic-constraint optimal control problem. (a) The error of u: $u - G_\theta^1(f, u_d, \varphi)$. (b) The error of p: $p - G_\theta^2(f, u_d, \varphi)$.
    Figure B.3.  MIMOONet, 40,000 iterations, with Relu, for the parabolic-constraint optimal control problem. (a) Train loss. (b) The absolute value of the error of u. (c) The absolute value of the error of p. (d) The error of u: $u - G_\theta^1(f, u_d, \varphi)$. (e) The error of p: $p - G_\theta^2(f, u_d, \varphi)$.
    Figure B.4.  MIMOONet, 40,000 iterations, with Tanh, for the parabolic-constraint optimal control problem. (a) Train loss. (b) The absolute value of the error of u. (c) The absolute value of the error of p. (d) The error of u: $u - G_\theta^1(f, u_d, \varphi)$. (e) The error of p: $p - G_\theta^2(f, u_d, \varphi)$.
    Figure B.5.  Physics-informed MIMOONet, 10,000 iterations, for the parabolic-constraint optimal control problem ($\lambda_i = 1$, $i = 0,1,2,3$; $\lambda_j = 100$, $j = 4,5,6$; no initial data). (a) The error of u: $u - G_\theta^1(f, u_d, \varphi)$. (b) The error of p: $p - G_\theta^2(f, u_d, \varphi)$.
    Figure B.6.  Physics-informed MIMOONet, 40,000 iterations, for the parabolic-constraint optimal control problem ($\lambda_i = 1$, $i = 0,1,2,3$; $\lambda_j = 100$, $j = 4,5,6$; no initial data). (a) Train loss. (b) The absolute value of the error of u. (c) The absolute value of the error of p. (d) The error of u: $u - G_\theta^1(f, u_d, \varphi)$. (e) The error of p: $p - G_\theta^2(f, u_d, \varphi)$.
    Figure B.7.  Physics-informed MIMOONet, 40,000 iterations, for the parabolic-constraint optimal control problem ($\lambda_0 = 0$; $\lambda_i = 1$, $i = 1,2,3$; $\lambda_j = 100$, $j = 4,5,6$; no initial data). (a) Train loss. (b) The absolute value of the error of u. (c) The absolute value of the error of p. (d) The error of u: $u - G_\theta^1(f, u_d, \varphi)$. (e) The error of p: $p - G_\theta^2(f, u_d, \varphi)$.


    [1] G. Fabbri, Heat transfer optimization in corrugated wall channels, Int. J. Heat Mass Transfer, 43 (2000), 4299–4310. https://doi.org/10.1016/S0017-9310(00)00054-5 doi: 10.1016/S0017-9310(00)00054-5
    [2] G. Cornuéjols, J. Peña, R. Tütüncü, Optimization Methods in Finance, 2nd edition, Cambridge University Press, New York, 2018.
    [3] J. C. De los Reyes, C. B. Schönlieb, Image denoising: learning the noise model via nonsmooth PDE-constrained optimization, Inverse Probl. Imaging, 7 (2013), 1183–1214. https://doi.org/10.3934/ipi.2013.7.1183 doi: 10.3934/ipi.2013.7.1183
    [4] J. Sokolowski, J. P. Zolésio, Introduction to Shape Optimization, Springer-Verlag, Berlin, 1992. https://doi.org/10.1007/978-3-642-58106-9_1
    [5] J. Haslinger, R. A. E. Mäkinen, Introduction to Shape Optimization: Theory, Approximation, and Computation, SIAM, Philadelphia, 2003. https://doi.org/10.1137/1.9780898718690
    [6] R. M. Hicks, P. A. Henne, Wing design by numerical optimization, J. Aircr., 15 (1978), 407–412. https://doi.org/10.2514/3.58379 doi: 10.2514/3.58379
    [7] P. D. Frank, G. R. Shubin, A comparison of optimization-based approaches for a model computational aerodynamics design problem, J. Comput. Phys., 98 (1992), 74–89. https://doi.org/10.1016/0021-9991(92)90174-W doi: 10.1016/0021-9991(92)90174-W
    [8] J. Ng, S. Dubljevic, Optimal boundary control of a diffusion-convection-reaction PDE model with time-dependent spatial domain: Czochralski crystal growth process, Chem. Eng. Sci., 67 (2012), 111–119. https://doi.org/10.1016/j.ces.2011.06.050 doi: 10.1016/j.ces.2011.06.050
    [9] S. P. Chakrabarty, F. B. Hanson, Optimal control of drug delivery to brain tumors for a distributed parameters model, in Proceedings of the 2005, American Control Conference, 2 (2005), 973–978. https://doi.org/10.1109/ACC.2005.1470086
    [10] W. B. Liu, N. N. Yan, Adaptive Finite Element Methods for Optimal Control Governed by PDEs, Science Press, Beijing, 2008.
    [11] Y. P. Chen, F. L. Huang, N. Yi, W. B. Liu, A Legendre Galerkin spectral method for optimal control problems governed by Stokes equations, SIAM J. Numer. Anal., 49 (2011), 1625–1648. https://doi.org/10.1137/080726057 doi: 10.1137/080726057
    [12] A. Borzì, V. Schulz, Computational Optimization of Systems Governed by Partial Differential Equations, SIAM, Philadelphia, 2011.
    [13] X. Luo, A priori error estimates of Crank-Nicolson finite volume element method for a hyperbolic optimal control problem, Numer. Methods Partial Differ. Equations, 32 (2016), 1331–1356. https://doi.org/10.1002/num.22052 doi: 10.1002/num.22052
    [14] M. Raissi, P. Perdikaris, G. E. Karniadakis, Physics informed deep learning (part Ⅰ): data-driven solutions of nonlinear partial differential equations, preprint, arXiv: 1711.10561.
    [15] S. Wang, H. Zhang, X. Jiang, Physics-informed neural network algorithm for solving forward and inverse problems of variable-order space-fractional advection-diffusion equations, Neurocomputing, 535 (2023), 64–82. https://doi.org/10.1016/j.neucom.2023.03.032 doi: 10.1016/j.neucom.2023.03.032
    [16] J. Sirignano, K. Spiliopoulos, DGM: a deep learning algorithm for solving partial differential equations, J. Comput. Phys., 375 (2018), 1339–1364. https://doi.org/10.1016/j.jcp.2018.08.029 doi: 10.1016/j.jcp.2018.08.029
    [17] W. N. E, B. Yu, The deep Ritz method: a deep-learning based numerical algorithm for solving variational problems, Commun. Math. Stat., 6 (2018), 1–12. https://doi.org/10.1007/s40304-018-0127-z doi: 10.1007/s40304-018-0127-z
    [18] Y. L. Liao, P. B. Ming, Deep Nitsche method: deep Ritz method with essential boundary conditions, Commun. Comput. Phys., 29 (2021), 1365–1384. https://doi.org/10.4208/cicp.OA-2020-0219 doi: 10.4208/cicp.OA-2020-0219
    [19] L. Lu, P. Z. Jin, G. F. Pang, Z. Q. Zhang, G. E. Karniadakis, Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators, Nat. Mach. Intell., 3 (2021), 218–229. https://doi.org/10.1038/s42256-021-00302-5 doi: 10.1038/s42256-021-00302-5
    [20] C. Moya, S. Zhang, G. Lin, M. Yue, DeepONet-grid-UQ: a trustworthy deep operator framework for predicting the power grid's post-fault trajectories, Neurocomputing, 535 (2023), 166–182. https://doi.org/10.1016/j.neucom.2023.03.015 doi: 10.1016/j.neucom.2023.03.015
    [21] Z. Li, N. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. Stuart, et al., Neural operator: graph kernel network for partial differential equations, preprint, arXiv: 2003.03485.
    [22] Z. Li, N. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. Stuart, et al., Fourier neural operator for parametric partial differential equations, preprint, arXiv: 2010.08895.
    [23] S. F. Wang, H. W. Wang, P. Perdikaris, Learning the solution operator of parametric partial differential equations with physics-informed DeepOnets, Sci. Adv., 7 (2021), eabi8605. https://doi.org/10.1126/sciadv.abi8605 doi: 10.1126/sciadv.abi8605
    [24] P. Jin, S. Meng, L. Lu, MIONet: learning multiple-input operators via tensor product, SIAM J. Sci. Comput., 44 (2022), A3490–A3514. https://doi.org/10.1137/22M1477751 doi: 10.1137/22M1477751
    [25] C. J. García-Cervera, M. Kessler, F. Periago, Control of partial differential equations via physics-informed neural networks, J. Optim. Theory Appl., 196 (2023), 391–414. https://doi.org/10.1007/s10957-022-02100-4 doi: 10.1007/s10957-022-02100-4
    [26] S. Mowlavi, S. Nabib, Optimal control of PDEs using physics-informed neural networks, J. Comput. Phys., 473 (2023), 111731. https://doi.org/10.1016/j.jcp.2022.111731 doi: 10.1016/j.jcp.2022.111731
    [27] J. Barry-Straume, A. Sarsha, A. A. Popov, A. Sandu, Physics-informed neural networks for PDE-constrained optimization and control, preprint, arXiv: 2205.03377.
    [28] S. F. Wang, M. A. Bhouri, P. Perdikaris, Fast PDE-constrained optimization via self-supervised operator learning, preprint, arXiv: 2110.13297.
    [29] J. L. Lions, Optimal Control of Systems Governed by Partial Differential Equations, Springer-Verlag, Berlin, 1971.
    [30] T. P. Chen, H. Chen, Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamical systems, IEEE Trans. Neural Networks, 6 (1995), 911–917. https://doi.org/10.1109/72.392253 doi: 10.1109/72.392253
    [31] I. Lasiecka, Mathematical Control Theory of Coupled PDEs, SIAM, Philadelphia, 2001. https://doi.org/10.1137/1.9780898717099
    [32] A. Miranville, The Cahn-Hilliard Equation: Recent Advances and Applications, SIAM, Philadelphia, 2019. https://doi.org/10.1137/1.9781611975925
© 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)