
Autoencoding for the "Good Dictionary" of eigenpairs of the Koopman operator

• Reduced order modelling relies on representing complex dynamical systems using simplified modes, which can be achieved through Koopman operator (KO) analysis. However, computing Koopman eigenpairs for high-dimensional observable data can be inefficient. This paper proposes using deep autoencoders (AE), a type of deep learning technique, to perform nonlinear geometric transformations on raw data before computing Koopman eigenvectors. The encoded data produced by the deep AE is diffeomorphic to a manifold of the dynamical system and has a significantly lower dimension than the raw data. To handle high-dimensional time series data, Takens' time delay embedding is presented as a preprocessing technique. The paper concludes by presenting examples of these techniques in action.

    Citation: Neranjaka Jayarathne, Erik M. Bollt. Autoencoding for the 'Good Dictionary' of eigenpairs of the Koopman operator[J]. AIMS Mathematics, 2024, 9(1): 998-1022. doi: 10.3934/math.2024050




    Many of the problems we encounter in the world that can be presented as optimal control problems contain constraints on both the state and control variables. Imagine a scheduling problem whereby the aim is to maximize profits, such as in [26], in which the control variable is the production rate and the state variable is the inventory level. Naively, one may want to greatly increase both the inventory level and production rate to maximize profits, but, in practice, there is a limit to the amount of inventory a factory can hold, as well as a limit on how quickly humans or machines can work. In order to accurately model this problem, and many others from a wide range of application areas, such as manufacturing, engineering, science, economics, etc., we must use an optimal control problem with both state and control constraints. Although these problems are commonly found in applications, they are much more difficult to solve than those optimal control problems which purely have control constraints.

    We focus our attention on linear-quadratic (LQ) state- and control-constrained optimal control problems. These are infinite-dimensional optimization problems that involve the minimization of a quadratic objective function that is subject to linear differential equation (DE) constraints, affine constraints on the state variables, and affine constraints on the control variables. There is extensive literature on LQ control problems, as they can model many problems from a wide variety of disciplines (see [1,14,15,16,24,27,28]). Though many of the LQ control problems posed in the literature contain control constraints along with the linear DE constraints, it is rarer to see state constraints included since, as mentioned above, they are much more difficult to manage. Here, we propose a unique approach for solving state-constrained problems by applying the Douglas–Rachford (DR) algorithm.

The DR algorithm is a projection algorithm that is used to minimize the sum of two convex functions. In order to apply the algorithm, one needs the proximal operators of the two convex functions. Splitting and projection methods such as the DR algorithm constitute a popular area of research in optimization, with a variety of applications (see [2,3,6,9,20]) including sphere packing problems, protein reconstruction, etc. The use of these methods to solve discrete-time optimal control problems is not new, but there are very few applications of these methods to continuous-time optimal control problems. In [7], projection methods were used to solve the energy-minimizing double integrator problem with control constraints. Other papers [11,12,13] address more general energy-minimizing LQ control problems with control constraints. Additionally, a collection of general projection methods is used in [11], and the DR algorithm is used in [12,13].

    Following the promising numerical results observed in [12,13], here, we have used the DR algorithm to address the more challenging control- and state-constrained LQ control problems.

    The current technique for solving these control- and state-constrained LQ problems is a direct discretization approach whereby the infinite-dimensional problem is reduced to a large-scale finite-dimensional problem by using a discretization scheme, e.g., a Runge–Kutta method [5,22]. This discretized problem is then solved through the use of an optimization modelling language such as AMPL [19], paired with (finite-dimensional) large-scale optimization software such as Ipopt [33].

In the present paper (as was done in [7,11] for control constraints only), we present the following contributions to research on the numerical solution of LQ control problems with state and control constraints.

    (ⅰ) We rewrite the original LQ control problem as the minimization of the sum of two convex functions. This reformulation addresses much more general problems than those considered in [7,11]. Our setting encompasses important practical problems, such as those referenced above (i.e., problems that have constraints on the state variables).

    (ⅱ) We find the proximal mappings of these functions by solving the resulting calculus of variations subproblems (Theorems 1 and 2). Although the proof of Theorem 1 is straightforward thanks to the separability of the variables x and u, the proof of Theorem 2 is more involved due to a new LQ control subproblem that needs to be solved.

(ⅲ) We define and apply the DR algorithm to the constrained LQ control problem. We define and calculate proximal mappings at $(x,u)$, rather than just $u$, as in [7,12]. Thus, we describe the DR algorithm for the iterate $(x^k,u^k)$ rather than for $u^k$ alone.

    (ⅳ) We propose an algorithm for computing the projection of an iterate (x,u) onto the affine constraint set defined by the ordinary differential equations (ODEs). The procedure is an extension of [12, Algorithm 2], which projects only u. The current algorithm involves a more complicated two-point boundary-value problem to solve.

    (ⅴ) We perform numerical experiments to illustrate the higher efficiency of the DR algorithm as compared to the traditional approach of direct discretization. Solving optimal control problems with state constraints is well known to be more challenging than those without them. We demonstrate the value of the new approach in solving a class of such problems.

    In addition to these, we also present the following contribution, which is new even for the purely control-constrained LQ control problems.

(ⅵ) We present a conjecture for finding the costates. Moreover, we present a technique for finding the multiplier for the state constraints in the case that only one state constraint is active at a given time. We have effectively employed the costates and the state constraint multiplier for numerical verification of the optimality conditions.

The two convex functions mentioned above are defined as follows: the information from the ODE constraints appears in one function, while the information from the control and state constraints, along with the objective functional, appears in the other. We derive the proximal mappings of the two convex functions without reducing the original infinite-dimensional optimization problem to a finite-dimensional one, though we need to discretize the state and control variables over a partition of time when implementing the DR algorithm, since a digital computer cannot iterate with functions.

    The paper is structured as follows. In Section 2, we give the problem formulation and optimality conditions for the optimal control problem. In Section 3, we derive the proximal mappings used in the implementation of the DR algorithm. Section 4 introduces the DR algorithm. Then, Section 5 begins with an algorithm for computing one of the proximal mappings (namely, the projection onto an affine set) and a conjecture that is used to obtain the costate variable of the original LQ control problem. Next, this section introduces two example problems, a harmonic oscillator and a mass-spring system. At the end of this section, numerical experiments for the DR algorithm and the AMPL–Ipopt suite and their comparisons are given for these two problems. Finally, concluding remarks and comments for future work are provided in Section 6.

In this section, we formulate the general optimal control problem that will be the focus of this paper. We give some necessary definitions and provide conditions for optimality from optimal control theory.

Before introducing the optimal control problem, we will give some standard definitions. Unless otherwise stated, all vectors are column vectors. Let $L^2([t_0,t_f];\mathbb{R}^q)$ be the Banach space of Lebesgue measurable functions $z:[t_0,t_f]\to\mathbb{R}^q$ with a finite $L^2$ norm, namely,

$L^2([t_0,t_f];\mathbb{R}^q) := \left\{ z:[t_0,t_f]\to\mathbb{R}^q \ \middle|\ \|z\|_{L^2} := \Big(\int_{t_0}^{t_f} \|z(t)\|^2\,dt\Big)^{1/2} < \infty \right\},$

where $\|\cdot\|$ is the $\ell_2$ norm in $\mathbb{R}^q$. Furthermore, $W^{1,2}([t_0,t_f];\mathbb{R}^q)$ is the Sobolev space of absolutely continuous functions, namely,

$W^{1,2}([t_0,t_f];\mathbb{R}^q) := \left\{ z \in L^2([t_0,t_f];\mathbb{R}^q) \ \middle|\ \dot z := dz/dt \in L^2([t_0,t_f];\mathbb{R}^q) \right\},$

endowed with the norm

$\|z\|_{W^{1,2}} := \big(\|z\|_{L^2}^2 + \|\dot z\|_{L^2}^2\big)^{1/2}.$

    With these definitions, we can now state the general LQ control problem as follows:

(P) $\quad \begin{cases} \min\ \ \dfrac12 \displaystyle\int_{t_0}^{t_f} \big(x(t)^T Q(t)\,x(t) + u(t)^T R(t)\,u(t)\big)\,dt,\\[1mm] \text{subject to}\ \ \dot x(t) = A(t)\,x(t) + B(t)\,u(t),\quad x(t_0) = x_0,\quad x(t_f) = x_f,\\[1mm] \phantom{\text{subject to}}\ \ \underline u \le u(t) \le \overline u,\qquad \underline x \le x(t) \le \overline x. \end{cases}$

The state variable $x\in W^{1,2}([t_0,t_f];\mathbb{R}^n)$, with $x(t) := (x_1(t),\ldots,x_n(t))\in\mathbb{R}^n$, and the control variable $u\in L^2([t_0,t_f];\mathbb{R}^m)$, with $u(t) := (u_1(t),\ldots,u_m(t))\in\mathbb{R}^m$. The time-varying matrices $A:[t_0,t_f]\to\mathbb{R}^{n\times n}$, $B:[t_0,t_f]\to\mathbb{R}^{n\times m}$, $Q:[t_0,t_f]\to\mathbb{R}^{n\times n}$, and $R:[t_0,t_f]\to\mathbb{R}^{m\times m}$ are continuous. For every $t\in[t_0,t_f]$, the matrices $Q(t)$ and $R(t)$ are symmetric and, respectively, positive semidefinite and positive definite. For clarity of argument, these matrices are assumed to be diagonal, namely, $Q(t) := \operatorname{diag}(q_1(t),\ldots,q_n(t))$ and $R(t) := \operatorname{diag}(r_1(t),\ldots,r_m(t))$. These diagonality assumptions particularly simplify the proximal mapping expressions that appear later. The initial and terminal states are given as $x_0$ and $x_f$, respectively.

In this section, we state the maximum principle by using the direct adjoining approach from [21, Theorem 4.1]. We note that the authors of [21] designate Theorem 4.1 in their paper as an "informal theorem" since it has not been proved fully (for the case of pure state and mixed control and state constraints). However, they also point to a paper by Maurer as an exception, where Theorem 4.1 has been proved for the case of pure state and pure control constraints, which is exactly our setting. We start by defining the extended Hamiltonian function $H:\mathbb{R}^n\times\mathbb{R}^m\times\mathbb{R}^n\times\mathbb{R}^n\times\mathbb{R}^n\times[t_0,t_f]\to\mathbb{R}$ for Problem (P) as follows:

$H(x(t),u(t),\lambda(t),\mu^1(t),\mu^2(t),t) := \dfrac{\lambda_0}{2}\big(x(t)^T Q(t)\,x(t) + u(t)^T R(t)\,u(t)\big) + \lambda(t)^T\big(A(t)\,x(t) + B(t)\,u(t)\big) + \mu^1(t)^T\big(x(t) - \overline x(t)\big) + \mu^2(t)^T\big(\underline x(t) - x(t)\big),$

where the multiplier $\lambda_0 \ge 0$, the adjoint variable vector $\lambda:[t_0,t_f]\to\mathbb{R}^n$ with $\lambda(t) := (\lambda_1(t),\ldots,\lambda_n(t))\in\mathbb{R}^n$, and the state constraint multiplier vectors $\mu^1,\mu^2:[t_0,t_f]\to\mathbb{R}^n$ with $\mu^i(t) := (\mu^i_1(t),\ldots,\mu^i_n(t))\in\mathbb{R}^n$, $i=1,2$. For brevity, we use the following notation:

$H[t] := H(x(t),u(t),\lambda(t),\mu^1(t),\mu^2(t),t).$

    The adjoint variable vector is assumed to satisfy

$\dot\lambda(t) := -H_x[t] = -Q(t)\,x(t) - A(t)^T\lambda(t) - \mu^1(t) + \mu^2(t),$ (2.1)

for almost every $t\in[t_0,t_f]$, where $H_x := \partial H/\partial x$.

We assume that the problem we are solving is normal, so we set $\lambda_0 = 1$.

Maximum Principle. Suppose that the pair

$(x,u)\in W^{1,2}([t_0,t_f];\mathbb{R}^n)\times L^2([t_0,t_f];\mathbb{R}^m)$

is optimal for Problem (P). Then, there exists a piecewise absolutely continuous adjoint variable vector $\lambda$, as defined in (2.1), and piecewise continuous multipliers $\mu^1,\mu^2\in L^2([t_0,t_f];\mathbb{R}^n)$, such that, for almost every $t\in[t_0,t_f]$,

$u_i(t) = \underset{\underline u_i \le v_i \le \overline u_i}{\operatorname{argmin}}\ H(x(t),u_1(t),\ldots,v_i,\ldots,u_m(t),\lambda(t),\mu^1(t),\mu^2(t),t) = \underset{\underline u_i \le v_i \le \overline u_i}{\operatorname{argmin}}\ \frac12\,r_i(t)\,v_i^2 + \lambda(t)^T b_i(t)\,v_i,$ (2.2)

for $i=1,\ldots,m$, where $b_i(t)$ is the $i$th column of the matrix $B(t)$ and $r_i(t)$ is the $i$th diagonal element of $R(t)$. Moreover, the multipliers $\mu^1(t),\mu^2(t)$ must satisfy the complementarity conditions given by

$\mu^1_i(t) \ge 0,\qquad \mu^1_i(t)\,\big(x_i(t) - \overline x_i(t)\big) = 0,$ (2.3)
$\mu^2_i(t) \ge 0,\qquad \mu^2_i(t)\,\big(\underline x_i(t) - x_i(t)\big) = 0,$ (2.4)

for all $i=1,\ldots,n$.

Suppose that $\underline u_i = -\infty$ and $\overline u_i = \infty$, $i=1,\ldots,m$, i.e., the control vector is unconstrained. Then, (2.2) becomes

$H_{u_i}[t] = 0,$

i.e.,

$r_i(t)\,u_i(t) + b_i(t)^T\lambda(t) = 0,$ (2.5)

$i=1,\ldots,m$. Then, (2.5) can be solved for $u_i(t)$ as follows:

$u_i(t) = -\dfrac{1}{r_i(t)}\,b_i(t)^T\lambda(t),$ (2.6)

for $i=1,\ldots,m$; or, by using the matrices $B(t)$ and $R(t)$,

$u(t) = -[R(t)]^{-1}B(t)^T\lambda(t).$ (2.7)

When we consider the constraints on $u$, one gets the following from (2.2):

$u_j(t) = \begin{cases} \underline u_j, & \text{if } -\frac{1}{r_j(t)}\,b_j(t)^T\lambda(t) \le \underline u_j,\\[1mm] -\frac{1}{r_j(t)}\,b_j(t)^T\lambda(t), & \text{if } \underline u_j \le -\frac{1}{r_j(t)}\,b_j(t)^T\lambda(t) \le \overline u_j,\\[1mm] \overline u_j, & \text{if } -\frac{1}{r_j(t)}\,b_j(t)^T\lambda(t) \ge \overline u_j, \end{cases}$ (2.8)

for almost every $t\in[t_0,t_f]$, $j=1,\ldots,m$.
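Because $R(t)$ is diagonal, evaluating (2.8) amounts to clipping the unconstrained control (2.7) componentwise onto the box. A minimal sketch (the function name is ours, not from the paper):

```python
import numpy as np

def control_from_costate(lam, B, R, u_lo, u_hi):
    """Evaluate (2.8) at one time instant: clip the unconstrained
    minimizer -R^{-1} B^T lambda of (2.7) onto the control box.
    Componentwise clipping is exact here because R is diagonal."""
    u_free = -np.linalg.solve(R, B.T @ lam)   # (2.7)
    return np.clip(u_free, u_lo, u_hi)        # (2.8)
```

For a diagonal $R(t)$, each clipped component coincides with the constrained argmin in (2.2), which is a one-dimensional strictly convex quadratic over an interval.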

    In this section, we rewrite Problem (P) as the minimization of the sum of two convex functions f and g, and we give the proximal mappings for these functions in Theorems 1 and 2.

We split the constraints from (P) into two sets, $\mathcal{A}$ and $\mathcal{B}$, given as

$\mathcal{A} := \big\{(x,u)\in L^2([t_0,t_f];\mathbb{R}^n)\times L^2([t_0,t_f];\mathbb{R}^m) \ \big|\ \dot x(t) = A(t)\,x(t) + B(t)\,u(t),\ x(t_0)=x_0,\ x(t_f)=x_f,\ \text{a.e.}\ t\in[t_0,t_f]\big\},$ (3.1)
$\mathcal{B} := \big\{(x,u)\in L^2([t_0,t_f];\mathbb{R}^n)\times L^2([t_0,t_f];\mathbb{R}^m) \ \big|\ \underline x_i \le x_i(t) \le \overline x_i,\ \underline u_j \le u_j(t) \le \overline u_j,\ \text{a.e.}\ t\in[t_0,t_f],\ i=1,\ldots,n,\ j=1,\ldots,m\big\}.$ (3.2)

Despite previously defining $x\in W^{1,2}([t_0,t_f];\mathbb{R}^n)$, in our sets $\mathcal{A}$ and $\mathcal{B}$ we let $x\in L^2([t_0,t_f];\mathbb{R}^n)$ to simplify the calculation of the proximal mappings. By the definition of $W^{1,2}([t_0,t_f];\mathbb{R}^n)$, if $(x,u)\in\mathcal{A}$, then $x\in W^{1,2}([t_0,t_f];\mathbb{R}^n)$.

We assume that the control system $\dot x(t) = A(t)\,x(t) + B(t)\,u(t)$ is controllable; in other words, the control system can be driven from any $x_0$ to any other $x_f$; for a precise definition of controllability and the tests for controllability, see [29]. Then, there exists a (possibly not unique) $u(\cdot)$ such that, when this $u(\cdot)$ is substituted, the boundary-value problem given in the expression for $\mathcal{A}$ has a solution $x(\cdot)$. In other words, $\mathcal{A}\neq\emptyset$. Also, clearly, $\mathcal{B}\neq\emptyset$. We note that the constraint set $\mathcal{A}$ is an affine subspace. Given that $\mathcal{B}$ is a box, the constraints turn out to be two convex sets in Hilbert space. Since every sequence converging in $L^2$ has a subsequence converging pointwise almost everywhere, it is straightforward to see that the set $\mathcal{B}$ is closed in $L^2$. The closedness of $\mathcal{A}$ will be established later as a consequence of Theorem 2 (see Remark 2).
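For time-invariant systems, such as the examples considered later, the controllability assumption can be verified with the classical Kalman rank test; a brief sketch (the function name is ours):

```python
import numpy as np

def is_controllable(A, B):
    """Kalman rank test: a time-invariant pair (A, B) is controllable
    iff the matrix [B, AB, ..., A^{n-1}B] has full row rank n."""
    n = A.shape[0]
    ctrb = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])
    return np.linalg.matrix_rank(ctrb) == n
```

For time-varying $A(t)$, $B(t)$, more general Gramian-based tests are needed; see [29].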

Fix $\beta>0$ and let

$f(x,u) := \iota_{\mathcal{B}}(x,u) + \dfrac{\beta}{2}\displaystyle\int_{t_0}^{t_f}\big(x(t)^T Q(t)\,x(t) + u(t)^T R(t)\,u(t)\big)\,dt \quad\text{and}\quad g(x,u) := \iota_{\mathcal{A}}(x,u),$ (3.3)

where $\iota_C$ is the indicator function of the set $C$, namely,

$\iota_C(x,u) := \begin{cases} 0, & \text{if } (x,u)\in C,\\ \infty, & \text{otherwise.} \end{cases}$

We note that the DE constraint in Problem (P) is represented by the indicator function $\iota_{\mathcal{A}}(x,u)$, and the box constraints on $(x,u)$ in Problem (P) are represented by the indicator function $\iota_{\mathcal{B}}(x,u)$. Problem (P) is then equivalent to the following problem:

$\min_{x,u}\ \ f(x,u) + g(x,u).$ (3.4)

The equivalence between Problem (P) and (3.4) follows from the fact that the objective function in (3.4) is the sum of the $\beta$-weighted objective function of Problem (P) and the indicator functions of the constraints. Although the parameter $\beta$ does not change the solution of the problem, it plays a role in the performance of the algorithm we study.

In our setting, we assume that we are able to compute the projection operators $P_{\mathcal{A}}$ and $P_{\mathcal{B}}$, which project a given point onto the constraint sets $\mathcal{A}$ and $\mathcal{B}$, respectively. Recall that the proximal mapping of a functional $h$ is defined by [8, Definition 24.1]. For our setting,

$\operatorname{Prox}_h(x,u) := \underset{\substack{y\in L^2([t_0,t_f];\mathbb{R}^n)\\ v\in L^2([t_0,t_f];\mathbb{R}^m)}}{\operatorname{argmin}}\ \Big(h(y,v) + \frac12\,\|y-x\|_{L^2}^2 + \frac12\,\|v-u\|_{L^2}^2\Big),$ (3.5)

for any $(x,u)\in L^2([t_0,t_f];\mathbb{R}^n)\times L^2([t_0,t_f];\mathbb{R}^m)$. Recall that the projection $P_C(x,u)$ of a point $(x,u)$ onto $C$ is characterized by $P_C(x,u)\in C$ and, for all $(y,v)\in C$, $\big\langle (y,v)-P_C(x,u) \,\big|\, (x,u)-P_C(x,u)\big\rangle \le 0$ [8, Theorem 3.16].

Note that $\operatorname{Prox}_{\iota_C} = P_C$.

In order to implement the DR algorithm, we must write down the proximal mappings of $f$ and $g$. The proofs of Theorems 1 and 2 below follow the broad lines of the proof of Lemma 2 in [13]. In both theorems, the major difference from [13] is that the proximal operators in the current paper have two variables, $x$ and $u$. Thanks to separability, the proof of Theorem 1 is a straightforward modification of the corresponding part of the proof of [13, Lemma 2]. We include a full proof of Theorem 1 for the convenience of the reader. On the other hand, the proof of Theorem 2 deals with the solution of a more involved optimal control subproblem, namely, Problem (Pg).

Theorem 1. The proximal mapping of $f$ is given by $\operatorname{Prox}_f(x,u) = (y,v)$, where the components of $y$ and $v$ are respectively expressed as follows:

$y_i(t) = \begin{cases} \overline x_i, & \text{if } \frac{1}{\beta q_i+1}\,x_i(t) \ge \overline x_i,\\[1mm] \frac{1}{\beta q_i+1}\,x_i(t), & \text{if } \underline x_i \le \frac{1}{\beta q_i+1}\,x_i(t) \le \overline x_i,\\[1mm] \underline x_i, & \text{if } \frac{1}{\beta q_i+1}\,x_i(t) \le \underline x_i, \end{cases}$ (3.6)
$v_j(t) = \begin{cases} \overline u_j, & \text{if } \frac{1}{\beta r_j+1}\,u_j(t) \ge \overline u_j,\\[1mm] \frac{1}{\beta r_j+1}\,u_j(t), & \text{if } \underline u_j \le \frac{1}{\beta r_j+1}\,u_j(t) \le \overline u_j,\\[1mm] \underline u_j, & \text{if } \frac{1}{\beta r_j+1}\,u_j(t) \le \underline u_j, \end{cases}$ (3.7)

for all $t\in[t_0,t_f]$, $i=1,\ldots,n$, $j=1,\ldots,m$.

Proof. From (3.5) and the definition of $f$ in (3.3), we have that

$\operatorname{Prox}_f(x,u) = \underset{(y,v)}{\operatorname{argmin}}\ \iota_{\mathcal{B}}(y,v) + \frac12\int_{t_0}^{t_f}\big(\beta\,y(t)^T Q(t)\,y(t) + \|y(t)-x(t)\|^2 + \beta\,v(t)^T R(t)\,v(t) + \|v(t)-u(t)\|^2\big)\,dt.$

In other words, to find $\operatorname{Prox}_f(x,u)$, we need to find the pair $(y,v)$ that solves the following problem:

(Pf) $\quad \begin{cases} \min\ \ \frac12\displaystyle\int_{t_0}^{t_f}\big(\beta\,y(t)^T Q(t)\,y(t) + \|y(t)-x(t)\|^2 + \beta\,v(t)^T R(t)\,v(t) + \|v(t)-u(t)\|^2\big)\,dt,\\[1mm] \text{subject to}\ \ \underline x_i \le y_i(t) \le \overline x_i,\ \ \underline u_j \le v_j(t) \le \overline u_j,\ \ \text{for all } t\in[t_0,t_f],\ i=1,\ldots,n,\ j=1,\ldots,m. \end{cases}$

Problem (Pf) is separable in the variables $y$ and $v$, so we can minimize with respect to $y$ and $v$ individually, and thus solve the following two subproblems:

(Pf1) $\quad \begin{cases} \min\ \ \frac12\displaystyle\int_{t_0}^{t_f}\beta\,y(t)^T Q(t)\,y(t) + \|y(t)-x(t)\|^2\,dt,\\ \text{subject to}\ \ \underline x_i \le y_i(t) \le \overline x_i,\ \ \text{for all } t\in[t_0,t_f],\ i=1,\ldots,n, \end{cases}$

and

(Pf2) $\quad \begin{cases} \min\ \ \frac12\displaystyle\int_{t_0}^{t_f}\beta\,v(t)^T R(t)\,v(t) + \|v(t)-u(t)\|^2\,dt,\\ \text{subject to}\ \ \underline u_j \le v_j(t) \le \overline u_j,\ \ \text{for all } t\in[t_0,t_f],\ j=1,\ldots,m. \end{cases}$

The solution to Problem (Pf1) is given pointwise by

$y_i(t) = \underset{\underline x_i \le z_i \le \overline x_i}{\operatorname{argmin}}\ \big(\beta q_i z_i^2 + (z_i - x_i(t))^2\big),$

$i=1,\ldots,n$, which, after straightforward manipulations, yields (3.6). The solution to Problem (Pf2) is obtained similarly as

$v_j(t) = \underset{\underline u_j \le w_j \le \overline u_j}{\operatorname{argmin}}\ \big(\beta r_j w_j^2 + (w_j - u_j(t))^2\big),$

$j=1,\ldots,m$, which yields (3.7) after straightforward manipulations.
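On a discretization grid, the formulas (3.6) and (3.7) amount to a componentwise scale-and-clip operation. A sketch with constant diagonal weights q and r (our own transcription, not the authors' code):

```python
import numpy as np

def prox_f(x, u, q, r, beta, x_lo, x_hi, u_lo, u_hi):
    """Proximal mapping of f (Theorem 1) evaluated on sampled data.
    x: (N, n) state samples, u: (N, m) control samples;
    q, r: diagonals of Q and R, taken constant in t for this sketch."""
    y = np.clip(x / (beta * q + 1.0), x_lo, x_hi)   # (3.6)
    v = np.clip(u / (beta * r + 1.0), u_lo, u_hi)   # (3.7)
    return y, v
```

The scaling $1/(\beta q_i + 1)$ is the unconstrained minimizer of the scalar quadratic in the proof; the clip enforces the box.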

Theorem 2. The proximal mapping of $g$ is given by $\operatorname{Prox}_g(x,u) = P_{\mathcal{A}}(x,u) = (y,v)$, with

$y(t) = \tilde x(t),$ (3.8)
$v(t) = u(t) - B(t)^T\lambda(t),$ (3.9)

where $\tilde x(t)$ and $\lambda(t)$ are obtained by solving the following two-point boundary-value problem (TPBVP):

$\dot{\tilde x}(t) = A(t)\,\tilde x(t) + B(t)\,u(t) - B(t)B(t)^T\lambda(t),\quad \tilde x(t_0) = x_0,\quad \tilde x(t_f) = x_f,$
$\dot\lambda(t) = -\tilde x(t) + x(t) - A(t)^T\lambda(t).$ (3.10)

Proof. Using [8, Example 12.25] and the definition of $g$ in (3.3),

$\operatorname{Prox}_g(x,u) = \operatorname{Prox}_{\iota_{\mathcal{A}}}(x,u) = P_{\mathcal{A}}(x,u),$

which verifies the very first assertion. From (3.5) and the definition of $g$ in (3.3), we have that

$\operatorname{Prox}_g(x,u) = \underset{(y,v)}{\operatorname{argmin}}\ \iota_{\mathcal{A}}(y,v) + \frac12\int_{t_0}^{t_f}\big(\|y(t)-x(t)\|^2 + \|v(t)-u(t)\|^2\big)\,dt.$

In other words, to find $\operatorname{Prox}_g(x,u)$, we need to find the pair $(y,v)$ that solves the following problem:

(Pg) $\quad \begin{cases} \min\ \ \frac12\displaystyle\int_{t_0}^{t_f}\|y(t)-x(t)\|^2 + \|v(t)-u(t)\|^2\,dt,\\ \text{subject to}\ \ \dot y(t) = A(t)\,y(t) + B(t)\,v(t),\quad y(t_0) = x_0,\quad y(t_f) = x_f. \end{cases}$

Problem (Pg) is an optimal control problem wherein $y(t)$ is the state variable and $v(t)$ is the control variable. The Hamiltonian for Problem (Pg) is given by

$H(y(t),v(t),\lambda(t),t) := \frac12\big(\|y(t)-x(t)\|^2 + \|v(t)-u(t)\|^2\big) + \lambda(t)^T\big(A(t)\,y(t) + B(t)\,v(t)\big),$

and the associated costate equation is given by

$\dot\lambda(t) = -\partial H/\partial y = -y(t) + x(t) - A(t)^T\lambda(t).$ (3.11)

If $v$ is the optimal control for Problem (Pg), then, by the maximum principle, $H_v[t] = 0$ for all $t\in[t_0,t_f]$. In other words,

$v(t) - u(t) + B(t)^T\lambda(t) = 0,$

for all $t\in[t_0,t_f]$, a re-arrangement of which yields (3.9). Collecting together the ODE in Problem (Pg) and the ODE in (3.11), and substituting $v(t)$ from (3.9), we obtain the TPBVP in (3.10).

Remark 1. We note from Theorem 2 that $\operatorname{Prox}_g$ is the projection onto the affine set $\mathcal{A}$. Unlike $\operatorname{Prox}_f$, in general, we cannot find an analytical solution to (3.10) to obtain $\operatorname{Prox}_g$ (or the projection onto $\mathcal{A}$); for this purpose, we will propose Algorithm 2 in Section 4.1 to obtain a numerical solution.

Remark 2. From Theorem 2, we see that every pair $(x,u)\in L^2\times L^2$ has a projection onto $\mathcal{A}$. In other words, $\mathcal{A}$ is a Chebyshev set. It is well known that every Chebyshev set is closed (see [8, Remark 3.11(ⅰ)]). Hence, the set $\mathcal{A}$ is closed in the topology of $L^2\times L^2$.

The application of the DR algorithm to our problem is slightly different from that in [7,12]. Since we are solving control- and state-constrained optimal control problems, we must define the proximal mappings at the pair $(x,u)$, rather than just at $u$, as in [7,12]. Thus, in the implementation of the DR algorithm, we give the iterations for the pair $(x^k,u^k)$, rather than for $u^k$ alone.

Given $\beta>0$, we specialize the DR algorithm (see [17,18,25]) to the case of minimizing the sum of the two functions $f$ and $g$ in (3.3) and (3.4). The DR operator associated with the ordered pair $(f,g)$ is defined by

$T := \operatorname{Id} - \operatorname{Prox}_f + \operatorname{Prox}_g\big(2\operatorname{Prox}_f - \operatorname{Id}\big).$

Applied to our case, this operator reads

$T(x,u) = (x,u) - \operatorname{Prox}_f(x,u) + P_{\mathcal{A}}\big(2\operatorname{Prox}_f(x,u) - (x,u)\big),$ (4.1)

where the proximal mappings of $f$ and $g$ are as provided in Theorems 1 and 2. Let $X$ be the underlying Hilbert space of pairs $(x,u)$. Now, fix $(x^0,u^0)\in X$. Given $(x^k,u^k)\in X$, $k\ge 0$, the DR iterations are set as follows:

$(b_{x,k},b_{u,k}) := \operatorname{Prox}_f(x^k,u^k),$
$(x^{k+1},u^{k+1}) := T(x^k,u^k) = (x^k,u^k) - (b_{x,k},b_{u,k}) + P_{\mathcal{A}}\big(2\,(b_{x,k},b_{u,k}) - (x^k,u^k)\big).$

The DR algorithm is implemented as Algorithm 1. We define a new parameter $\gamma := 1/(1+\beta)$, where $\beta$ is the parameter that multiplies the objective in (3.3) and Theorem 1. The choice of $\gamma\in\,]0,1[$ can be made because changing $\beta$ does not affect the solution of Problem (P).

Algorithm 1. (DR)
Step 1. (Initialization) Choose a parameter $\gamma\in\,]0,1[$ and the initial iterate $(x^0,u^0)$ arbitrarily. Choose a small parameter $\varepsilon>0$, and set $k=0$.
Step 2. (Projection onto $\mathcal{B}$) Set $(x,u) = \gamma\,(x^k,u^k)$. Compute $(\tilde x,\tilde u) = P_{\mathcal{B}}(x,u)$ by using (3.6) and (3.7).
Step 3. (Projection onto $\mathcal{A}$) Set $(x,u) := 2\,(\tilde x,\tilde u) - (x^k,u^k)$. Compute $(\hat x,\hat u) = P_{\mathcal{A}}(x,u)$ by using (3.8) and (3.9), or Algorithm 2.
Step 4. (Update) Set $(x^{k+1},u^{k+1}) := (x^k,u^k) + (\hat x,\hat u) - (\tilde x,\tilde u)$.
Step 5. (Stopping criterion) If $\max\big(\|x^{k+1}-x^k\|_{L^\infty},\, \|u^{k+1}-u^k\|_{L^\infty}\big) \le \varepsilon$ or $k \ge 200$, then return $(\tilde x,\tilde u)$ and stop. Otherwise, set $k := k+1$ and go to Step 2.
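With the state and control stacked into one vector on a time grid, Algorithm 1 is only a few lines. A generic sketch (the operator arguments stand in for Theorems 1 and 2; all names are ours):

```python
import numpy as np

def douglas_rachford(z0, proj_B, proj_A, gamma=0.95, eps=1e-8, max_iter=200):
    """Sketch of Algorithm 1 for the stacked iterate z = (x, u).
    proj_B: box projection of Step 2 (formulas (3.6)-(3.7));
    proj_A: affine projection of Step 3 (Theorem 2 / Algorithm 2)."""
    z = np.asarray(z0, dtype=float)
    b = proj_B(gamma * z)                   # Step 2: (x~, u~) = P_B(gamma z)
    for k in range(max_iter):
        zhat = proj_A(2.0 * b - z)          # Step 3: (x^, u^)
        z_new = z + zhat - b                # Step 4: DR update
        if np.max(np.abs(z_new - z)) <= eps:
            break                           # Step 5: stopping criterion
        z = z_new
        b = proj_B(gamma * z)               # Step 2 for the next iteration
    return b                                # the feasible iterate in B
```

For two indicator functions (projections), this is the classical alternating-reflections form of DR; the returned point lies in the box, mirroring the remark below about the final iterate.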

Remark 3. We point out that, in general, only weak convergence is guaranteed for the DR algorithm (see [32, Theorem 1] or [10, Theorem 4.4]). It should be noted that the algorithm returns the final iterate $(\tilde x,\tilde u)$ of the state–control pair, which lies in the box $\mathcal{B}$.

We introduce a procedure for numerically projecting onto $\mathcal{A}$ that is an extension of Algorithm 2 from [12] to the case of LQ control problems with state and control constraints. The procedure below (Algorithm 2) can be employed in Step 3 of the DR algorithm. In the procedure, we effectively solve the TPBVP in (3.10) by implementing the standard shooting method [4,23,31]. Throughout the steps of Algorithm 2, we solve the ODEs in (3.10), rewritten here in matrix form (with the TPBVP variables carrying the superscript DR) as

$\begin{bmatrix}\dot x^{DR}(t)\\[0.5mm] \dot\lambda^{DR}(t)\end{bmatrix} = \begin{bmatrix}A(t) & -B(t)B(t)^T\\[0.5mm] -I_{n\times n} & -A(t)^T\end{bmatrix}\begin{bmatrix}x^{DR}(t)\\[0.5mm] \lambda^{DR}(t)\end{bmatrix} + \begin{bmatrix}0_{n\times n} & B(t)\\[0.5mm] I_{n\times n} & 0_{n\times m}\end{bmatrix}\begin{bmatrix}x(t)\\[0.5mm] u(t)\end{bmatrix},$ (4.2)

with various initial conditions (ICs):

(ⅰ) $\begin{bmatrix}x^{DR}(t_0)\\ \lambda^{DR}(t_0)\end{bmatrix} = \begin{bmatrix}x_0\\ 0\end{bmatrix}$,  (ⅱ) $\begin{bmatrix}x^{DR}(t_0)\\ \lambda^{DR}(t_0)\end{bmatrix} = \begin{bmatrix}x_0\\ e_i\end{bmatrix}$,  (ⅲ) $\begin{bmatrix}x^{DR}(t_0)\\ \lambda^{DR}(t_0)\end{bmatrix} = \begin{bmatrix}x_0\\ \lambda_0^{DR}\end{bmatrix}$. (4.3)

Algorithm 2. (Numerical Computation of the Projector onto $\mathcal{A}$)
Step 0. (Initialization) The following are given: the current iterate $(x,u)$, the system and control matrices $A(t)$ and $B(t)$, the numbers of state and control variables $n$ and $m$, and the initial and terminal states $x_0$ and $x_f$, respectively. Define $z(t,\lambda_0) := x^{DR}(t)$, the state component of the solution of (4.2) with $\lambda^{DR}(t_0) = \lambda_0$.
Step 1. (Near-miss function) Solve (4.2) with the ICs in (4.3)(ⅰ) to find $z(t_f,0)$. Set $\varphi(0) := z(t_f,0) - x_f$.
Step 2. (Jacobian) For $i=1,\ldots,n$, solve (4.2) with the ICs in (4.3)(ⅱ) to get $z(t_f,e_i)$. Set $\beta_i := z(t_f,e_i) - z(t_f,0)$ and $J_\varphi(0) := [\,\beta_1 \,|\, \cdots \,|\, \beta_n\,]$.
Step 3. (Missing IC) Solve $J_\varphi(0)\,\lambda_0^{DR} = -\varphi(0)$ for $\lambda_0^{DR}$.
Step 4. (Projector onto $\mathcal{A}$) Solve (4.2) with the ICs in (4.3)(ⅲ) to find $x^{DR}(t)$ and $\lambda^{DR}(t)$. Set $P_{\mathcal{A}}(x,u)(t) := (y(t),v(t))$, where $y(t) = x^{DR}(t)$ and $v(t) = u(t) - B(t)^T\lambda^{DR}(t)$.

In the above equations, we use $\lambda^{DR}$, instead of just $\lambda$, to emphasize that $\lambda^{DR}$ is the costate variable that results from solving Problem (Pg) to compute the projection onto $\mathcal{A}$ within the DR algorithm. We reiterate that Problem (Pg) is more involved than its counterpart in [13], which makes the ODE system in (4.2) more complicated than its counterpart in [12].
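Algorithm 2 can be realized compactly for constant A and B by using scipy's solve_ivp for the integrations and linear interpolation for the gridded data (x, u). This is our own sketch (not the authors' Matlab code); because the ODE is linear, the near-miss function is affine in the missing initial costate, so a single linear solve recovers it:

```python
import numpy as np
from scipy.integrate import solve_ivp

def project_onto_A(x_data, u_data, A, B, t_grid, x0, xf):
    """Shooting method of Algorithm 2 for the projection onto the set A.
    x_data: (N, n) and u_data: (N, m) sample the point being projected."""
    n, m = A.shape[0], B.shape[1]

    def rhs(t, w):                      # ODE system (4.2)
        xt, lam = w[:n], w[n:]
        xd = np.array([np.interp(t, t_grid, x_data[:, i]) for i in range(n)])
        ud = np.array([np.interp(t, t_grid, u_data[:, j]) for j in range(m)])
        return np.concatenate([A @ xt + B @ ud - B @ (B.T @ lam),
                               -xt + xd - A.T @ lam])

    def shoot(lam0):                    # integrate (4.2) from (x0, lam0)
        return solve_ivp(rhs, (t_grid[0], t_grid[-1]),
                         np.concatenate([x0, lam0]),
                         t_eval=t_grid, rtol=1e-10, atol=1e-12)

    z0 = shoot(np.zeros(n)).y[:n, -1]                 # Step 1: z(tf, 0)
    phi0 = z0 - xf                                    # near-miss function
    J = np.column_stack([shoot(e).y[:n, -1] - z0
                         for e in np.eye(n)])         # Step 2: Jacobian
    lam0 = np.linalg.solve(J, -phi0)                  # Step 3: missing IC
    sol = shoot(lam0)                                 # Step 4
    y = sol.y[:n, :].T                                # projected state
    v = u_data - sol.y[n:, :].T @ B                   # v = u - B^T lambda_DR
    return y, v
```

For a single integrator (n = m = 1, A = 0, B = 1) with zero data, x0 = 0, and xf = 1, the projected state works out analytically to y(t) = sinh(t)/sinh(1), which the sketch reproduces to integrator accuracy.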

    Recall that the optimal control for Problem (P) is given by the cases in (2.8).

A junction time $\bar t_j$ is a time at which the control $u_j(t)$ falls into two cases of (2.8) simultaneously, i.e., a point in time at which a control constraint transitions from "active" to "inactive", or vice versa. This definition of a junction time becomes important in the following conjecture, which has been formulated and tested by means of extensive numerical experiments.

Conjecture 1. Let $\lambda^{DR}(t)$ be the costate variable emerging from the projection onto $\mathcal{A}$ as computed in Algorithm 2, and let $\lambda(t)$ be the costate variable associated with Problem (P). Let $\bar t_j$ be a junction time for some $u_j$, $j=1,\ldots,m$, such that $b_j(\bar t_j)^T\lambda^{DR}(\bar t_j) \neq 0$. Define

$\alpha := -\dfrac{r_j(\bar t_j)\,u_j(\bar t_j)}{b_j(\bar t_j)^T\lambda^{DR}(\bar t_j)}.$

Then,

$\lambda(t) = \alpha\,\lambda^{DR}(t).$ (4.4)

    Remark 4. The ability to obtain the costate variable by Conjecture 1 is desirable as a tool for checking that the necessary condition of optimality in (2.8) is satisfied. Without this conjecture, we are unable to verify whether the optimality condition is satisfied when using the DR algorithm, except when a dual version of the DR algorithm is employed, as in [13].

Once we have calculated $\lambda$ in this way, we can also find a multiplier $\mu^k$, $k=1,2$, numerically, for the case in which only one state constraint is active at a given time. Suppose that only the $i$th state box constraint becomes active. By rearranging (2.1), using numerical differentiation to find $\dot\lambda$, and assuming $\mu^2_i(t)=0$, we have that

$\mu^1_i(t) = \big({-Q(t)\,x(t) - A(t)^T\lambda(t) - \dot\lambda(t)}\big)_i.$ (4.5)

If, instead, $\mu^1_i(t)=0$, then we compute the following:

$\mu^2_i(t) = \big(Q(t)\,x(t) + A(t)^T\lambda(t) + \dot\lambda(t)\big)_i.$ (4.6)

With (4.5) or (4.6), the complementarity conditions in (2.3) or (2.4) can now be checked numerically.
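The rearrangement above is easy to carry out on a grid, with $\dot\lambda$ approximated by finite differences; a sketch for constant Q and A under the sign convention of the adjoint equation (2.1) (the function name is ours):

```python
import numpy as np

def upper_multiplier(t, x, lam, Q, A):
    """Recover mu^1 from the adjoint equation (2.1) with mu^2 = 0:
    mu^1 = -Q x - A^T lam - lam_dot, evaluated row-by-row on the grid.
    t: (N,); x, lam: (N, n); Q, A: (n, n), constant here."""
    lam_dot = np.gradient(lam, t, axis=0)      # finite-difference lambda-dot
    return -(x @ Q.T) - (lam @ A) - lam_dot    # rows give mu^1(t)^T
```

Negating the result gives the lower-bound multiplier, mirroring the relation between the two displayed formulas; the complementarity conditions (2.3)–(2.4) can then be checked componentwise.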

    We will now introduce two example problems. Along with posing the optimal control problems, we also give plots of their optimal controls, states, costates, and multipliers, with vertical lines signifying the regions in which the state constraints become active.

Problem (PHO) below contains the dynamics of a harmonic oscillator, which is typically used to model a point mass with a spring. The dynamics are given by $\ddot y(t) + \omega_0^2\,y(t) = f(t)$, where $\omega_0>0$ is known as the natural frequency and $f(t)$ is some forcing. In a physical system, $y$ represents the position of a unit mass, $\dot y$ is the velocity of said mass, the natural frequency is defined as $\omega_0 = \sqrt{k}$, where $k$ is the stiffness of the spring producing the harmonic motion, and $f$ is the restoring force. In addition to the restoring force, we will introduce another force, $u_1$, that affects the velocity $\dot y$ directly. We let $x_1 := y$, $x_2 := \dot y$, and $u_2 := f$ to arrive at $\dot x_1(t) = x_2(t) + u_1(t)$, $\dot x_2(t) = -\omega_0^2\,x_1(t) + u_2(t)$.

In this example problem, the objective contains the sum of squares of all four variables in the system. It is common to consider this problem with the objective of minimizing the energy of the control variable, but, in this case, we have also included the state variables to test the algorithm with a slightly more involved objective. The focus of this research is control- and state-constrained problems, so the constraints have been added as in Problem (P):

(PHO) $\quad \begin{cases} \min\ \ \frac12\displaystyle\int_0^{2\pi} x_1^2(t) + x_2^2(t) + u_1^2(t) + u_2^2(t)\,dt,\\[1mm] \text{subject to}\ \ \dot x_1(t) = x_2(t) + u_1(t),\quad x_1(0) = 0,\quad x_1(2\pi) = 0,\\[1mm] \phantom{\text{subject to}}\ \ \dot x_2(t) = -4\,x_1(t) + u_2(t),\quad x_2(0) = 1,\quad x_2(2\pi) = 0,\\[1mm] \phantom{\text{subject to}}\ \ \underline u \le u(t) \le \overline u,\qquad \underline x \le x(t) \le \overline x. \end{cases}$

The simple spring–mass system is another physical system that can be easily visualized; see [30]. This problem contains two masses and two springs connected in sequence, with the dynamics given by $m_1\ddot y_1(t) + (k_1+k_2)\,y_1(t) - k_2\,y_2(t) = f_1(t)$ and $m_2\ddot y_2(t) - k_2\,y_1(t) + k_2\,y_2(t) = f_2(t)$, where $m_1,m_2$ are the two masses, $k_1,k_2$ are the spring coefficients (stiffnesses), and $f_1(t),f_2(t)$ are the forces applied to $m_1,m_2$, respectively. Let $x_1 := y_1$, $x_2 := \dot y_1$, $x_3 := y_2$, $x_4 := \dot y_2$, $u_1 := f_1$, $u_2 := f_2$, $m_1 = m_2 = 1$, $k_1 = 1$, and $k_2 = 2$; then, we retrieve the system in Problem (PSM). This dynamical system furnishes an optimal control problem with four state variables and two control variables. As in (PHO), we have added state and control constraints and set the objective as the integral of the sum of squares of the state and control variables:

(PSM) $\quad \begin{cases} \min\ \ \frac12\displaystyle\int_0^{2\pi} x_1^2(t)+x_2^2(t)+x_3^2(t)+x_4^2(t)+u_1^2(t)+u_2^2(t)\,dt,\\[1mm] \text{subject to}\ \ \dot x_1(t) = x_2(t),\quad x_1(0)=0,\quad x_1(2\pi)=0,\\ \phantom{\text{subject to}}\ \ \dot x_2(t) = -3\,x_1(t) + 2\,x_3(t) + u_1(t),\quad x_2(0)=1,\quad x_2(2\pi)=0,\\ \phantom{\text{subject to}}\ \ \dot x_3(t) = x_4(t),\quad x_3(0)=1,\quad x_3(2\pi)=0,\\ \phantom{\text{subject to}}\ \ \dot x_4(t) = 2\,x_1(t) - 2\,x_3(t) + u_2(t),\quad x_4(0)=1,\quad x_4(2\pi)=0,\\ \phantom{\text{subject to}}\ \ \underline u \le u(t) \le \overline u,\qquad \underline x \le x(t) \le \overline x. \end{cases}$
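In state-space form, the two examples read $\dot x = Ax + Bu$ with the matrices below (our transcription of the dynamics above, for experimentation); a quick Kalman rank computation confirms the controllability assumed in Section 3:

```python
import numpy as np

# (PHO): x = (y, y-dot), omega_0^2 = 4; u1 enters the first equation,
# u2 the second.
A_HO = np.array([[0., 1.],
                 [-4., 0.]])
B_HO = np.eye(2)

# (PSM): x = (y1, y1-dot, y2, y2-dot), with m1 = m2 = 1, k1 = 1, k2 = 2.
A_SM = np.array([[0., 1., 0., 0.],
                 [-3., 0., 2., 0.],
                 [0., 0., 0., 1.],
                 [2., 0., -2., 0.]])
B_SM = np.array([[0., 0.],
                 [1., 0.],
                 [0., 0.],
                 [0., 1.]])

def ctrb_rank(A, B):
    """Rank of the Kalman controllability matrix [B, AB, ..., A^{n-1}B]."""
    n = A.shape[0]
    return np.linalg.matrix_rank(
        np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)]))
```

Both pairs have full controllability rank (2 and 4, respectively), so the affine set of feasible trajectories is nonempty for any choice of boundary states.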

Technical specifications. In the numerical experiments, we used Matlab version 2023b for the DR algorithm and compared its performance with that of the AMPL–Ipopt optimization computational suite [19,33], with Ipopt version 3.12.13. We chose to make comparisons with Ipopt since it is a free and easily obtainable solver that is used for problems such as those presented in this paper (also see [5]). All computations applied $\varepsilon = 10^{-8}$ in Algorithm 1 or, in the case of Ipopt, tol $= 10^{-8}$, and were run on a PC with an i5-10500T 2.30 GHz processor and 8 GB of RAM. For the two examples (PHO) and (PSM), we experimented with a case with only control constraints and a case in which a state constraint was also added. The problem specifications can be found in Table 1, along with the choices of $\gamma$ used in the implementation of the DR algorithm.

    Table 1.  Problem cases.

    Problem   Case     $\gamma$   Constraints
    (PHO)     Case 1   0.60       $-0.4\le u_1(t)\le 0.1$, $-0.5\le u_2(t)\le 0.1$
              Case 2   0.95       $-0.4\le u_1(t)\le 0.1$, $-0.5\le u_2(t)\le 0.1$, $-0.025\le x_1(t)$
    (PSM)     Case 1   0.55       $-0.5\le u_1(t)\le 0.5$, $-0.4\le u_2(t)\le 0.4$
              Case 2   0.95       $-0.5\le u_1(t)\le 0.5$, $-0.4\le u_2(t)\le 0.4$, $-0.2\le x_1(t)$


    Generation of "true" solutions. Since we did not have analytical solutions to these problems, we generated higher-accuracy numerical solutions that we use as "true" solutions in our error calculations. In the Case 1 examples (only control constraints present), the DR algorithm was able to converge to an acceptable solution by using $N=10^7$ and $\epsilon=10^{-12}$, without reaching 200 iterations (the maximum number of iterations we allowed). The 8 GB of RAM provided insufficient memory for Ipopt to generate a solution with such a high number of discretization points, so the DR algorithm was used to generate the "true" solutions in this case. In Case 2 (the state-constrained case), the DR algorithm was unable to converge to an acceptable solution in fewer than 200 iterations, so we instead relied on Ipopt. Due to the memory limitations, the "true" solution for (PHO) Case 2 was generated with $N=10^6$, $\mathrm{tol}=10^{-12}$, and that for (PSM) Case 2 with $N=7\times10^5$, $\mathrm{tol}=10^{-12}$.
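The errors reported below are, in essence, maximum absolute deviations of a coarse-grid solution from such a high-accuracy reference. A minimal Python sketch of this kind of error calculation, assuming the reference is linearly interpolated onto the coarser grid (the function name `linf_error` is ours, not the paper's):

```python
import numpy as np

def linf_error(t_coarse, x_coarse, t_true, x_true):
    """L-infinity error of a coarse-grid solution against a reference
    ("true") solution on a much finer grid, compared at the coarse nodes
    via linear interpolation of the reference."""
    x_ref = np.interp(t_coarse, t_true, x_true)
    return np.max(np.abs(x_coarse - x_ref))

# Toy check: a coarse sample of sin(t) against a fine reference on [0, 2*pi].
t_true = np.linspace(0.0, 2.0 * np.pi, 10**6 + 1)
t_coarse = np.linspace(0.0, 2.0 * np.pi, 10**3 + 1)
err = linf_error(t_coarse, np.sin(t_coarse), t_true, np.sin(t_true))
```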

    Choice of an algorithmic parameter. The values of $\gamma$ in Table 1 were decided by generating plots of the number of iterations required for the DR algorithm to find an acceptable solution for 500 values of $\gamma$ in the interval $(0,1)$. For both (PHO) and (PSM) in Case 2, the DR algorithm required more than the maximum of 200 iterations to converge for all values of $\gamma$ that were tested. For these cases, we instead generated plots comparing the errors in the states and controls for the different values of $\gamma$. A specific value of $\gamma$ providing the smallest errors was not obvious, since many values gave similar performance; however, values closer to 1 appeared preferable, so we chose $\gamma=0.95$ for these experiments.
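The parameter sweep described above can be sketched generically as follows. Here `dr_solve` is a hypothetical stand-in for Algorithm 1 that takes a value of $\gamma$ and an iteration cap and returns the number of iterations used (returning more than the cap signals non-convergence); the toy model below only illustrates the sweep logic.

```python
import numpy as np

def sweep_gamma(dr_solve, n_gamma=500, max_iter=200):
    """Scan gamma over the interior of (0, 1) and record the iteration
    count each run needs; return the best converged gamma (or None if,
    as in the Case 2 examples, no tested gamma converges within the cap)."""
    gammas = np.linspace(0.0, 1.0, n_gamma + 2)[1:-1]  # interior points only
    iters = np.array([dr_solve(g, max_iter) for g in gammas])
    converged = iters <= max_iter
    if not converged.any():
        return None, gammas, iters
    best = np.argmin(np.where(converged, iters, np.inf))
    return gammas[best], gammas, iters

# Toy model of iteration counts: a smooth dip with a minimum near gamma = 0.6.
toy = lambda g, cap: min(cap + 1.0, 50.0 + 400.0 * (g - 0.6) ** 2)
best_gamma, gammas, iters = sweep_gamma(toy)
```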

    Graphical results. In Figures 1 and 2, we have plots for (PHO) and (PSM) using the DR algorithm. We have generated similar figures based on the results of Ipopt as well but, since they mostly overlap Figures 1 and 2, we will not show them all here but will point out the differences. In Figure 3, we give the components of the multiplier vector $\mu_2$ for Case 2 (PHO) as obtained by using the DR algorithm and Ipopt. As indicated by the black vertical lines in the bottom-left plot of Figure 1, when using the DR algorithm, the constraint on $x_1$ is active over a time interval that aligns with the interval in which $\mu_{21}$ is positive, as expected from (2.3) and (2.4).

    Figure 1.  (PHO) Case 2 plots (see Table 1) using the DR algorithm with $N=10^3$, $-0.025\le x_1(t)$. Vertical lines indicate the interval in which the state constraint becomes active.
    Figure 2.  (PSM) Case 2 plots (see Table 1) using the DR algorithm with $N=10^3$, $-0.2\le x_1(t)$. Vertical lines indicate the interval in which the state constraint becomes active.
    Figure 3.  Multipliers $\mu_{21},\mu_{22}$ for Case 2 (PHO) with $N=10^3$, as obtained by using the DR algorithm (solid lines) and Ipopt (dotted lines). Note that $\mu_{21}(t)$ found by Ipopt attained a maximum value of 33, which is not shown in the graph.

    In the results from Ipopt, we observe that the state constraint became active only at a single point in time, yet Figure 3 shows that $\mu_{21}$ from Ipopt (yellow dotted line) is positive over a larger interval of time; thus, (2.4) is violated. This appears to be a numerical error that is not present in the DR results, since there the state constraint $-0.025\le x_1(t)$ becomes active on an interval of points around the spike. When $N=10^4,10^5$, for (PHO), we observe that, when using Ipopt, the lower bound on $x_1$ is never reached, though, again, we see an interval of points that are almost equal to the lower bound.

    In Figure 3, we also see a discrepancy in $\mu_{22}$, since we have not imposed a constraint on the variable $x_2$. Equation (2.4) implies that $\mu_{22}(t)=0$ for all $t\in[t_0,t_f]$. We can see in Figure 3 that $\mu_{22}$ from Ipopt (purple dotted line) fails to satisfy this requirement, while $\mu_{22}$ from the DR algorithm is, at least to the eye, equal to zero.
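The check invoked here, that a multiplier must vanish wherever its constraint is inactive, is easy to automate. The sketch below is an illustration under our own assumptions (the tolerance and the function name `complementarity_violation` are our choices, not the paper's):

```python
import numpy as np

def complementarity_violation(x, mu, lower_bound, tol=1e-6):
    """Largest |mu| on the set where the state constraint lower_bound <= x
    is inactive. By the complementarity condition this should be
    (numerically) zero; tol decides when the constraint counts as active."""
    inactive = x > lower_bound + tol
    return np.max(np.abs(mu[inactive])) if inactive.any() else 0.0

# Toy data: constraint -0.025 <= x active on a middle interval, where
# alone the multiplier is allowed to be positive.
t = np.linspace(0.0, 2.0 * np.pi, 1001)
x = np.maximum(-0.025, -0.05 * np.sin(t))       # trajectory clipped at the bound
mu = np.where(x <= -0.025 + 1e-12, 0.7, 0.0)    # multiplier positive only there
violation = complementarity_violation(x, mu, -0.025)
```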

    Another difference between the multipliers $\mu$ from the DR algorithm and Ipopt can be seen in the maximum values reached by the functions. In Figure 3, we see that $\mu_{21}$ obtained via the DR algorithm and Ipopt have similar shapes in their plots, but the maximum value reached by using the DR algorithm is approximately 0.7, while, from Ipopt, the maximum value is approximately 33. For (PSM) Case 2 with $N=10^3$, the maximum value for $\mu_{21}$ that was obtained by using the DR algorithm was approximately 16, while Ipopt yielded a maximum value of approximately 310. Along with the functions having very different maximum values, we noted that, when generating these plots for $N=10^3,10^4,10^5$, the results from the DR algorithm were clearly converging to a single function $\mu_{21}$, while this was not obvious from the Ipopt results. For (PSM), the approximate maximum values of $\mu_{21}$ that were obtained when using Ipopt were 310 for $N=10^3$, 2500 for $N=10^4$, and 3000 for $N=10^5$. In the same example, using the DR algorithm, the maximum values of the function were 16 for $N=10^3$, 14 for $N=10^4$, and 14 for $N=10^5$.

    We also observed some slight variation between the plots obtained using the DR algorithm and Ipopt at the junction times in the control variables (not shown in this paper in order to avoid an excessive amount of visual material). The intervals in which the control variables reached their bounds via the DR algorithm always appeared slightly larger than those obtained from Ipopt. At the junctions where the control variables transition between active and inactive constraints, the Ipopt solutions appeared more rounded at the corners, while the DR solutions exhibited a sharper transition.

    Errors and CPU times. Table 2 contains the errors in the controls, states, and costates for the DR algorithm and Ipopt, while Table 3 contains the errors in the multipliers, the errors in the objective values, and the CPU times. The CPU times were computed as averages over 200 runs (up to 1,000 runs for the faster examples). The values within boxes are the smaller of the errors and CPU times between the DR algorithm and Ipopt. At a glance, we can see that, more often than not, the DR algorithm produced smaller errors and faster CPU times than Ipopt. Upon closer inspection, we see that, in many of the Case 2 results, the errors from the DR algorithm and Ipopt are comparable. We note that the "true" solutions used to calculate these errors are those explained earlier in this subsection, except for the multipliers $\mu_2$ in the Case 1 examples; in those examples, we used the zero vector as the "true" solution.

    Table 2.  Errors in the controls $u$, states $x$, and costates $\lambda$ for the DR algorithm and AMPL–Ipopt, with $\epsilon=10^{-8}$ and specifications from Table 1.

                             $L^\infty$ error in controls             $L^\infty$ error in states               $L^\infty$ error in costates
    $N$      Problem         DR                   Ipopt               DR                   Ipopt               DR                   Ipopt
    $10^3$   (PHO)-Case 1    $7.9\times10^{-3}$   $1.4\times10^{-2}$  $6.9\times10^{-3}$   $2.7\times10^{-3}$  $6.3\times10^{-3}$   $1.4\times10^{-2}$
             (PHO)-Case 2    $1.4\times10^{-2}$   $1.6\times10^{-2}$  $4.5\times10^{-3}$   $2.9\times10^{-3}$  $1.4\times10^{-2}$   $1.6\times10^{-2}$
             (PSM)-Case 1    $2.3\times10^{-2}$   $3.8\times10^{-2}$  $2.0\times10^{-2}$   $1.8\times10^{-2}$  $5.5\times10^{-2}$   $1.5\times10^{-1}$
             (PSM)-Case 2    $7.1\times10^{-2}$   $1.1\times10^{-1}$  $3.8\times10^{-1}$   $3.7\times10^{-1}$  $8.4\times10^{-1}$   $1.5\times10^{0}$
    $10^4$   (PHO)-Case 1    $7.8\times10^{-4}$   $3.4\times10^{-3}$  $6.7\times10^{-4}$   $3.6\times10^{-4}$  $7.5\times10^{-4}$   $1.6\times10^{-3}$
             (PHO)-Case 2    $1.3\times10^{-3}$   $3.0\times10^{-3}$  $4.7\times10^{-4}$   $4.2\times10^{-4}$  $5.0\times10^{-3}$   $4.8\times10^{-3}$
             (PSM)-Case 1    $2.2\times10^{-3}$   $3.6\times10^{-3}$  $2.0\times10^{-3}$   $1.8\times10^{-3}$  $4.8\times10^{-3}$   $1.5\times10^{-2}$
             (PSM)-Case 2    $2.4\times10^{-2}$   $1.1\times10^{-2}$  $3.7\times10^{-1}$   $3.7\times10^{-1}$  $7.9\times10^{-1}$   $1.5\times10^{0}$
    $10^5$   (PHO)-Case 1    $7.7\times10^{-5}$   $6.3\times10^{-3}$  $6.7\times10^{-5}$   $9.5\times10^{-4}$  $6.5\times10^{-5}$   $2.1\times10^{-3}$
             (PHO)-Case 2    $7.8\times10^{-4}$   $7.5\times10^{-3}$  $6.4\times10^{-5}$   $1.4\times10^{-3}$  $5.7\times10^{-3}$   $7.3\times10^{-3}$
             (PSM)-Case 1    $2.2\times10^{-4}$   $1.1\times10^{-2}$  $2.0\times10^{-4}$   $6.8\times10^{-4}$  $4.3\times10^{-4}$   $4.0\times10^{-3}$
             (PSM)-Case 2    $2.6\times10^{-2}$   $6.0\times10^{-3}$  $3.7\times10^{-1}$   $3.7\times10^{-1}$  $8.0\times10^{-1}$   $7.9\times10^{-1}$

    Table 3.  Errors in the multipliers $\mu$, errors in the objective values, and CPU times for the DR algorithm and AMPL–Ipopt, with $\epsilon=10^{-8}$ and specifications from Table 1.

                             $L^\infty$ error in $\mu$                $L^\infty$ error in objective values     CPU time [s]
    $N$      Problem         DR                   Ipopt               DR                   Ipopt               DR                   Ipopt
    $10^3$   (PHO)-Case 1    $4.9\times10^{-3}$   $3.1\times10^{-2}$  $2.9\times10^{-3}$   $4.8\times10^{-3}$  $4.8\times10^{-2}$   $2.9\times10^{-1}$
             (PHO)-Case 2    $5.4\times10^{-1}$   $1.5\times10^{0}$   $2.9\times10^{-3}$   $4.9\times10^{-3}$  $3.3\times10^{-1}$   $3.7\times10^{-1}$
             (PSM)-Case 1    $2.4\times10^{-3}$   $8.6\times10^{-2}$  $4.8\times10^{-2}$   $5.1\times10^{-2}$  $1.0\times10^{-1}$   $4.4\times10^{-1}$
             (PSM)-Case 2    $1.6\times10^{-1}$   $3.1\times10^{0}$   $6.8\times10^{-2}$   $7.4\times10^{-2}$  $5.2\times10^{-1}$   $5.3\times10^{-1}$
    $10^4$   (PHO)-Case 1    $1.5\times10^{-4}$   $3.0\times10^{-3}$  $2.8\times10^{-4}$   $4.8\times10^{-4}$  $4.7\times10^{-1}$   $2.7\times10^{0}$
             (PHO)-Case 2    $1.7\times10^{-1}$   $1.7\times10^{-1}$  $2.8\times10^{-4}$   $4.9\times10^{-4}$  $3.5\times10^{0}$    $3.3\times10^{0}$
             (PSM)-Case 1    $7.1\times10^{-4}$   $8.2\times10^{-3}$  $4.6\times10^{-3}$   $5.0\times10^{-3}$  $1.1\times10^{0}$    $4.6\times10^{0}$
             (PSM)-Case 2    $1.4\times10^{-1}$   $2.5\times10^{0}$   $4.4\times10^{-3}$   $6.9\times10^{-3}$  $5.9\times10^{0}$    $6.4\times10^{0}$
    $10^5$   (PHO)-Case 1    $3.7\times10^{-5}$   $3.0\times10^{-4}$  $2.8\times10^{-5}$   $7.7\times10^{-5}$  $5.1\times10^{0}$    $2.6\times10^{1}$
             (PHO)-Case 2    $1.8\times10^{-1}$   $1.8\times10^{-1}$  $2.4\times10^{-5}$   $1.4\times10^{-4}$  $3.7\times10^{1}$    $3.1\times10^{1}$
             (PSM)-Case 1    $1.1\times10^{-4}$   $8.2\times10^{-4}$  $4.5\times10^{-4}$   $5.3\times10^{-4}$  $1.2\times10^{1}$    $1.2\times10^{2}$
             (PSM)-Case 2    $2.2\times10^{-2}$   $3.7\times10^{-3}$  $1.5\times10^{-3}$   $7.2\times10^{-4}$  $6.6\times10^{1}$    $1.6\times10^{2}$


    We observe that most of the results in Table 2 show a smaller error in the control variables for the DR algorithm, especially in the Case 1 examples, where there are no state constraints. Regarding the state variables, we see that Ipopt has smaller errors when $N=10^3,10^4$, though there is little difference from the DR algorithm results, and the DR algorithm improves when $N=10^5$. As with the error in the control variables, the error in the costates is smaller for the DR algorithm in almost all examples.

    From Table 3, the errors in the multipliers show an improvement in the DR results relative to Ipopt, though the "true" solution in the Case 2 examples was obtained by using results from Ipopt, which, as previously mentioned, did not appear to converge to a specific function as we increased $N$; thus, the quality of this "true" solution is not guaranteed. The DR algorithm produced objective values closer to the "true" objective values in almost all cases, though the difference compared with Ipopt was marginal. The CPU times were faster for the DR algorithm, especially in the examples where $N=10^5$.

    We have applied the DR algorithm to find numerical solutions of LQ control problems with state and control constraints, after reformulating these problems as the minimization of the sum of two convex functions and deriving expressions for the proximal mappings of these functions (Theorems 1 and 2). These proximal mappings were used in the DR iterations. Within the DR algorithm (Algorithm 1), we proposed a procedure (Algorithm 2) for numerically finding the projection onto the affine set defined by the ODEs.

    We carried out extensive numerical experiments on two nontrivial example problems and illustrated both the performance of the algorithm and its efficiency (in both accuracy and speed) compared to the traditional approach of direct discretization. We observed that, in general, the DR algorithm produced smaller errors and faster run times for these problems, most notably in the examples where we have increased the number of discretization points. From these numerical results, the DR algorithm could, in general, be recommended over Ipopt when high-quality solutions are desired.

    Based on further extensive experiments, we made a conjecture on how the costate variables can be determined. We successfully used the costate variables constructed as in the conjecture, as well as the state-constraint multipliers that can be calculated by using these costate variables, for the numerical verification of the optimality conditions.

    We recall that Algorithm 2 involves the repeated numerical solution of the ODEs in (4.2) with various initial conditions. To solve (4.2), we implemented the (explicit) Euler method, which requires only a continuous right-hand side of the ODEs. Algorithm 2 appears to be successful for our examples partly because, in these examples, the optimal control is continuous. We tried to apply our approach to the machine tool manipulator problem from [16], which has seven state variables, one control variable, and upper and lower bounds imposed on the single control variable and one of the state variables. However, our approach has not (so far) yielded a solution for this problem, conceivably because the optimal control variable, as well as the optimal costate variable vector, is not continuous: these variables jump a number of times during the process (see [16, Figure 5]). Discontinuities in the control make the right-hand side of (4.2) discontinuous, rendering Euler's method ineffective. Therefore, problems of the kind in [16] require further investigation.
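The explicit Euler method only evaluates the right-hand side pointwise, which is why continuity suffices. A generic sketch (illustrative only; `euler` is not the paper's code), applied here to the uncontrolled (PHO) dynamics $\dot{x}_1=x_2$, $\dot{x}_2=-4x_1$, whose exact solution for $x(0)=(0,1)$ is $x_1(t)=\tfrac{1}{2}\sin 2t$:

```python
import numpy as np

def euler(f, x0, t0, tf, n):
    """Explicit Euler for x' = f(t, x) on [t0, tf] with n steps.
    Needs only pointwise evaluation (continuity) of the right-hand side f."""
    t = np.linspace(t0, tf, n + 1)
    h = (tf - t0) / n
    x = np.empty((n + 1, np.size(x0)))
    x[0] = x0
    for k in range(n):
        x[k + 1] = x[k] + h * f(t[k], x[k])
    return t, x

# Uncontrolled harmonic-oscillator dynamics of (PHO): x1' = x2, x2' = -4*x1.
f = lambda t, x: np.array([x[1], -4.0 * x[0]])
t, x = euler(f, np.array([0.0, 1.0]), 0.0, 2.0 * np.pi, 10**5)
```

With $10^5$ steps the first-order global error is small but visible; halving the step size roughly halves it, which is the expected behaviour for a continuous right-hand side.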

    We believe that our approach can be extended to more general convex optimal control problems, such as those with a non-quadratic objective function or mixed state and control constraints, as long as the pertaining proximal operators are not too challenging to derive.

    It would also be interesting to employ and test, in the future, other projection-type methods, such as the Peaceman–Rachford algorithm [8, Section 26.4 and Proposition 28.8], which, to the best of the authors' knowledge, has not been applied to optimal control problems.

    The authors declare that they have not used artificial intelligence tools in the creation of this article.

    The authors thank the two anonymous reviewers whose comments and suggestions improved the paper. BIC was supported by an Australian government research training program scholarship.

    The authors have no competing interests to declare that are relevant to the content of this article.



  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)