1.
Introduction
The applicability of dynamic programming results for optimal control and for optimal growth problems [1,2,3,4], in which fixed-point theory-based methods were used, is well-known. In this article we investigate the application of the Ćirić fixed-point theorem to prove the existence and uniqueness results of the solution of the dynamic programming Bellman's equation under assumptions that are significantly weaker than the ones generally considered in the specialty literature [5].
Our work afforded us to further extend the topic of an already relevant class of methods to solve optimization problems (notably optimal control, respectively dynamic programming), that have a long study in the history of mathematical techniques.
Discrete Markov decision models for the financial field may be studied using the dynamic programming principles developed by Richard Bellman (1956). Dynamic programming is based on the Principle of Optimality, which was explained by Bellman in the following text: "An optimal policy has the property that, whatever the initial state and decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision" [6].
For the simulation of the mathematical model, Ricard Torres published in 2014 a MATLAB script implementing policy iteration [7]. Torres' result provides insight to how we can programmatically find the optimal policies, by developing a way to programmatically organize the Dynamic Programming problem into a cost network.
Dynamic programming and fixed point theory are important mathematical tools in the financial and economic fields (for modern financial, respectively econometric modulization). The book by Stokey and Lucas [8] published in 1989 is an example of an important book in this direction. Numerical methods to solve dynamic programming (DP) models can be DP models with sequential decision making, the optimal inventory model (Arrow et al. [9]), the optimal investment model (Lucas and Prescott [10]), the optimal growth model under uncertainty (Brock and Mirman [11]), the asset pricing models (Lucas [12] and Brock [13]), the business cycle model (Kydland and Prescott [14]) or DP formulated as a Markov decision process (MDP) (Androulakis [15]).
Blackwell (1965) and Denardo (1967) showed that the Bellman operator is a contraction mapping. Kamihigashi [16] showed that the Bellman operator has a unique fixed point, and this fixed point is a value function. The consistent Bellman operator (Bellemare et al. [17]) is a modified operator that addresses the problem of inconsistency of the optimal action-value functions for suboptimal actions. The distributional Bellman operator (Bellemare et al. [18]) enables operation of the whole return distribution, instead of its expectation, i.e., the value function (Bellemare et al. [19], Bellman [20]). The logistic Bellman operator uses a logistic loss to solve a convex linear programming problem to find optimal value functions (Bas-Serrano et al. [21]).
The Markov decision process (MDP) is a fundamental framework for stochastic games, control design in stochastic environments, reinforcement learning, etc. (Filar and Vrieze [22]). Relying on the fact that the optimal value function is the fixed point of the Bellman operator, dynamic programming methods iteratively use different types of the Bellman operator to converge to the optimal value function (Puterman [23], Alden and Smith [24]).
The literature on infinite horizon optimization and its applications is rich. For example, in 1989, Schochetman and Smith [25] considered the general problem of choosing a discounted cost. They solved examples for equipment replacement and production planning, by applying the study of the minimizing infinite sequence of decisions from a closed subset of the product space formed by a sequence of arbitrary compact metric spaces.
Modeling for optimizing production was analyzed in 1955 by Chanes et al. [26] and by Modigliani and Hohn [27], and sequential production planning was described in an article from the year 1957 (Johnson [28]).
The existence of forecast horizons in undiscounted discrete-time lot size models was investigated in 1990 by Chand et al. [29].
The literature on infinite horizon optimization is vast and encompasses diverse fields, including finance, electrical engineering (especially control systems), economics, operation research, management science, statistics and mathematics (Sethi and Thompson [30], Denardo [31], Cheevaprawatdomrong [32], Bes and Sethi [33]).
The planning of the horizon model of cash management was investigated by Sethi in 1971 (Sethi [34]), and optimal backlogging over an infinite horizon under time-varying convex production and inventory costs was studied by Ghate and Smith in an article from the year 2009 (Ghate and Smith [35]).
The use of mathematics within the field of finance has been increasing, for analyzing and solving problems such as development [1], control theory, differential game theory, capital asset pricing model from stochastic optimum [37], risk management, derivative security pricing and valuation, portofolio creation and structuring [38], efficiency quantification of capital markets [39], quantitative investing strategies, models of growth, exchange economy and effects of mergers on consumers on both sides of the market [40], etc.
Fixed point theory plays an important role in topology, nonlinear analysis, dynamic optimization and obtaining results in the theory of differential and integral equations, notably in the existence of differential and integral equations or inclusions. These results are essential in many branches of science, economics, management and finance; thus, its increasing use in control and optimization is not surprising [1,41,42].
In [43], Richard Bellman introduced dynamic programming, which is typically useful to investigate problems that involve choices to be made over an infinite number of periods, such as the problem that considers a periodic wage offer for a worker. The acceptance implies receiving this wage in all future periods while the rejection implies receiving a new wage offer in the next period. Another example that is increasingly important concerns choosing how to allocate output between consumption and investment, that is, the consumption yields utility in the current period, while investment increases future production.
Based on the work of Torres, in this article, we will develop tools that allow us to solve the stationary, infinite-horizon optimization problems, that is, precisely, the maximization of the utility of households under weaker assumptions than those considered so far.
Let us formulate our problem as the following:
We can solve (1.1) by defining and solving the following associated functional equation:
where Kt={kt}∞0, kt is Capital at start of period t, and f(kt) is the production. ct is the consumption, and this means
U(.) is the current period of utility function, and β∈(0,1).
The well-known Banach contraction principle states that, if (X,d) is a complete metric space, and T:X→X is a mapping satisfying
for some σ∈(0,1) and for all x,y∈X, then T has a unique fixed point x∗, and the sequence {xn} generated by the iterative process xn+1=Txn converges to x∗ for some x∈X.
The Banach contraction has been widely generalized in several settings. In [44], Ćirić introduced quasi contraction map and showed that the condition of quasi contraction implies all the conditions of Banach's contraction principle.
In this paper we will consider Ćirić contractions in Banach space, and the main result consists of the existence of a fixed point. The practical relevance of our result is that it can be applied even to discontinuous operators. An important class of problems consists of, for example, infinite-horizon iterative schemes for optimization problems with state constraints for which the value function is merely lower semi-continuous.
In this paper we investigate the Ćirić operator to prove the properties of infinite-horizon problems. A self map T:X→X on a metric space (X,d) is said to be a Ćirić mapping if, for all x and y in X,
We can write
To see the relevance of this extension, just consider the following very simple example Ćirić contractive mapping that is not a contraction. Let T:X→X be defined by
The mapping T is Ćirić type with σ=35. Indeed, if both x and y are in X1 or in X2, then
Therefore, T satisfies the condition
and hence the inequality (1.4).
To show that T is not Banach contraction on X, let x=9991000 and y=10011000.
Then, we have d(x,y)=|9991000−10011000|=4020000
and, on other side, d(Tx,Ty)=99120000.
Then, we get, d(Tx,Ty)=99120000>σ4080=σd(x,y), as σ∈(0,1). Therefore, the Banach contraction condition is not satisfied.
The Ćirić contractive map does not have to be continuous in general, but it is always convergent to a fixed point.
This article is organized as follows. In the next section, we formulate the infinite horizon problem that we are going to investigate in two cases: Implicit and fully explicit. Then, we provide the basic definitions, as well as the assumptions to be satisfied by its data. In the ensuing section, Section 3, we present several fixed point results that are fundamental for the development of our contributions: Notably, existence, monotonicity, and attainability. Also pertinent to our results is the Ćirić contraction [44], which will also be introduced in this section. In Section 5, we show that the convergence in norm with probability is one of the iterative procedures defined for our problem under the stated assumptions and application to dynamical programming. Section 6, numerical simulation, contains an algorithm in C++, which was applied for a particular case. Finally, some conclusions, and prospective future work are briefly addressed in Section 7.
2.
Problem formulation
This section is organized into two parts. In the first part, we introduce the operator that is defined via an explicit function in a complete metric space, while in the second part, we study the one that defines an implicit function.
2.1. Explicit case
The optimal value function is a unique solution of the Bellman equation given by
Given some positive function ν:X⇒R, we denote by C(X) the set of functions v such that ‖v‖<∞, where the norm ‖⋅‖ on C(X) is defined by ‖⋅‖:
Hence, C(X) is complete with metric induced by norm. To proceed further and in order to get the optimal cost function, we need to consider for each policy y∈M the mapping T:C(X)→C(X) defined by
Thus, in order to solve Bellman's equation, we only need to introduce the material that leads us to prove the existence and uniqueness of the fixed point, that is, Tv∗=v∗.
Example 2.1. One sector sustainable growth
Further, we need the following notations related to the economy concepts.
- Capital at the start of period t is denoted by kt.
- Production (including depreciated capital) is f(kt) and consumption is ct,
so kt+1=f(kt)−ct.
Given k0≥0 and 0≤kt+1≤f(kt), the planner maximizes
Let U:R+→R be an increasing and bounded map and let δ∈(0,1). Suppose the problem has a solution for all k0≥0. Let us define the value function v:R+→R, where v(k0) is value of the maximized objective function given k0.
Then, the planner's problem in period 0 is maxc0,k1(U(c0)+δV(k1), subject to c0+k1=f(k0), c0,k1≥0 and k0≥0 given.
The value function must satisfy the following functional equation:
Note v∈B(X), where X=R+.
We define T:C(X)→C(X) as follows:
A solution is a function v∗ satisfying v∗=Tv∗.
2.2. Implicit case
Note, however, that evaluating an optimal policy requires not only availability of the optimal value function v∗ but also the dynamics function f and stage cost function U. If f and U are unknown, then knowing v∗ is not sufficient for evaluating an optimal policy. To overcome this issue, we will use the optimal H-function to compute an optimal policy without knowing f and U explicitly.
We consider a set X of states, a set N of controls (the current period of utility) and, for each x∈X, a nonempty control constraint N(x)⊂N. We denote by M the set of all functions μ:X→N with μ(x)∈N(x) for all x∈X, which will be referred to as current period choices.
We denote by V(X) the set of functions v:X→R and by ˉV(X) the set of functions v:X→¯R where ¯R=R∪{−∞,∞}. We study the operator of the form
and, for each policy μ∈M, we consider the mapping T:V(X)→ˉV(X) defined by
In view of the definition of M, T and Tμ, we have the following relations:
Example 2.2. Let T:C(X)→C(X) be defined as follows:
where
and
Then, it is clear that T is a Ćirić contractive map, but it does not satisfy the Banach contraction condition.
3.
Auxiliary results
In this section, many results that will be instrumental in the proof of the main result of this article are presented.
Definition 3.1. (Ćirić map) A self map T of B(X) is called a Ćirić contractive map if
for all v,v′∈C(X), where σ∈(0,1).
Assumption 3.1. The self map Tμ is a Ćirić contractive map.
Theorem 3.1. (Existence) Let the operators T,Tμ:C(X)→C(X) be Ćirić contractive. Then, T and Tμ have, respectively, v∗ and vμ as fixed points.
Lemma 3.1. The following hold:
i) For an arbitrary v0∈C(X), the sequence {vk} defined by vk+1=Tμvk converges in norm to vμ.
ii) For an arbitrary v0∈C(X), the sequence {vk} defined by vk+1=Tvk converges in norm to v∗.
Lemma 3.2. C(X) is complete with respect to the topology induced by ‖⋅‖.
It is not difficult to observe that C(X) is closed and convex. Thus, given {vk}∞k=1⊂C(X) and v∈C(X), if vk→V in the sense that limk→∞‖vk−v‖=0, then limk→∞vk(x)=v(x) for all x∈X.
Now, we introduce the following standard assumptions:
Assumption 3.2. (Well-posedness) For all v∈C(X) and for allμ∈M, we have that Tμv∈C(X) and Tv∈C(X).
From Definition 3.1, we conclude that every contraction T is also a Ćirić contractive map. However, the inverse is not always true.
We will require the following properties to hold.
Assumption 3.3. (Monotonicity) For allv,v′∈C(X), we have that v≤v′ implies
where "≤" is defined in a pointwise sense in X.
Assumption 3.4. (Attainability) For all v∈C(X), there exists μ∈M such that Tμv=Tv.
4.
Main results
Blackwell's sufficient condition for Ćirić contractive map
Theorem 4.1. (Blackwell [7]) Let T be an operator defined on C(X) satisfying the following properties:
(a) [monotonicity] For all x∈X and for all v,v′∈C(X), v(x)≤v′(x), implies that for all x∈X,Tv(x)≤Tv′(x).
(b) [discounting] There exists some β∈(0,1) such that
for allv∈C(X),a≥0,x∈X,[T(v+a)](x)≤(Tv)(x)+βa.
Then, T is a Ćirić map with modulus β.
Proof. For any v and v′∈C(X) and any x∈X,
Hence, v(x)≤M(v,v′)+v′(x) for all x∈X, so
Suppose T:C(X)→C(X) satisfies Blackwell's condition. Then, there exists β∈(0,1) such that
By going through the exact same reasoning, by reversing the role of v and v′ we will end up with
Hence, for all x∈X, with |Tv(x)−Tv′(x)|≤βM(v,v′), then
Thus, T is a Ćirić contractive map.
5.
Convergence result
Theorem 5.1. Let T:C(X)→C(X) be a Ćirić contractive map. Then, we have the following.
1) There exists a unique solution to the Bellman equation Tv=v.
2) The solution can be found by iterating the relation vk+1=Tvk.
Remark 5.1. The unique solution of the Bellman equation here is the optimal cost function, which is the maximum of the utility.
Remark 5.2. Unlike the Banach contraction operator, the Ćirić operator is not necessarily continuous.
5.1. Example and implementation
Stochastic shortest path problem
Next, we present an example illustrating our results.
Let
where
g:X×U→R is incurred when control u∈U(x) is selected at state x.
Let ν(x)=1, for all x∈X. Given an arbitrary v0∈B(X), since H(⋅,⋅,⋅) is a Ćirić contractive map, it is clear that the Bellman equation defined by
is not a Banach contraction, but it is a Ćirić contractive map.
5.2. Another application to dynamical programming
If in [37] we have
where f∈Bd(Kt) is the set of all real-valued bounded functions on Kt, we define a norm on Bd(Kt) by ‖f‖=supt∈[0,∞)|f(kt)| for all f∈Bd(Kt).
Then, (Bd(Kt),‖⋅‖) forms a Banach space equipped with the metric defined by d(f1,f2):=supt∈[0,∞)|f1(kt)−f2(kt)| for all f1,f2∈Bd(Kt).
We are going to analyze the existence of the solution for the functional equation [46] by using the concept of the extended interpolative Reich-Rus type ψF-contraction. For this, we use an operator T:Bd(Kt)→Bd(Kt) defined by
for all f∈Bd(Kt).
Obviously, for the cases in which U and v are bounded, T becomes well-defined.
Theorem 5.2. Let T:Bd(Kt)→Bd(Kt) be an operator defined by (5.4) and the following properties:
1) U and v are continuous and bounded.
2) For all f1,f2∈Bd(Kt)∖Fix(T), satisfying
the Eq (5.3) has a bounded solution.
Proof. Let ϵ>0 and f1,f2∈Bd(Kt)∖Fix(T). Then, there exist k1,k2 such that
From (5.6) and (5.7) results
Analogously, (5.8) and (5.9) imply
From (5.10) and (5.11) results
Since ε>0 is arbitrary, we have
for all k, and thus,
We apply the logarithm function, and we obtain
The relation (5.13) is nothing but extended interpolative Reich-Rus type ψF-contractions for the operator T, with F(x)=lnx, ψ(x)=x−1,a=b=c=13 and s=1, because: A mapping T:Bd(kt)→Bd(kt) (Bd(kt) being a b - metric space) is an extended interpolative Reich-Rus type ψT-contraction if there exist F∈F and ψ∈Ψ such that for all f1,f2∈Bd(kt)∖Fix(T) with Tf1≠Tf2,
for some constants a,b,c∈[0,1] with 0<a+b+c≤1,
where F denotes the class of all functions F:(0,∞)→R for which the following hold:
(F1) limn→∞xn=0⇔limn→∞F(xn)=−∞, for all sequences {xn}∈(0,∞).
(F2) There exists k∈(0,1), and xkF(x)→0 for x→0+.
Ψ={ψ:R→R|ψ is monotone increasing and ψn(t)→−∞ for n→∞ for all t∈R}; ψn is the n−t composition of ψ [45].
We have ψn(x)=x−n, and for f∗∈Bd(Kt)∖Fix(T),β∗=d(f∗,Tf∗)=supt∈[0,∞)|f∗(t)−Tf∗(t)|∈R∗. Then, for all p∈(0,1), the series ∑n|ψnF(f∗)|−1/p is convergent. Hence, by Theorem 3.1 from [46], T has a fixed point in Bd(Kt), and that implies that Eq (5.3) has a bounded solution.
6.
Numerical simulation
Example 6.1. We consider the functions f:R→R+, U:R+→R, U(t)=t+1t+2, v:R→R, v(y)=2y+1y+1, the parameter β∈(0,1) and a point x∈R. Define the function hβ,x:[0,f(x)]→R,
After a simple calculation we find
It follows that h′β,x(y)=0 if and only if y=y0=f(x)+2−1√β1√β+1. If y0∈[0,f(x)], then h′β,x(y)≥0 for y∈[0,y0], and h′β,x(y)≤0 for y∈[y0,f(x]). Consequently, y0=f(x)+2−1√β1√β+1 is the unique maximum point of the function hβ,x(y) on the interval [0,f(x)]. Moreover, the maximum value of the function hβ,x(y) on the interval [0,f(x)] is given by
Considering the particular case for β=14, we find that y0=f(x)3∈[0,f(x)] is the unique maximum point of the function h14,x(y) on the interval [0,f(x)]. The maximum value of the function h14,x(y) on the interval [0,f(x)] is
ALGORITHM
Input parameters
- The analytical expression of the function f:R→R+;
- The analytical expression of the function U:R+→R;
- The analytical expression of the function v:R→R;
- The parameter β∈(0,1);
- The point x∈R for which we calculate the solution of the optimization problem
maxy∈[0,f(x)]{U[f(x)−y]+βv(y)}=−miny∈[0,f(x)]{−U[f(x)−y]−βv(y)};
- The error for the computation of the maximum point of the optimization problem, ε>0.
Output parameters
xmax - the maximum point of the optimization problem, calculated with the error ε;
gmax - the maximum value of the function to be optimized;
n - the number of iterations.
Pseudocode
Define the function gβ,x:[0,f(x)]→R,
gβ,x(y)=−U[f(x)−y]−βv(y)
A=0
B=f(x)
F0=1
F1=1
n=1
While Fn≤B−Aε do
n=n+1 Fn=Fn−1+Fn−2
End While
a1=A
b1=B
λ1=a1+Fn−2Fn∗(b1−a1)
μ1=a1+Fn−1Fn∗(b1−a1)
For k=1,2,...,n−1 do
If gβ,x(λk)<gβ,x(μk) then
ak+1=ak
bk+1=μk
λk+1=ak+1+Fn−k−2Fn−k∗(bk+1−ak+1)
μk+1=λk
else
ak+1=λk
bk+1=bk
λk+1=μk
μk+1=ak+1+Fn−k−1Fn−k∗(bk+1−ak+1)
End If
End For
xmax=an+bn2
gmax=−gβ,x(xmax)
PROCEDURE IN C++
#include <iostream>
#include <math.h>
using namespace std;
double f(double x)
{
return pow(2, -fabs(x));
}
double U(double t)
{
return (t+1)/(t+2);
}
double v(double y)
{
return (2*y+1)/(y+1);
}
double g(double beta, double x, double y)
{
return -U(f(x)-y)-beta*v(y);
}
void InputParameters(double & beta, double & x, double & eps)
{
cout << "beta="; cin >> beta;
cout << "x="; cin >> x;
cout << "eps="; cin >> eps;
}
void OptimizationMethod(double beta, double x, double eps, int & n, double F[100], double a[100], double b[100], double lambda[100], double miu[100], double & ymax, double & gmax)
{
double A, B;
int k;
A = 0;
B = f(x);
F[0] = 1;
F[1] = 1;
n = 1;
while (F[n]<=(B-A)/eps)
{
n++;
F[n] = F[n-1]+F[n-2];
}
a[1] = A;
b[1] = B;
lambda[1] = a[1]+F[n-2]*(b[1]-a[1])/F[n];
miu[1] = a[1]+F[n-1]*(b[1]-a[1])/F[n];
for(k = 1;k<=n-1;k++)
if (g(beta, x, lambda[k])<g(beta, x, miu[k]))
{
a[k+1] = a[k];
b[k+1] = miu[k];
lambda[k+1] = a[k+1]+F[n-k-2]*(b[k+1]-a[k+1])/F[n-k];
miu[k+1] = lambda[k];
}
else
{
a[k+1] = lambda[k];
b[k+1] = b[k];
lambda[k+1] = miu[k];
miu[k+1] = a[k+1]+F[n-k-1]*(b[k+1]-a[k+1])/F[n-k];
}
ymax = (a[n]+b[n])/2;;
gmax = -g(beta, x, ymax);
}
void OutputParameters(double beta, double x, int n, double F[100], double a[100], double b[100], double lambda[100], double miu[100], double ymax, double gmax)
{
int k;
cout<<"The number of iterations is n="<<n<<endl;
cout<<"k g(beta, x, lambda[k]) g(beta, x, miu[k]) a[k] b[k] lambda[k] miu[k]"<<endl;
cout.precision(7);
for(k = 1;k<=n; k++)
cout<<k<<" "<<g(beta, x, lambda[k])<<" "<<g(beta, x, miu[k])<<" "<<a[k]<<" "
<<b[k]<<" "<<lambda[k]<<" "<<miu[k]<<endl;
cout<<"The maximum point of the function is ymax="<<ymax<<endl;
cout<<"The maximum value of the function is gmax="<<gmax<<endl;
}
int main()
{
int n;
double beta, x, eps, F[100], a[100], b[100], lambda[100], miu[100], ymax, gmax;
InputParameters(beta, x, eps);
OptimizationMethod(beta, x, eps, n, F, a, b, lambda, miu, ymax, gmax);
OutputParameters(beta, x, n, F, a, b, lambda, miu, ymax, gmax);
return 0;
}
TEST DATA
f:R→R+, f(x)=2−|x|
U:R+→R, U(t)=t+1t+2
v:R→R, v(y)=2y+1y+1
beta = 0.25
x = -0.25
eps = 0.00001
RESULTS
The number of iterations is n = 25
The maximum point of the function is ymax = 0.2803
The maximum value of the function is gmax = 0.9141993
The outputs and outcomes are presented in Table 1 and also in a graphical representation in Figure 1:
7.
Conclusions
In this article, we approached an economic/financial problem, the stationary infinite horizon problem, under a much wider class of contractive operators by applying methods of fixed point theory. By resorting to the Ćirić operator and the Reich-Rus type ψF-contraction, we proved the convergence, the existence and the uniqueness of the results of the optimal cost function of the infinite horizon problem in Banach space. The numerical simulation showed that a stationary infinite horizon problem can be solved for particular cases with applications in the financial/economic field.
Use of AI tools declaration
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
Acknowledgments
Project financed by Lucian Blaga University of Sibiu through the research grant LBUS-IRG-2023.
The first author acknowledges the support of FCT for the grant 2021.07608.BD, the ARISE Associated Laboratory LA/P/0112/2020, and the R&D Unit SYSTEC - Base - UIDB/00147/2020 and Programmatic - UIDP/00147/2020 funds.
Conflict of interest
The authors declare that they have no conflict of interest. The authors contributed equally to the writing of this manuscript. All authors have approved the final version of the manuscript.