n | sn | Sn | Vn(x) | fn(x) |
1 | 49.79 | 53.89 | 1200.22 | 13.89 |
3 | 52.75 | 55.28 | 1438.65 | 14.80 |
5 | 55.72 | 55.75 | 1496.62 | 15.28 |
7 | 53.82 | 53.89 | 1497.04 | 15.75 |
8 | 53.82 | 53.89 | 1498.92 | 15.75 |
This paper presents a characterization of (s,S)-inventory policies for Lindley systems with possibly unbounded costs, where the objective is to minimize the expected discounted total cost by ordering (production) strategies. Moreover, the existence of a subsequence of minimizers of the value iteration functions that converge to a (s,S) optimal inventory system policy is shown. A numerical example is given to illustrate the theory.
Citation: Rubén Blancas-Rivera, Hugo Cruz-Suárez, Gustavo Portillo-Ramírez, Ruy López-Ríos. (s,S) Inventory policies for stochastic controlled system of Lindley-type with lost-sales[J]. AIMS Mathematics, 2023, 8(8): 19546-19565. doi: 10.3934/math.2023997
[1] | Xiaolong Li . Asymptotic optimality of a joint scheduling–control policy for parallel server queues with multiclass jobs in heavy traffic. AIMS Mathematics, 2025, 10(2): 4226-4267. doi: 10.3934/math.2025196 |
[2] | K. Jeganathan, S. Selvakumar, N. Anbazhagan, S. Amutha, Porpattama Hammachukiattikul . Stochastic modeling on M/M/1/N inventory system with queue-dependent service rate and retrial facility. AIMS Mathematics, 2021, 6(7): 7386-7420. doi: 10.3934/math.2021433 |
[3] | YeongJae Kim, YongGwon Lee, SeungHoon Lee, Palanisamy Selvaraj, Ramalingam Sakthivel, OhMin Kwon . Design and experimentation of sampled-data controller in T-S fuzzy systems with input saturation through the use of linear switching methods. AIMS Mathematics, 2024, 9(1): 2389-2410. doi: 10.3934/math.2024118 |
[4] | Avijit Duary, Md. Al-Amin Khan, Sayan Pani, Ali Akbar Shaikh, Ibrahim M. Hezam, Adel Fahad Alrasheedi, Jeonghwan Gwak . Inventory model with nonlinear price-dependent demand for non-instantaneous decaying items via advance payment and installment facility. AIMS Mathematics, 2022, 7(11): 19794-19821. doi: 10.3934/math.20221085 |
[5] | Omar Kahouli, Amina Turki, Mohamed Ksantini, Mohamed Ali Hammami, Ali Aloui . On the boundedness of solutions of some fuzzy dynamical control systems. AIMS Mathematics, 2024, 9(3): 5330-5348. doi: 10.3934/math.2024257 |
[6] | Wentao Le, Yucai Ding, Wenqing Wu, Hui Liu . New stability criteria for semi-Markov jump linear systems with time-varying delays. AIMS Mathematics, 2021, 6(5): 4447-4462. doi: 10.3934/math.2021263 |
[7] | Mohammad H. Almomani, Mahmoud H. Alrefaei . Selecting a set of best stochastic inventory policies measured by opportunity cost. AIMS Mathematics, 2023, 8(2): 4892-4906. doi: 10.3934/math.2023244 |
[8] | Linhong Li, Wei Xu, Zhen Wang, Liwei Liu . Improving efficiency of the queueing system with two types of customers by service decomposition. AIMS Mathematics, 2023, 8(11): 25382-25408. doi: 10.3934/math.20231295 |
[9] | Doaa Basalamah, Bader Alruwaili . The weighted Lindley exponential distribution and its related properties. AIMS Mathematics, 2023, 8(10): 24984-24998. doi: 10.3934/math.20231275 |
[10] | Yuxin Lou, Mengzhuo Luo, Jun Cheng, Xin Wang, Kaibo Shi . Double-quantized-based H∞ tracking control of T-S fuzzy semi-Markovian jump systems with adaptive event-triggered. AIMS Mathematics, 2023, 8(3): 6942-6969. doi: 10.3934/math.2023351 |
This paper presents a characterization of (s,S)-inventory policies for Lindley systems with possibly unbounded costs, where the objective is to minimize the expected discounted total cost by ordering (production) strategies. Moreover, the existence of a subsequence of minimizers of the value iteration functions that converge to a (s,S) optimal inventory system policy is shown. A numerical example is given to illustrate the theory.
This paper considers a discrete-time, single-item, infinite-stock inventory system in which excess demand is not backlogged. The dynamics of the system is determined by a random walk with a lower barrier; in particular, the state zero is considered as a barrier. Also, it is assumed that the control variable is affected by a random noise in the dynamic of the system; see (2.1). Furthermore, the following costs are considered: ordering/production, holding, and shortage. Then, the objective function is the infinite-horizon expected total discounted cost. The goal of the paper is to guarantee the existence of stationary policies and characterize the optimal stationary policies as (s,S) policies (see [5,18,23,28]). The (s,S)-policies operate in the following manner: If the inventory level falls below a minimum s≥0, the controller will request a replenishment demand to restore the inventory stock to a maximum number S≥s.
As mentioned earlier, a crucial component of the inventory model is the dynamics of the system; in this paper is considered a controlled version of a Lindley random walk. This random walk has been used to model waiting times in a queuing system with a server (see [20]). A controlled version of this dynamic was introduced in [14] to illustrate the convergence of the value iteration algorithm for Markov control processes with average costs. Furthermore, this dynamic has been used in several contexts of Markov control processes; see, e.g., [16,27]. Despite this, in these works, the optimal policy is not characterized, except by [11], where it is considered as a compact action set and bounded cost. The characterization of (s,S) policies is an interesting problem in inventory theory, this problem can be traced back to the papers of [13,22,28]. One advantage of guaranteeing the existence of (s,S) policies is their ease of implementation. However, it is not always possible to guarantee the existence of a (s,S) inventory policy; see, for instance, [2]. Therefore, it is necessary to establish conditions in the stochastic control model that guarantee the existence of (s,S) inventory policies. In addition, there is recent work in applied areas where strategies (s, S) are implemented in the real world, such as in healthcare [1,29] and in machine learning [8], to name a few. A survey work presenting applications of inventory theory in the real world can be found in [22]. On the other hand, recent research from the theoretical point of view of the model is presented in [6], where the optimality of strategies (s, S) is proved as a particular case of their setup concave linear piecewise ordering costs. The perishable inventory with backlogging demand was studied in [30]. In contrast to the work described in this manuscript, there is an extensive literature on continuous-time inventories, e.g., the recent work of [4], which studies perishable inventories.
Based on the review conducted, the main objectives of this work can be described as follows:
● To provide conditions that guarantee the optimality of inventory strategies for a system with lost sales (see Theorem 4.10).
● To propose approximation procedures for the strategies described in the previous point (see Theorem 5.3).
It is important to note that the goal of the first point is a more complex challenge, as opposed to a backlog inventory system as described in [22].
The methodology of the paper is as follows. First, the dynamic programming approach is validated, and the existence of optimal stationary policies is verified. Second, results of convex analysis are applied to guarantee that the minimizers of the value iteration functions and the optimal policies of the inventory system are (s,S)-policies. This characterization is achieved under assumptions of continuity and monotonicity in the components of the inventory control model; these conditions guarantee that the cost function is convex. The convexity property has been used previously in the inventory systems literature, see for example [9,10,13,18,23,25,26,28,31].
The paper is organized as follows. Section 2 introduces the inventory control model and the assumptions about the components of the inventory control model. Section 3 introduces the dynamic programming approach. Section 4 presents the characterization of (s,S) policies. Section 5 contains an analysis of the convergence of minimizers of value iteration. Finally, a numerical example is given in Section 6.
Consider a discrete-time inventory control system. If Xt denotes the inventory at time t=0,1,…, the evolution of the system is modeled by the following Lindley-type dynamical system:
Xt+1=(Xt+ηtat−Dt+1)+, | (2.1) |
with X0=x∈X:=[0,∞) known, where
i) {ηt} is a sequence of independent and identically distributed (i.i.d.) Bernoulli random variables with parameter p, 0<p<1, in this case the event {ηt=1} means that an order placed at instant t has been supplied.
ii) at denotes the control or decision applied at time t and represents the quantity ordered by the inventory manager (or decision maker).
iii) The sequence {Dt} is conformed by i.i.d. non-negative random variables with a common distribution F. Dt denotes the demand within period t. It is assumed that sequence {Dt} is independent of the sequence {ηt}.
Note that the difference equation given in (2.1) induces a stochastic kernel that can be expressed on X by
K:={(x,a):x∈X,a∈[0,∞)}, |
is defined as follows
Q(Xt+1∈(−∞,y]|Xt=x,at=a)=p(F(x+a)−F(x+a−y))+p(1−F(x+a))+(1−p)(F(x)−F(x−y))+(1−p)(1−F(x))) | (2.2) |
with y,x,a∈[0,∞) and
Q(Xt+1∈(−∞,y]|Xt=x,at=a)=0, |
if x,a∈[0,∞) and y<0.
Suppose further that it is associated with a one-step cost function C:K⟶[0,∞), defined as follows:
C(x,a)=KI{a:a>0}(a)+ca+E[h((x+ηa−D)+)]+E[l((D−(x+ηa))+)], | (2.3) |
where K≥0 is a fixed order price, c>0 is the order price per unit, h:[0,∞)⟶[0,∞) denotes the holding cost per period, l:[0,∞)⟶[0,∞) indicates the shortage cost for unfilled demand and E denotes the expectation with respect to the joint distribution of the random vector (η,D), where (η,D) is a generic element of the sequence {(ηt,Dt)}. In all the following sections, the cost function is determined by the sum of two cost functions:
C(x,a)=g(a)+H(x,a), |
where g(a)=KI{a:a>0}(a)+ca and
H(x,a)=E[h((x+ηa−D)+)]+E[l((D−(x+ηa))+)], | (2.4) |
with (x,a)∈K.
Then, the inventory control model is identified with a Markov control process [17], in short, MCP. The components of the associate Markov control model are as follows: X:=[0,∞) is the state space, A:=[0,∞) is the action space, the dynamic and the cost function are given by (2.2) and (2.3), respectively. Consequently, the inventory system evolves as follows: If the stock of the inventory systems occupies state Xt=x at time t and a controller (inventory manager) requests the quantity of product at=a. Then, a cost C(x,a) is incurred and the system jumps into a state Xt+1 according to the transition law Q(⋅|x,a). Once the transition into the new state has occurred, a new order (control) is requested and the process is repeated. Thus, for each t≥1 an admissible history ht of the inventory system up to the transition t given by, ht=(X0,a0,…,Xt−1,at−1,Xt). Let Ht, t≥1, be the set of all admissible histories of the system up to the transition t. The following definition will be used to characterize optimal strategies in the inventory control model.
Definition: A policy π={πt} is a sequence of stochastic kernels πt on A given Ht, satisfying the constraint: πt(A|ht)=1, for each ht∈Ht, t≥1. The collection of all policies is denoted by Π. Define F as the set of all measurable functions f:X⟶A. Thus, a Markov policy is a sequence ft such that ft∈F, for t≥1. In particular, a Markov policy π={ft} is said to be stationary if ft is independent of t, i.e., ft=f∈F, for all t≥1, in this case, ft is denoted by f and refers to F as the set of stationary policies.
In the subsequent, for each x∈X and π∈Π will be denoted by Pπx the measure defined on the measurable space Ω:=((X×A)∞,F), where F is the corresponding product σ-algebra. The measure Pπx is induced by the theorem of Ionescu Tulcea [3]. The expectation operator with respect to Pπx is denoted by Eπx.
Inventory control problem
The goal of this subsection is to introduce the inventory control problem, then consider π∈Π, x∈X and define the following objective function:
v(π,x):=Eπx[∞∑t=0αtC(Xt,at)], | (2.5) |
where α∈(0,1) denotes a discount factor. The performance criterion defined in (2.5) is called expected total discounted cost (see [17]). Hence, the optimal inventory control problem consists in determining a policy π∗∈Π such that,
v(π∗,x)=infπ∈Πv(π,x), | (2.6) |
for each x∈X, in this case π∗ is denominated optimal policy. The function V defined by
V(x):=infπ∈Πv(π,x), | (2.7) |
for each x∈X, it will be called the optimal value function.
In subsequent sections, the following assumption will be considered.
Basic Assumption (BA):
i) h is a non-decreasing convex function such that h(0+)=h(0), where h(0+):=inf{h(y):y>0}.
ii) l is a convex function such that l(0)=0 and l′(u) exists and is non-negative for each u≥0.
iii) E[h((u−D)+)]<∞ and E[l((D−u)+)]<∞, for each u≥0 and D is a generic element of the sequence {Dt}.
iv) If D is a generic element of the sequence {Dt}, it is assumed that D has a continuous density denoted by Δ.
BA will not be mentioned in each lemma or theorem throughout the paper, and we will assume that it holds.
Remark 2.1. i) Observe that the cost function (2.4) is convex, due to the convexity and monotonicity of functions h and l (see BA i) and ii)). Therefore, the cost function (2.3) is a convex function, this property will be used in the proof of Lemma 4.3.
ii) Moreover, under BA i) and ii), h and l are continuous functions. This characterization will help us apply to demonstrate the validity of Lemma 3.1.
Assuming BA, the inventory control model in the MCP literature is identified as a semi-continuous model [17]. Moreover, the following lemmas guarantee the existence of optimal stationary policies and the dynamic programming approach; see Theorem 3.7 below.
Lemma 3.1. The cost function C, is a lower semi-continuous (l.s.c.) function, i.e., the set {(x,a)∈K:C(x,a)≤λ} is closed for all λ∈R.
Proof. First, we can see directly that the cost g is a continuous function for a>0 and l.s.c. at a=0. We now want to prove that the function H is continuous for any (x,a)∈K. For this purpose, we fix x∈X and a∈A. Take xn∈X, an∈A, n≥1, such that xn⟶x, an⟶a, when n goes to infinity. Consequently, Eq (2.4) gives H(xn,an)⟶H(x,a) as n⟶∞, by applying the dominated convergence theorem [3] and continuity of the functions h and l; the latter by Remark 2.1. Therefore, the cost function C is a l.s.c..
Corollary 3.2. The cost C is an inf-compact function, namely the set {a∈A:C(x,a)≤λ} is compact for each x∈X and λ∈R.
Proof. Note that {a∈A:ca≤λ}=[0,λ/c], for any λ∈R. On the other hand, by Lemma 3.1, for all λ∈R and x∈X the set defined as M:={a∈A:C(x,a)≤λ} is closed set. Consequently, since M is contained in the compact set {a∈A:ca≤λ}, it is concluded that the cost function C is inf-compact.
Lemma 3.3. Transition law Q (2.2) induced by the difference equation (2.1) is strongly continuous, i.e., w(x,a):=∫Xu(y)Q(dy|x,a) is continuous and bounded on K for every measurable bounded function u on X.
Proof. Let u be a measurable and bounded function defined on X and let {(xn,an)} be a sequence on K convergent to (x,a)∈K. Then, from (2.2), we have that
w(xn,an)=∫Xu(y)Q(dy|xn,an)=p∫∞0u((xn+an−s)+)Δ(s)ds+(1−p)∫∞0u((xn−s)+)Δ(s)ds=p∫xn+an0u(xn+an−s)Δ(s)ds+(1−p)∫xn0u(xn−s)Δ(s)d(s)+(1−p)u(0)∫∞xnΔ(s)ds+pu(0)∫∞xn+anΔ(s)ds. |
Observe by applying an adequate change of variable, the following identity holds:
∫xn+an0u(xn+an−s)Δ(s)ds=∫I(−∞,xn+an](s)u(s)Δ(xn+an−s)ds. |
On the other hand, it is easy to prove that {I(−∞,xn+an]} converges to {I(−∞,x+a]} almost everywhere (a.e.) with respect to the Lebesgue measure m on R. Moreover, let θ∈(0,∞) such that |u(x)|≤θ, for all x∈X and consider the following functions:
rn(s)=I(−∞,xn+an](s)u(s)Δ(xn+an−s),r(s)=I(−∞,x+a](s)u(s)Δ(x+a−s),gn(s)=θI(−∞,xn+an](s)Δ(xn+an−s),g(s)=θI(−∞,x+a](s)Δ(x+a−s), |
with s∈[0,∞). Then, notice that the following statements hold:
i) {rn} converges to r a.e. with respect to m.
ii) {gn} converges to g a.e. with respect to m.
iii) For each n=1,2,..., |rn(⋅)|≤gn(⋅).
iv) ∫gn(s)ds=∫g(s)=θ.
Statements i)–iv) guarantee the hypothesis of the Dominated Convergence Theorem (see[3]), in consequence, it is obtained that limn→+∞∫rn(s)ds=∫r(s)ds, i.e.,
∫xn+an0u(xn+an−s)Δ(s)ds⟶∫x+a0u(x+a−s)Δ(s)ds. | (3.1) |
Similarly, it is possible to prove that:
∫xn0u(xn−s)Δ(s)d(s)=∫I(−∞,xn](s)u(s)Δ(xn−s)ds⟶∫I(−∞,x](s)u(s)Δ(x−s). | (3.2) |
Besides, due to the continuity of the distribution function F, it yields that
∫∞xnΔ(s)ds=1−F(xn)⟶1−F(x)=∫∞xΔ(s)ds,%∫∞xn+anΔ(s)ds=1−F(xn+an)⟶1−F(x+a)=∫∞x+aΔ(s)ds. | (3.3) |
In consequence, from (3.1)–(3.3), it is obtained that
limn→+∞w(xn,an)=p∫x+a0u(s)Δ(x+a−s)ds+(1−p)∫x0u(s)Δ(x−s)ds+(1−p)u(0)∫∞xΔ(s)ds+pu(0)∫∞x+aΔ(s)ds=w(x,a). |
Therefore, w is a continuous function on K.
The demonstration of Lemma 3.4 is based on a result of renewal processes used in classical theory of inventories (see, e.g., [4,12]). For this purpose, define for each x∈X, the renewal process
N(x):=sup{t≥0|Wt≤x}, | (3.4) |
where W0:=0 and Wt:=∑tj=1Dj. Observe that E[N(x)]<∞ for each x≥0, a proof of this fact can be consulted in [15]. Furthermore, consider the residual lifetime defined as
R(x):=WN(x)+1−x,x∈X. | (3.5) |
Lemma 3.4. For all x∈X, E[l(R(x))]<∞.
Proof. Let x∈X be fixed. Observe that the tail of the distribution of the residual lifetime is given for z≥0 as follows
P(R(x)>z)=∞∑n=1P(Wn−1≤x,Wn>x+z)=1−F(x+z)+∫x0(1−F(x+z−s))dU(s), |
where U:=E[N(x)] is the renewal function. On the other hand, from assumption BA i), it is obtained that
E[l(R(x))]=∫l(∞)0P(l(R(x)>z))dz=∫∞0P(l(R(x))>l(z)))dl(z)=∫∞0P(R(x)>z)l′(z)dz, |
where l(∞):=limu→∞l(u). In consequence, by substituting the tail distribution of function R in the last equation, it yields that
E[l(R(x))]=∫∞0l′(z)(1−F(x+z))dz+∫∞0(∫x0l′(z)(1−F(x+z−s))dU(s))dz. |
Now, observe that
∫∞0l′(z)(1−F(x+z))dz≤∫∞0l′(z)P(D>z)dz=E[l(D)], |
and from Fubini's theorem [3], it is obtained that
∫∞0∫x0l′(z)(1−F(x+z−s))dU(s)dz≤∫x0(∫∞0P(D>z)l′(z)dz)dU(s)=∫x0E[l(D)]dU(s)=E[l(D)]E[N(x)]. |
Therefore,
E[l(R(x))]≤E[l(D)](1+E[N(x)])<∞. |
Since state x is arbitrary, the result follows.
Lemma 3.5. There exists ˜π∈Π such that v(˜π,x)<∞, for all x∈X.
Proof. Consider x∈X fixed and the stationary policy ˜π={˜g,˜g,…}⊆F with ˜g(y)=0 for all y∈X. Hence, the stochastic path {X˜gt} generated by ˜π is given by
X˜gt+1=(X˜gt−Dt+1)+, | (3.6) |
with X˜g0=x. Consequently, observe that
X˜gt={x−Wtift=0,…,N(x),0ift≥N(x)+1. | (3.7) |
Then, due to (2.3) and (2.4), it yields that
v(˜π,x)=E˜πx[∞∑t=0αtH(X˜gt,0)]=E˜πx[∞∑t=0αt(E[h((X˜gt−Dt+1)+)]+E[l((Dt+1−X˜gt)+)])]. |
Now, from (3.6) and the renewal process (3.4), it may be found that
v(˜π,x)≤E[l(D)]+h(0)1−α+E˜πx[N(x)∑t=0αt(E[h((X˜gt−Dt+1)+)]+E[l((Dt+1−X˜gt)+)])]. |
On the other hand, observe that
N(x)∑t=0αt(E[h((X˜gt−Dt+1)+)]+E[l((Dt+1−X˜gt)+)])=N(x)∑t=0αt(E[h((x−Wt+1)+)]+E[l((Wt+1−x)+)])≤1αN(x)∑t=1αt(E[h((x−Wt)+)]+E[l((Wt−x)+)])+h(0)+E[l(WN(x)+1−x)]≤11−α(h(x)+l(0))+h(0)+E[l(WN(x)+1−x)],P˜πx−a.s, |
where the second inequality was obtained since h is a non-decreasing function (see BA i)). Finally, both inequalities follow from (3.4). In consequence, since l(0)=0, it is obtained that
v(˜π,x)≤E[l(D)]+h(x)+h(0)1−α+E[l(WN(x)+1−x)]. |
By applying Lemma 3.4 to the last relation and due to E[l(D)]<∞ as a consequence of BA iii), it is obtained that v(˜π,x)<∞. Since x∈X is arbitrary, the result holds.
Remark 3.6. Note that Lemma 3.5 guarantees that V(x)<∞, for all x∈X, due to V(x)≤v(˜π,x), for all x∈X.
As a consequence of the previous results, the following theorem is valid (see Theorem 4.2.3 and Lemma 4.2.8 in [17]).
Theorem 3.7. The following statements hold:
i) For each x∈X, the optimal value function satisfies the dynamic programming equation:
V(x)=mina∈A{C(x,a)+αE[V((x+ηa−D)+)]}. |
ii) There exists an optimal stationary policy f∈F such that for each x∈X the following equation holds
V(x)=C(x,f(x))+αE[V((x+ηf(x)−D)+)]. |
iii) The value iteration functions, defined as V0(x):=0 and
Vn(x):=mina∈A{C(x,a)+αE[Vn−1((x+ηa−D)+)]}, |
n≥1,x∈X, converge monotonically increasing to the optimal value function V.
Remark 3.8. Minimizers of the value iteration functions will be denoted by fn, n≥0. These minimizers satisfy the following, f0(x)=0 and for each n≥1: Vn(x)=C(x,fn(x))+αE[Vn−1((x+ηfn(x)−D)+)], for all x∈X.
This section deals with the characterization of the minimizers of the value iteration functions and the optimal policies by (s,S) policies.
Definition 4.1. Let f∈F be a stationary policy, if there exists (s,S)∈R2 such that 0≤s≤S and
f(x)={S−xifx≤s,0ifx>s, | (4.1) |
f is called a (s,S) stationary policy.
Define the following functions
ˆVn(u):=E[Vn((u−D)+)],n=0,1,…, | (4.2) |
and
Gn(u):=cu+pˆH(u)+αpˆVn−1(u),n=1,2,…, | (4.3) |
for u∈X, where ˆH(u)=H(u,0). In consequence, value iteration functions {Vn} can be expressed for each n≥1 and x∈X as follows,
Vn(x)=min{cx+pˆH(x)+αpˆVn−1(x),infa>0{K+c(x+a)+pˆH(x+a)+αpˆVn−1(x+a)}}−cx+(1−p)ˆH(x)+α(1−p)ˆVn−1(x)=min{Gn(x),infa>0{K+Gn(x+a)}}−cx+(1−p)ˆH(x)+α(1−p)ˆVn−1(x). |
Making the change of variable y:=x+a, the previous equation is equivalent to
Vn(x)=min{Gn(x),infy≥x{K+Gn(y)}}−cx+(1−p)ˆH(x)+α(1−p)ˆVn−1(x), | (4.4) |
for n≥1 and V0(x)=0, x∈X.
The following definition will be applied to characterize the optimal policy [21].
Definition 4.2. A function ϑ:[0,∞)⟶[0,∞) is called norm-like if ϑ(x)→∞ as x→∞; this means that the sub-level sets {x:ϑ(x)≤r} are precompact for each r>0.
Lemma 4.3. For each n=1,2,…, the following statements hold
i) Vn is a non-decreasing convex function,
ii) Gn is a convex function on X,
iii) Gn is norm-like.
Proof. i) The proof is by induction. It will be proved that V1 is a non-decreasing function,
mina∈[0,∞)C(x,a)≤mina∈[0,∞)C(y,a), |
if x,y∈X and x≤y, due to C(⋅,a) is a non-decreasing function for each a∈[0,∞)] (see (2.3) and AIM). Then, V1 is a non-decreasing function. On the other hand, using Lemma 1 in [19] together with the fact that C is a convex function, it is obtained that V1 is a convex function. Now, suppose that Vn is a non-decreasing convex function. To prove that Vn+1 is a non-decreasing function, observe that, as a consequence of Vn is non-decreasing, the following inequality holds
Vn((x+ηa−s)+)≤Vn((y+ηa−s)+), |
for all x,y∈X, with x≤y and a,s∈[0,∞). In consequence,
C(x,a)+αE[Vn((x+ηa−D)+)]≤C(y,a)+αE[Vn((y+ηa−D)+)], | (4.5) |
for all x,y∈X, with x≤y and a,s∈[0,∞). Consequently, taking the minimum with respect to a, in each side of inequality (4.5), it results that Vn+1(x)≤Vn+1(y), x,y∈X with x≤y. Now, it will be proved that Vn+1 is a convex function. To this end, note that C(x,a)+αE[Vn((x+ηa−D)+)] is a convex function for each (x,a)∈K, this statement is a consequence of the induction hypothesis and the convexity of the cost function. Then, from Lemma 1 of [19], it may be concluded that Vn+1 is a convex function.
ii) The previous statement implies that ˆVn is a convex function for each n≥1, as the following relations evidence it:
ˆVn(λu1+(1−λ)u2)=E[Vn((λu1+(1−λ)u2−D)+)]≤E[λVn((u1−D)+)+(1−λ)Vn((u2−D)+)]=λˆVn(u1)+(1−λ)ˆVn(u2), |
for each u1,u2∈X, λ∈[0,1]. Consequently, since ˆH is convex, (4.3) implies that Gn is a convex function for each n≥1.
iii) Observe that Gn(u)≥cu, for n≥1 with c>0. Hence, for each n≥0, Gn(u)→∞, when u→∞. This implies that Gn is a norm-like function for each n≥1.
Remark 4.4. Observe that for each n≥1, we have that
i)Gn (4.3) is a continuous function on (0,∞), as a consequence of convexity of function Gn.
ii) Given that Gn is a norm-like function it follows that
argminy∈XGn(y):={z≥0:Gn(z)=miny∈XGn(y)}, |
is a non-empty set.
Lemma 4.5 is a modified version of Lemma 2.1 of [7], the proof of this lemma is presented here for the completeness of the paper.
Lemma 4.5. For each n≥1, let Sn∈argminy≥0Gn(y) and sn:=inf{0≤x≤Sn:Gn(x)≤K+Gn(Sn)} then the following statements hold:
i)Gn(sn)=K+G(Sn), if sn>0.
ii)Gn(x)≤K+Gn(Sn), 0≤sn≤x≤Sn.
iii)K+Gn(Sn)≤Gn(x), 0≤x<sn.
iv)Gn(x) is a decreasing function on [0,sn).
Proof. Let n≥1. i) Suppose that sn>0 then Gn(0)>K+Gn(Sn)≥Gn(sn). Hence, there exists u∈(0,sn) such that Gn(u)=Gn(Sn)+K, because of the continuity of function Gn (see Remark 4.4). Now, observe that sn≤u, due to the definition of sn, then sn=u and the result holds.
ii) This statement follows directly from the definition of sn.
iii) Consider 0≤x<sn, by Lemma 4.3 ii), it follows that
Gn(sn)≤sn−xSn−xGn(Sn)+Sn−snSn−xGn(x), |
then
Gn(sn)+Sn−snsn−x(Gn(sn)−Gn(x))=Sn−sn+sn−xsn−xGn(sn)−Sn−snsn−xGn(x)≤Gn(Sn)≤K+Gn(Sn). |
Thus,
Gn(sn)+Sn−snsn−x(Gn(sn)−Gn(x))≤K+Gn(Sn). | (4.6) |
Now, due to K+Gn(Sn)−Gn(sn)=0 given that and x≤sn≤Sn, (4.6) implies that Gn(sn)≤Gn(x). Therefore, K+Gn(Sn)=Gn(sn)≤Gn(x) with 0≤x<sn.
iv) Let x1,x2∈[0,sn) with x1≤x2. Hence, by Lemma 4.3 ii),
K+Gn(Sn)≥Gn(x2)+Sn−x2x2−x1(Gn(x2)−Gn(x1)). | (4.7) |
Now, from statement iii), it is obtained that Gn(x2)≥K+Gn(Sn), this relation and (4.7) together lead to 0≥Gn(x2)−Gn(x1) and the result follows.
A consequence of Lemma 4.5 is the following result.
Theorem 4.6. Consider {(sn,Sn):n=1,2,…} in Lemma 4.5. Then, the minimizers of the value iteration functions are given by
fn(x)={Sn−xifx≤sn0ifx>sn, | (4.8) |
with 0<sn≤Sn and n=1,2,...
Proof. Let n≥1 and x∈X. The proof will proceed by considering three cases depending on the order relation between state x, sn, and Sn. The following simple but important claim is established.
Claim 1: u∈argminy≥x{K+Gn(y)} if and only if u−x∈argmina≥0{K+Gn(x+a)}.
Case 1. Suppose that x<sn, Lemma 4.5 iii) yields that K+Gn(Sn)≤Gn(x). Consequently, by (4.4) and Claim, it follows that fn(x)=Sn−x.
Case 2. Consider sn≤x≤Sn, then by Lemma 4.5 ii), it is obtained that Gn(x)≤K+Gn(Sn). In consequence, (4.4) implies that fn(x)=0.
Case 3. Finally, assume that Sn<x, since Gn is a convex function, we have that Gn is a non-decreasing function on (Sn,∞), this fact implies that x∈argminy≥x{K+Gn(y)}. Now, from Claim 1, it follows that 0∈argmina≥0{K+Gn(x+a)}. Therefore, fn(x)=0.
The previous cases guarantee the truth of Theorem 4.6.
Remark 4.7. Observe that Theorem 3.7 and (4.4) imply that the optimal value function satisfies the following equation
V(x)=min{G(x),infy≥xG(y)}−cx+(1−p)ˆH(x)+α(1−p)V∗(x), | (4.9) |
with
G(u):=cu+pˆH(u)+αpV∗(u),u≥0, | (4.10) |
and
V∗(u):=E[V((u−ξ)+)]. |
Since Vn↑V because of Theorem 3.7 and by the Dominated Convergence Theorem, it yields that
Gn↑G. | (4.11) |
Lemma 4.8. G is a norm-like and convex function.
Proof. Observe that G(x)≥cx for each x≥0. Hence, G(x)→∞ as x goes to infinity. Thus, by Definition 4.2, G is a norm-like function. Furthermore, (4.11) implies that G is a convex function since Gn is a convex function for each n≥1 (see Lemma 4.3).
The proofs of Lemma 4.9 and Theorem 4.10 are in similar lines with the proofs of Lemma 4.5 and Theorem 4.6, respectively. Thus, the proofs will be omitted.
Lemma 4.9. Let S∈argminy≥0G(y) and s:=inf{0≤x≤S:G(x)≤K+G(S)} then the following statements are valid.
i)G(s)=G(S)+K, if s>0.
ii)G(x)≤K+G(S), 0≤s≤x≤S.
iii)G(S)+K≤G(x), 0≤x<s.
iv)G(x) is a decreasing function on [0,s).
Theorem 4.10. Consider (s,S) as in Lemma 4.9. Hence, the optimal policy for the inventory system is an (s,S) policy given by
f(x)={S−xsix≤s,0six>s, |
with 0<s≤S.
Remark 4.11. Observe that in Theorem 4.10, if s=0 then G(x)≤G(S)+K, for all x≥0, which implies that f(x)=0, for all x≥0. A similar argumentation guarantees the following assertion: fn(x)=0, x∈X, if sn=0 for n≥1.
In this section, the convergence of the minimizers of the value iteration function will be analyzed. First, the next auxiliary results are exposed.
Lemma 5.1. Let ˆC⊂(0,∞) be a compact set. Then, the following statements hold.
i){Gn} converges uniformly to G on ˆC.
ii) For each {un}⊂ˆC such that un→u∈ˆC, limn→∞Gn(un)=G(u).
Proof. i) This statement is a direct consequence of (4.11) and Theorem 7.13 in [24].
ii) Applying i) of Lemma 5.1, it follows that for all ϵ>0 there exists N1≥1 such that for each x∈ˆC,
|Gn(x)−G(x)|<ϵ/2, | (5.1) |
if n≥N1. Furthermore, by continuity of function G on ˆC, there exists δ>0 such that
|G(y)−G(x)|<ϵ/2, | (5.2) |
if |y−x|<δ. Now, since un converges to u when n goes to infinity, there exists N2≥1 such that
|un−u|<δ, | (5.3) |
if n≥N2. Let N=max{N1,N2}. Thus, (5.2) and (5.3) together leads to
|G(un)−G(u)|<ϵ/2, | (5.4) |
if n≥N. Then, taking x=un in (5.1), it yields that
|Gn(un)−G(un)|<ϵ/2. | (5.5) |
Therefore, by (5.4) and (5.5), |Gn(un)−G(u)|<ϵ. This last relation implies that Gn(un) converges to G(u), when n goes to infinity.
Lemma 5.2. Let B:={x>0|G1(x)≤infy≥0G(y)+K} and {(sn,Sn):n≥1} as in Lemma 4.5. Then, the following statements hold
i)B is a compact set in R.
ii)sn,Sn∈B, for each n=1,2,....
iii) There exists a subsequence {(snk,Snk)} of {(sn,Sn)}, which converges to (s∗,S∗) such that S∗∈argminy≥0G(y) and G(s∗)=K+G(S∗) with 0<s∗≤S∗.
Proof. i) First, observe that B is a closed set, since G1 is a continuous function on (0,∞). Now suppose that B is not bounded, then there exists a sequence {bn}⊂B such that bn converges to infinity. Thus,
∞=lim infn→∞G1(bn)≤infy≥0G(y)+K. |
The previous relation is a contradiction since infy≥0G(y)+K<∞. Therefore B is a compact set.
ii) Observe that for each n=1,2,... the following inequalities are valid
G1(Sn)≤Gn(Sn)=infy≥0Gn(y)≤infy≥0G(y)+K,G1(sn)≤Gn(sn)=infy≥0Gn(y)+K≤infy≥0G(y)+K, |
since {Gn} is a non-decreasing sequence whose limit is G and the equalities are valid due to Lemma 4.5. Then, for each n=1,2,..., sn, Sn∈B.
iii) The previous statements i) and ii) imply that there exists a subsequence {(snk,Snk)} convergent to (s∗,S∗)∈B2, with 0<s∗≤S∗. Furthermore, due to Lemma 5.1 ii), it yields that
limk→∞Gnk(Snk)=G(S∗), | (5.6) |
limk→∞Gnk(snk)=G(s∗). | (5.7) |
Now, Lemma 4.5 implies that Gnk(Snk)≤Gnk(x), x≥0. Then, when k goes to infinity in the last inequality, it is obtained that G(S∗)≤G(x), x≥0. In consequence, S∗∈argminy≥0G(y). On the other hand, as a consequence of Lemma 4.5 ii), (5.6) and (5.7), it follows that G(s∗)=K+G(S∗).
Theorem 5.3. Let {fn} be the sequence of minimizers of value iteration functions. Then, there exists a subsequence {fnk} such that converges uniformly on X to an (s∗,S∗) optimal policy.
Proof. First, in view of Theorem 4.6, for each n≥1, fn(x)=(Sn−x)I{x:x≤sn}(x) with 0<sn≤Sn. Then, using iii) of Lemma 5.2, it is obtained a subsequence (snn,Snk) such that (snn,Snk)⟶(s∗,S∗) when k goes to infinity and (s∗,S∗) satisfies the statements i)−iv) of Lemma 4.9. Thus, consider
f∗(x)={S∗−xsix≤s∗0six>s∗, |
and observe that,
supx∈X|fnk(x)−f∗(x)|≤|Snk−S∗|⟶0, |
when k⟶∞. Furthermore, f∗ is an optimal policy due to Theorem 4.10. This concludes the proof of the theorem.
Consider an inventory control system with an exponential distribution on the demand with parameter λ=0.1 and suppose that the parameter of the variable η is p=0.5. On the other hand, for each u≥0, suppose that h(u)=γ1u, l(u)=γ2u, where γ1, γ2 are non negative constants. In this case, γ1 represents the cost per unit shortage and γ2 represents the cost per unit demand not supplied.
Furthermore, observe that for each u≥0,
E[h((u−D)+)]=γ1λ(λu+e−λu−1), |
E[l((D−u)+)]=γ2(e−λuλ). |
Then by (4.3) for n=1, it yields that
G1(u)=(pγ1+c)u+p(γ1+γ2)e−λuλ−γ1λ,u≥0. |
Lemma 6.1. For each n=1,2,..., Gn is a strictly convex function.
Proof. First, observe that the derivative of G1 is given by G′1(u)=pγ1+c−p(γ1+γ2)e−λu and the second derivative G′′1(u)=λp(γ1+γ2)e−λu>0,u≥0. Consequently, the function G1 is strictly convex. Then, since Gn(u)=G1(u)+αpˆVn−1(u), u≥0 (see 4.3), it is concluded that for each n=1,2,..., Gn is a strictly convex function, due to ˆVn is a convex function for each n≥1 (see proof of Lemma 4.3) and G1 is a strictly convex function. Then, the result follows.
Remark 6.2. A consequence of Lemma 6.1 is the uniqueness of minimizers of the value iteration functions (see (4.4)). Furthermore, observe that (4.10) can be rewritten for each u≥0 as follows G(u)=G1(u)+αpV∗(u), which is a strictly convex function, as a consequence of Lemma 6.1 and Remark 4.7. Hence, the optimal policy is unique due to the Eq (4.9).
Lemma 6.3. The unique minimizer of function G1 is y∗=−λ−1(ln(γ1+c/p)−ln(γ1+γ2)>0, if c/p<γ2.
Proof. Observe that, due to Lemma 6.1, the minimizer y∗ is characterized by the first order condition:
pγ1+c−p(γ1+γ2)e−λy∗=0. | (6.1) |
Thus, y∗=−λ−1ln((γ1+c/p)(γ1+γ2)−1). In particular, since c/p<γ2, it is obtained that (γ1+c/p)(γ1+γ2)−1<1, in consequence, y∗>0.
The following algorithm is applied to approximate the optimal value function V and the optimal policy f, for each state x given as initial condition. The validity of the next algorithm is sustained in the following results: Lemma 4.9, Theorem 4.10, Lemmas 5.2 and 6.3.
Consider the following parameters: γ1=γ2=30, K=1.5, c=2.5, ϵ=0.01, α=0.2 and x=40, observe that γ2>c/p. Table 1 illustrates the numerical results obtained by applying the previous Algorithm 1.
n | sn | Sn | Vn(x) | fn(x) |
1 | 49.79 | 53.89 | 1200.22 | 13.89 |
3 | 52.75 | 55.28 | 1438.65 | 14.80 |
5 | 55.72 | 55.75 | 1496.62 | 15.28 |
7 | 53.82 | 53.89 | 1497.04 | 15.75 |
8 | 53.82 | 53.89 | 1498.92 | 15.75 |
Algorithm 1 Approximation of the optimal policy f and the optimal value function V |
Require: K>0,c>0,λ>0,γ1>0,γ2>c/p>0,x>0 and ϵ,α∈(0,1) |
1: S1←y∗ |
2: G1(u)←(pγ1+c)u+p(γ1+γ2)e−λu/λ |
3: s1←x∗ such that G1(x∗)=G1(S1)+K |
4: Repeat |
5: for n do |
6: ˆVn(u)←E[Vn((u−D)+)] |
7: if u≤sn |
8: Vn(u)←Gn(Sn)+K−cu+(1−p)ˆH(u)+α(1−p)ˆVn−1(u) |
9: else |
10: Vn(u)←Gn(u)−cu+(1−p)ˆH(u)+α(1−p)ˆVn−1(u) |
11: Gn(u)←G1(u)+ˆVn(u) |
12: Sn←argminu≥0Gn(u) |
13: sn←x∗ such that Gn(x∗)=Gn(Sn)+K |
14: until |Sn−1−Sn|<ϵ and |sn−1−sn|<ϵ |
15: S∗←Sn and s∗←sn |
16: f(x)←S∗−x if x≥s∗ else f(x)←0 |
17: V(x)←Vn(x) |
18: return f(x) and V(x) |
Observe that the convergence of the optimal value function and (s,S) policy are illustrated, see Theorem 4.10 and Lemma 5.2, respectively. Therefore, f(x)=15.75 and V(x)=1498.92 for x=40 with an error ϵ=0.01.
In this paper, a discrete-time inventory system was presented. In the manuscript, conditions for convexity and monotonicity in the components of the Markov control model were proposed. In this inventory system, the existence of stationary policies (s, S) was proved. To achieve this goal, the methodology of dynamic programming in discrete time was applied. Moreover, the existence of a subsequence of minimizers of the value iteration functions converging to a (s, S) optimal policy of the inventory system was proved. Finally, a numerical algorithm for approximating the optimal inventory cost and policy is presented and applied in a numerical example. Future work in this direction includes the following:
● Incorporating Markovian demand into the inventory model.
● Investigating the lost-sales inventory system under other performance measures, e.g., considering the long average or risk-sensitive criteria.
● Implementing the manuscript's proposal on real-world data by finding a suitable database for the assumptions presented in this manuscript.
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
The authors would like to thank the referees for their valuable comments, corrections and suggestions, which lead to an improvement of the original paper.
The authors declare that they have no competing interests.
[1] |
E. Ahmadi, H. Mosadegh, R. Maihami, I. Ghalehkhondabi, M. Sun, G. A. Süer, Intelligent inventory management approaches for perishable pharmaceutical products in a healthcare supply chain, Comput. Oper. Res., 147 (2022), 105968. https://doi.org/10.1016/j.cor.2022.105968 doi: 10.1016/j.cor.2022.105968
![]() |
[2] |
Y. Aneja, A. Noori, The optimality of (s, S) policies for a stochastic inventory problem with proportional and lump-sum penalty cost, J. Manag. Sci., 33 (1987), 750–755. https://doi.org/10.1287/mnsc.33.6.750 doi: 10.1287/mnsc.33.6.750
![]() |
[3] | R. Ash, Real analysis and probability, New York: Academic Press, 1972. https://doi.org/10.1016/C2013-0-06164-6 |
[4] |
Y. Barron, O. Baron, QMCD Approach for perishability models: The (S, s) control policy with lead time, IISE Trans., 52 (2020), 133–150. https://doi.org/10.1080/24725854.2019.1614697 doi: 10.1080/24725854.2019.1614697
![]() |
[5] | A. Bensoussan, Dynamic programming and inventory control, In: Studies in Probability, Amsterdam: IOS Press, 3 (2011). https://doi.org/10.3233/978-1-60750-770-3-i |
[6] | A. Bensoussan, M. A. Helal, V. Ramakrishna, Optimal policies for inventory systems with piecewise-linear concave ordering costs, Available at SSRN 3601262, 2020. https://dx.doi.org/10.2139/ssrn.3601262 |
[7] | D. Bertsekas, Dynamic programming and optimal control, 2 Eds., Athena scientific, 2012. |
[8] |
B. Chen, X. Chao, C. Shi, Nonparametric learning algorithms for joint pricing and inventory control with lost sales and censored demand, Math. Oper. Res., 46 (2021), 726–756. https://doi.org/10.1287/moor.2020.1084 doi: 10.1287/moor.2020.1084
![]() |
[9] | D. Cruz-Suárez, R. Montes-de-Oca, Uniform convergence of value iteration policies for discounted Markov decision processes, B. Soc. Mat. Mex., 12 (2006), 133–148. |
[10] |
D. Cruz-Suárez, R. Montes-de-Oca, F. Salem-Silva, Conditions for the uniqueness of optimal policies of discounted Markov decision processes, Math. Method. Oper. Res., 60 (2004), 415–436. https://doi.org/10.1007/s001860400372 doi: 10.1007/s001860400372
![]() |
[11] |
H. Daduna, P. Knopov, L. Tur, Optimal strategies for an inventory system with cost functions of general form, Cybern. Syst. Anal., 35 (1999), 602–618. https://doi.org/10.1007/bf02835856 doi: 10.1007/bf02835856
![]() |
[12] | E. Feinberg, D. Kraemer, Continuity of discounted values and the structure of optimal policies for periodic-review inventory systems with setup costs, Nav. Res. Log., 2023, 1–13. https://doi.org/10.1002/nav.22108 |
[13] | E. Feinberg, Optimality conditions for inventory control, Optim. Chall. Complex Netw. Risk. Syst. INFORMS, 2016, 14–45. https://doi.org/10.1287/educ.2016.0145 |
[14] |
E. Gordienko, O. Hernández-Lerma, Average cost Markov control processes with weighted norms: Value iteration, Appl. Math., 23 (1995), 219–237. https://doi.org/10.4064/am-23-2-219-237 doi: 10.4064/am-23-2-219-237
![]() |
[15] | A. Gut, Stopped random walks, Limit Theorems and Applications, 2 Eds., New York: Springer, 2009. https://doi.org/10.1007/978-0-387-87835-5 |
[16] |
X. Guo, Q. Zhu, Average optimality for Markov decision processes in Borel spaces: A new condition and approach, J. Appl. Probab., 43 (2006), 318–334. https://doi.org/10.1239/jap/1152413725 doi: 10.1239/jap/1152413725
![]() |
[17] | O. Hernández-Lerma, J. Lasserre, Discrete-time Markov control processes: Basic optimality criteria, New York: Springer Science & Business Media, 2012. |
[18] |
D. Iglehart, Optimality of (s, S) policies in the infinite horizon dynamic inventory problem, Manag. Sci., 9 (1963), 259–267. https://doi.org/10.1287/mnsc.9.2.259 doi: 10.1287/mnsc.9.2.259
![]() |
[19] |
D. Iglehart, Capital accumulation and production for the firm: Optimal dynamic policies, Manag. Sci., 12 (1965), 193–205. https://doi.org/10.1287/mnsc.12.3.193 doi: 10.1287/mnsc.12.3.193
![]() |
[20] |
D. Lindley, The theory of queues with a single server, Math. Proc. Cambridge, 48 (1952), 277–289. https://doi.org/10.1017/S0305004100027638 doi: 10.1017/S0305004100027638
![]() |
[21] | S. Meyn, R. Tweedie, Markov chains and stochastic stability, New York: Springer Science & Business Media, 1993. https://doi.org/10.1007/978-1-4471-3267-7 |
[22] |
S. Perera, S. Sethi, A survey of stochastic inventory models with fixed costs Optimality of (s, S) and (s, S) type policies discrete-time case, Prod. Oper. Manag., 32 (2022), 131–153. https://doi.org/10.1111/poms.13820 doi: 10.1111/poms.13820
![]() |
[23] |
E. Porteus, On the optimality of generalized (s, S) policies, Manag. Sci., 17 (1971), 411–426. https://doi.org/10.1287/mnsc.17.7.411 doi: 10.1287/mnsc.17.7.411
![]() |
[24] | W. Rudin, Principles of mathematical analysis (Vol. 3), New York: McGraw-hill (1976). |
[25] |
M. Schäl, On the optimality of (s, S)-policies in dynamic inventory models with finite horizon, SIAM J. Appl. Math., 30 (1976), 528–537. https://doi.org/10.1137/0130048 doi: 10.1137/0130048
![]() |
[26] |
S. Sethi, F. Feng, Optimality of (s, S) policies in inventory models with Markovian demand, Oper. Res., 45 (1997), 931–939. https://doi.org/10.1287/opre.45.6.931 doi: 10.1287/opre.45.6.931
![]() |
[27] |
O. Vega, R. Montes-de-Oca, Application of average dynamic programming to inventory systems, Math. Method. Oper. Res., 47 (1998), 451–471. https://doi.org/10.1007/bf01198405 doi: 10.1007/bf01198405
![]() |
[28] |
A. Veinott, H. Wagner, Computing optimal (s, S) inventory policies, Manag. Sci., 11 (1965), 525–552. https://doi.org/10.1287/mnsc.11.5.525 doi: 10.1287/mnsc.11.5.525
![]() |
[29] | X. Xu, S. P. Sethi, S. H. Chung, Ordering COVID-19 vaccines for social welfare with information updating: Optimal dynamic order policies and vaccine selection in the digital age, IISE, 2023, 1–28. https://doi.org/10.1080/24725854.2023.2204329 |
[30] |
H. Zhang, J. Zhang, R. Q. Zhang, Simple policies with provable bounds for managing perishable inventory, Prod. Oper. Manag., 29 (2020), 2637–2650. https://doi.org/10.2307/3214683 doi: 10.2307/3214683
![]() |
[31] |
Y. Zheng, A simple proof for optimality of (s, S) policies in infinite-horizon inventory systems, J. Appl. Probab., 28 (1991), 802–810. https://doi.org/10.2307/3214683 doi: 10.2307/3214683
![]() |
n | sn | Sn | Vn(x) | fn(x) |
1 | 49.79 | 53.89 | 1200.22 | 13.89 |
3 | 52.75 | 55.28 | 1438.65 | 14.80 |
5 | 55.72 | 55.75 | 1496.62 | 15.28 |
7 | 53.82 | 53.89 | 1497.04 | 15.75 |
8 | 53.82 | 53.89 | 1498.92 | 15.75 |