
To optimize the control of a queuing system with multiple classes of customers and multiple servers, we introduce a novel joint scheduling–control policy that includes customer admission control, service scheduling control, and service rate control. In this policy, any server can serve any class of customers; the service rate control for a server is a unique feature of this policy and is determined by the overall state of the system, not the state of a server or the class of customers it serves. Given the inherent complexity of the system′s equations and the difficulty of solving them directly, we apply diffusion approximation theory and consider the Halfin–Whitt heavy traffic regime. This approach yields a formally weak limit of the joint scheduling–control problem. This limit problem, which we call the diffusion control problem (DCP), is a stochastic differential equation (SDE). Next, we present the corresponding Hamilton–Jacobi–Bellman (HJB) equation and prove the existence and uniqueness of the solution to this equation. This solution is the optimal Markov policy for the diffusion control problem, and we use this solution to devise a policy for the original joint scheduling–control problem and prove its asymptotic optimality. We designed several experiments to compare the system′s performance and value functions under different control policies. Our designed joint scheduling–control policy has significant advantages in reducing the system′s cost and improving service efficiency.
Citation: Xiaolong Li. Asymptotic optimality of a joint scheduling–control policy for parallel server queues with multiclass jobs in heavy traffic[J]. AIMS Mathematics, 2025, 10(2): 4226-4267. doi: 10.3934/math.2025196
[1] | K. Jeganathan, S. Selvakumar, N. Anbazhagan, S. Amutha, Porpattama Hammachukiattikul . Stochastic modeling on M/M/1/N inventory system with queue-dependent service rate and retrial facility. AIMS Mathematics, 2021, 6(7): 7386-7420. doi: 10.3934/math.2021433 |
[2] | Yuejiao Wang, Chenguang Cai . Equilibrium strategies of customers and optimal inventory levels in a make-to-stock retrial queueing system. AIMS Mathematics, 2024, 9(5): 12211-12224. doi: 10.3934/math.2024596 |
[3] | Zhen Wang, Liwei Liu, Yuanfu Shao, Yiqiang Q. Zhao . Joining strategies under two kinds of games for a multiple vacations retrial queue with N-policy and breakdowns. AIMS Mathematics, 2021, 6(8): 9075-9099. doi: 10.3934/math.2021527 |
[4] | Shaojun Lan, Yinghui Tang . An unreliable discrete-time retrial queue with probabilistic preemptive priority, balking customers and replacements of repair times. AIMS Mathematics, 2020, 5(5): 4322-4344. doi: 10.3934/math.2020276 |
[5] | Rani Rajendiran, Indhira Kandaiyan . Transient scrutiny of MX/G(a,b)/1 queueing system with feedback, balking and two phase of service subject to server failure under Bernoulli vacation. AIMS Mathematics, 2023, 8(3): 5391-5412. doi: 10.3934/math.2023271 |
[6] | Alexander Dudin, Sergey Dudin, Rosanna Manzo, Luigi Rarità . Queueing system with batch arrival of heterogeneous orders, flexible limited processor sharing and dynamical change of priorities. AIMS Mathematics, 2024, 9(5): 12144-12169. doi: 10.3934/math.2024593 |
[7] | Ciro D'Apice, Alexander Dudin, Sergei Dudin, Rosanna Manzo . Study of a semi-open queueing network with hysteresis control of service regimes. AIMS Mathematics, 2025, 10(2): 3095-3123. doi: 10.3934/math.2025144 |
[8] | Bharathy Shanmugam, Mookkaiyah Chandran Saravanarajan . Unreliable retrial queueing system with working vacation. AIMS Mathematics, 2023, 8(10): 24196-24224. doi: 10.3934/math.20231234 |
[9] | Ciro D'Apice, Alexander Dudin, Sergei Dudin, Rosanna Manzo . Analysis of a multi-server retrial queue with a varying finite number of sources. AIMS Mathematics, 2024, 9(12): 33365-33385. doi: 10.3934/math.20241592 |
[10] | Jiaqi Qu, Yunlan Wei, Yanpeng Zheng, Zhaolin Jiang . Fast algorithms for a linear system with infinitesimal generator structure of a Markovian queueing model. AIMS Mathematics, 2025, 10(3): 6546-6559. doi: 10.3934/math.2025299 |
To optimize the control of a queuing system with multiple classes of customers and multiple servers, we introduce a novel joint scheduling–control policy that includes customer admission control, service scheduling control, and service rate control. In this policy, any server can serve any class of customers; the service rate control for a server is a unique feature of this policy and is determined by the overall state of the system, not the state of a server or the class of customers it serves. Given the inherent complexity of the system′s equations and the difficulty of solving them directly, we apply diffusion approximation theory and consider the Halfin–Whitt heavy traffic regime. This approach yields a formally weak limit of the joint scheduling–control problem. This limit problem, which we call the diffusion control problem (DCP), is a stochastic differential equation (SDE). Next, we present the corresponding Hamilton–Jacobi–Bellman (HJB) equation and prove the existence and uniqueness of the solution to this equation. This solution is the optimal Markov policy for the diffusion control problem, and we use this solution to devise a policy for the original joint scheduling–control problem and prove its asymptotic optimality. We designed several experiments to compare the system′s performance and value functions under different control policies. Our designed joint scheduling–control policy has significant advantages in reducing the system′s cost and improving service efficiency.
We consider a stochastic service system consisting of n servers and m classes of impatient customers and introduce a novel joint scheduling–control policy with three parts: customer admission control, service scheduling control, and service rate control (see Figure 1 and Section 2 for a detailed illustration).
Let k=1,2,⋯,m; in this case, Unk(⋅) represents the customer admission control process in Figure 1, and ξnk(⋅) represents the service scheduling control process. Let ξn=(ξn1,ξn2⋯ξnm); thus, Hnk(ξn(⋅)) represents the service rate control process. The arrivals of each customer class in the system follow a renewal process, represented by λnk in Figure 1. Any server can offer service to any customer, and the corresponding service rate depends on the overall state of the system, not only on the individual customer or the individual server, which is a unique feature of our proposed policy. Note that the vector ξn is used in the rate control process Hnk(⋅) for each type of service, not some component ξnk.
Our joint scheduling–control policy has three parts, which are dynamically adjusted to each other and combined with a cost function to optimize our system. For example, if too many customers of a certain type arrive, we can either stop letting them in (customer admission control), schedule a few more service stations to work for such customers (service scheduling control), or increase the service rate for such customers (service rate control). These three approaches can be used simultaneously, or one or two of them can be chosen in combination, which has to be chosen in conjunction with the cost function to optimize the system.
Our service discipline is not first in–first out (FIFO) but a priority model and is inclusive of both non-preemptive and preemptive resume policies (i.e., policies that either prevent or allow the service to be interrupted and the customer to return to the queue without completing the service). For example, if the cost of losing or waiting for a certain type of customer is particularly large, the service in progress can be interrupted so that this type of customer can be served quickly and the waiting time can be reduced.
Incoming customers are impatient, with random waiting times that they tolerate, shown as θnk in Figure 1. These waiting times are independent and identically distributed (i.i.d.) and are independent of the arrival and service processes. Customers face two situations in the system: being served or being in suspension. A customer departs if his or her service is completed, and abandons the queue if he or she has spent too much time waiting.
Of course, these added control components increase the cost, and the ultimate goal of this paper is to obtain an asymptotically optimal joint scheduling–control policy to show that the added control cost is worthwhile and that the total cost decreases. Our costs are linear or nonlinear expected cumulative discount functions of appropriately normalized performance metrics. Our costs include fixed components, such as the depreciation of equipment and basic salaries of personnel, as well as some dynamic components, such as the cost of idle servers, the cost of customer abandonment, the cost of customer queuing, the cost of customer access control, the cost of server scheduling control, and the cost of server rate control. Of course, in practice, they can be added or subtracted accordingly.
In contrast to the control policies proposed in existing studies (see, e.g., Bell and Williams [1], Harrison [2], Krichagina and Taksar [3], Kumar [4], Kushner and Chen [5], Martins et al. [6], Plambeck et al. [7], Van Mieghem [8], Weerasinghe [9] and Atar et al. [10]), our joint scheduling–control policy is designed more intelligently via a non-negative vector function of queue states and their corresponding ratios. This problem is new to the literature concerning the design of joint scheduling and control policies.
Given the inherent complexity of the system′s equations and the difficulty of solving them directly, we apply diffusion approximation theory and consider the Halfin–Whitt heavy traffic regime (see, e.g., Halfin and Whitt [11] and Jagerman [12]). This approach yields a formally weak limit of the joint scheduling control problem. This limit problem, which we call the diffusion control problem (DCP), is a stochastic differential equation (SDE). Next, we present the corresponding Hamilton–Jacobi–Bellman (HJB) equation and prove the existence and uniqueness of the solution to this equation. This solution is the optimal Markov policy for the diffusion control problem (cf. [13]), and we use this solution to devise a policy for the original joint scheduling control problem and prove its asymptotic optimality.
The effectiveness of the designed joint scheduling–control policy is illustrated by numerical examples, as shown in Figure 2. Compared with other policies, our designed joint scheduling–control policy has the lowest cost, as shown by the leftmost blue rectangle in Figure 2. The results show that our designed joint scheduling–control policy has significant advantages in reducing the system′s cost and improving service efficiency. The results of this numerical simulation not only confirm the rationality and practicality of the previous theoretical analysis but also provide a valuable reference for practical applications. A more detailed description of these numerical simulations and comparisons is presented in Section 3.
Therefore, determining how to establish the relationship between the physically optimal model and the limit diffusion-based optimal control system is the key to proving the asymptotic optimality of our designed policy. In doing so, we extend several techniques developed in Weerasinghe [9] to broaden our goal from a single customer class to multiple customer classes and from rate control alone to integrated server scheduling and rate control. This study also extends the standalone problem of scheduling (see, e.g., Arapostathis et al. [14,15] and Atar et al. [10]) to the joint scheduling and control problem developed in this paper. Additionally, instead of employing joint server scheduling and rate control methods, we consider related studies on joint admission and rate control, which can be found in Kocağa [16].
Dynamic scheduling and control for parallel server queues in the Halfin–Whitt regime have attracted interest in both academic studies and real-world applications (see, e.g., Sze [17], Arapostathis et al. [14,15], Armony et al. [18], Atar et al. [10], Dai [19], Garnett et al. [20]), Kocağa [16], and Weerasinghe [9,21]). Practical applications include modern telephone call centers, high-performance computer systems, cloud computing, and even quantum computing-based future communication systems and internet applications (see, e.g., Dai [22,23,24]).
More precisely, the aim of designing the policy is to choose a suitable number of servers while balancing the system′s inputs and outputs by considering a sequence of parallel server queues, which is diffusively scaled by the number n of servers. Furthermore, the traffic intensity associated with each n tends toward unity as n increases. Scheduling involves allocating servers to customers, and the associated rate control is conducted by introducing a vector of state-dependent feedback control functions involving rate perturbations. When the server is interpreted as a quantum qubit, this server number is the number of qubits required to build a quantum computer (see the recent breakthrough reported in Dai [19]). Our research is widely applied and up-to-date and can provide valuable references for practical operation.
This paper is organized as follows. In Subsections 2.1 and 2.2, we establish the mathematical equations of the stochastic model, specify the necessary assumptions, and perform the relevant scaling of the equations. In Subsection 2.3, we present the cost function of the original queueing system. To facilitate numerical simulation, a linear version is presented, and we prove that any cost function satisfying Assumption 3 can be used. The diffusion control problem is described in Subsection 2.4, and the corresponding HJB equation is described in Subsection 4.1. We present the main theorem (Theorem 2) of this paper in Subsection 2.5, which shows the asymptotic optimality of the joint scheduling control policy that we design. In Section 3, we use several experiments to compare the system′s performance and value functions under different control policies. Section 4 presents the proof of the theorem, Subsection 4.1 presents the proof of the existence and uniqueness of the solutions of the HJB equation, Subsection 4.2 presents the proof of asymptotic boundedness. Finally, we present the proof of Theorem 2 at the end of this paper in Subsection 4.3.
The study in this section is based on the model of parallel server queues that is presented in Figure 1 and explained in the introduction of this paper. First, define M={1,2,⋯,m} for an integer m∈N={1,2,⋯}, and let
Sm={x∈Rm+;m∑k=1xk=1}, | (2.1) |
where Rm+=[0,∞)m. Furthermore, all related m-dimensional vector stochastic processes are assumed to be in the Skorohod topological space D(Rm) of right-continuous functions with left limits and to have the Skorohod topology (see, e.g., Ethier and Kurtz [25]). In addition, we use "⇒" to represent the weak convergence of the processes in this space.
In this subsection, we specify mathematical expressions for the queuing state, the service state, the arrival process, the service process, and the abandonment process. In this way, we outline a basic form of the system. Importantly, our design of the service rate, which depends on the overall state of the system and not on a single customer or a single server, is a unique feature of our policy.
For each n∈N, sequences of stochastic processes in the QED (both Quality and Efficiency driven) regime, all the related processes indexed by n are assumed to be defined in a complete probability space (Ω,F,P) that may depend on n, where n denotes the number of servers. We also need to assume that the expectation of P is denoted by E and that t≥0 (appearing below) represents the time. At time t≥0, for each k∈M, the number of customers of class k queuing in buffer k is denoted by Φnk(t), which has the corresponding vector-form expression
Φn(t)=(Φn1(t),Φn2(t),⋯,Φnm(t))′. | (2.2) |
The prime symbol used in (2.2) denotes the transpose of a vector. Similarly, Ψnk(t) represents the number of customers of class k being served at time t, which has the corresponding vector-form expression
Ψn(t)=(Ψn1(t),Ψn2(t),⋯,Ψnm(t))′. | (2.3) |
Clearly, the processes in (2.2)-(2.3) are integer-valued vector processes, and
Φn(t),Ψn(t)∈Rm+,∑k∈MΨnk(t)≤nfor eacht≥0. | (2.4) |
Let Φn(0) and Ψn(0) represent the corresponding initial values. Here, we assume that they are predetermined. Let Xnk(t) represent the total number of customers of class k in the system at time t, and let Xn(t) denote the corresponding vector form. Thus, we have
Xn(t)=Φn(t)+Ψn(t). | (2.5) |
The service times of class k customers for each k∈M are assumed to be exponentially distributed with the rate function μnk(⋅,⋅), and each server is assumed to obey a work-conserving rule (i.e., a no-idling rule). Furthermore, the rate μnk(⋅,⋅) is designed to be more realistic and efficient than in previous works, as follows. We first use the queue state to design the server scheduling policy, i.e., to determine the number of different classes of customers to be served simultaneously by n servers, based on the proportion of customers in different classes of queues at time t. This step can be expressed as
Ψnk(t)=nξnk(Xnt), | (2.6) |
where ξnk(⋅):Rm+⟶R+ is the proportion of customers in class k to all customers in the system. Since Ψnk(t) must be an integer, we specify (2.6) as follows:
nξnk(Xnt)={[nXnk(t)∑mi=1Xni(t)],k=1,2,⋯,m−1,n−∑m−1i=1nξni(Xnt),k=m. | (2.7) |
In what follows, we use ξn(Xnt)=(ξn1(Xnt),⋯,ξnm(Xnt))′∈Sm to denote the corresponding m-dimensional ratio vector. Thus, the non-negative rate function μnk(⋅,⋅) can be given as μnk(ξn(Xnt),Xnt) for time t. Hence, if we use Snk(⋅) to denote the standard Poisson process, at time t, the total number of class k customers completing the service can be expressed as
Snk(∫t0μnk(ξn(Xns),Xns)⋅Ψnk(s)ds). | (2.8) |
The server rate control and server scheduling control in our joint scheduling–control policy are contained in μnk(ξn(Xnt),Xnt). In addition, we denote customer admission control as Lnk(t),k∈M. Their exact form will be given in Assumption 2 in Section 2.2.
The arrival processes {Ank,k∈M} are assumed to be mutually independent renewal processes. This assumption is common in the literature (see, e.g., Atar et al. [10]) and is realistic. It can be described as follows. For each k∈M, let {ˆτk(j),j∈N} be a strictly positive i.i.d. sequence of random variables with the mean E[ˆτk(1)]=1 and a squared coefficient of variation
C2τ,k=Var(ˆτk(1))(E[ˆτk(1)])2∈[0,∞). |
Furthermore, we assume that there is a constant N≥2 such that E(ˆτk(1))N<∞. We can then define
˜τnk(j)=1λnkˆτnk(j), | (2.9) |
where λnk>0 is the arrival rate of customers of class k. The notation in (2.9) is interpreted as follows: ˜τnk(1) represents the first arrival time of class k customers, and ˜τnk(a) for each a∈{2,3,...} represents the amount of time between the (a−1)th and the ath arrival of the kth customer. With the convention that ∑01=0, we define the number of arrivals of class k customers up to time t as follows:
Ank(t)=sup{a≥0:a∑j=1˜τnk(j)≤t},k∈M,t≥0. | (2.10) |
In addition, we assume that the rate at which class-k customers abandon their queue is a constant denoted by θnk∈[0,∞). Let {Rnk,k∈M} be standard Poisson processes independent of each other and of the process {Ank,Snk,k∈M}. Then, the number of abandonments of queue k up to time t is given by
Rnk(θnk∫t0Φnk(s)ds),k∈M,t≥0. | (2.11) |
To cover both non-preemptive and preemptive resume policies (i.e., policies that either prevent or allow the service to be interrupted and the customer to return to the queue without completing the service), we introduce the corresponding process Dnk for each k∈M. This process has three properties: (i) Initially, it takes the value of zero, i.e., Dnk(0)=0; (ii) it increases by 1 when a customer of class k is allocated to a server; and (iii) it decreases by 1 when a customer of class k returns to the queue. We can then model the associated dynamic queueing system as
{Φnk(t)=Φnk(0)+Ank(t)−Dnk(t)−Rnk(˜Enk(t))−Lnk(t),Ψnk(t)=Ψnk(0)+Dnk(t)−Snk(Enk(t)),k∈M,t≥0, | (2.12) |
where
Enk(t)=∫t0μnk(ξn(Xns),Xns)⋅Ψnk(s)ds,˜Enk(t)=θnk∫t0Φnk(s)ds. |
Furthermore, an induced filtration Fn={Fnt,t≥0} with the corresponding sigma algebra Fnt is given by
Fnt=σ{Ank(s),Snk(Enk(s)),Rnk(˜Enk(s)),Φnk(s),Ψnk(s),Xnk(s):s≤t,k∈M}. | (2.13) |
It is obvious that Xnk(t) is adapted to filtration Fn and that Fnt is the information available at time t. In addition, let τnk(t) be the first arrival time in queue k that is not earlier than time t, i.e.,
τnk(t)=inf{a≥t:Ank(a)−Ank(a−)>0},k∈M. | (2.14) |
We can then introduce the corresponding information field as follows:
Gnt=σ{Ank(τnk(t)+a)−Ank(τnk(t)),Snk(Enk(t)+a)−Snk(Enk(t)),Rnk(˜Enk(t)+a)−Rnk(˜Enk(t)):a≥0,k∈M}. | (2.15) |
On the basis of this notation, we introduce the concept of an admissible policy.
Definition 1. For each k and t, a joint scheduling–control policy is admissible if the following three properties hold:
(1) Fnt is independent of Gnt;
(2) Snk(Enk(t)+a)−Snk(Enk(t)) is equal in law to Snk(a);
(3) Rnk(˜Enk(t)+a)−Rnk(˜Enk(t)) is equal in law to Rnk(a).
We first make the following heavy- traffic assumption (cf. [11,20,26,27]) for our subsequent study.
Assumption 1. There are constants λ0k∈(0,∞), μ0k∈(0,∞), and ˉλk∈(0,∞) for each k∈M such that
m∑k=1λ0kμk0=1, |
and, as n→∞,
λnkn⟶λ0k,n1/2(n−1λnk−λ0k)⟶ˉλk. |
Second, for all t≥0, we introduce the related scaling processes as follows. Let ˆAnk(⋅) be the process obtained by diffusion scaling and centering of Ank(⋅), which takes the form
ˆAnk(t)=Ank(t)−λnkt√n. | (2.16) |
Furthermore, let ˉΦnk and ˉΨnk be fluid-scaled processes defined by
ˉΦnk(t)=Φnk(t)nandˉΨnk(t)=Ψnk(t)n. | (2.17) |
Now, let
γ0k=λ0kμ0kandγ0=(γ01,γ02,⋯,γ0m)′. | (2.18) |
We can then introduce the corresponding diffusion-scaled processes as
ˆΦnk(t)=Φnk(t)√nandˆΨnk(t)=Ψnk(t)−nγ0k√n. | (2.19) |
In addition, there are diffusion-scaled processes as follows:
ˆDnk(t)=Dnk(t)−nλ0kt√n, | (2.20) |
ˆSnk(t)=Snk(nt)−nt√n, | (2.21) |
ˆRnk(t)=Rnk(nt)−nt√n. | (2.22) |
We then define
ˆXnk(t)≜ˆΦnk(t)+ˆΨnk(t)=Xnk(t)−γ0kn√n, | (2.23) |
and
ˆXn(t)=(ˆXn1(t),ˆXn2(t),⋯,ˆXnm(t))′. | (2.24) |
Next, we define
ˆξnk(ˆXnt)=ξnk(√nˆXnt+nγ0), | (2.25) |
where ˆξnk(⋅):Rm⟶R+ can be approximated as the ratio of k class customers among all customers in the system of ˆXn(t). Let ˆξn(ˆXnt)=(ˆξn1(ˆXnt),⋯,ˆξnm(ˆXnt))′. From (2.25), we can obtain
m∑k=1ˆξnk(ˆXnt)=1,ˆξnk(ˆXnt)=ξnk(Xnt). | (2.26) |
We can then denote ˆΨnk(t) for each k∈M as
ˆΨnk(t)=Ψnk(t)−nγ0k√n=nξnk(Xnt)−nγ0k√n=√n[ˆξnk(ˆXnt)−γ0k]. | (2.27) |
Remark 1. Note that Xn(t)∈Rm+, but ˆXn(t)∈Rm. The reason is that, in some cases, for a fixed ˜k∈M, Φn˜k(t)=0, and Ψn˜k(t)<nγ0˜k, we have ˆΦn˜k(t)=0 and ˆΨn˜k(t)<0. Consequently, ˆXn˜k(t)<0, and so ˆXn(t)∈Rm.
With the notation above, (2.12) can be rewritten as follows:
{ˆΦnk(t)=ˆΦnk(0)+ˆAnk(t)−ˆDnk(t)−ˆRnk(ˉ˜Enk(t))−ˆ˜Enk(t)+n1/2(n−1λnk−λ0k)t−Lnk(t),ˆΨnk(t)=ˆΨnk(0)+ˆDnk(t)−ˆSnk(1nEnk(t))+√nλ0kt−1√nEnk(t), | (2.28) |
where
ˉ˜Enk(t)=θnk∫t0ˉΦnk(s)ds,ˆ˜Enk(t)=θnk∫t0ˆΦnk(s)ds. |
Assumption 2. For each k∈M, there are non-negative bounded functions Unk(⋅):Rm⟶R+ that satisfy Unk(⋅)=0 when ξnk(Xnt)=0 such that
Lnk(t)=μ0kUnk(Xnt−nγ0√n) | (2.29) |
and there are non-negative bounded functions Hnk(⋅):Sm⟶R+ that satisfy Hnk(⋅)=0 when ξnk(Xnt)=0 such that
μnk(ξn(Xnt),Xnt)={μ0k[1+Hnk(ξn(Xnt))√nξnk(Xnt)],ξnk(Xnt)>0,μ0k,ξnk(Xnt)=0, | (2.30) |
where μ0k>0 for all k∈M are constants called the basic service rates defined in Assumption 1 (cf. Weerasinghe [9]); ˆξnk(⋅), defined in (2.25), represents the server scheduling control; Unk(⋅) represents customer admission control; and Hnk(⋅) represents the server rate control.
Remark 2. A typical example of the rate control law for μnk(⋅,⋅) in (2.8) can be designed by generalizing the discussion in Weerasinghe [9] for the single-class case to our multiclass case. The function μnk(⋅):Rm⟶R+ can take the following basic form:
μnk(x)={μ0k[1+1√nUnk(x√n)],x∈Rm+,μ0k,otherwise. | (2.31) |
Here μ0k for all k∈M is called the basic service rates.
Remark 3. Based on Assumption 2, (2.6), and (2.26), 1√nEnk(t) in (2.28) can be written as
1√nEnk(t)=1√n∫t0μnk(ξn(Xns),Xns)⋅Ψnk(s)ds | (2.32) |
=1√n∫t0μ0k[1+Hnk(ˆξn(Xnt−nγ0√n))√nξnk(Xnt)]⋅nξnk(Xns)ds=√nμ0k∫t0ξnk(Xns)ds+μ0k∫t0Hnk(ˆξn(ˆXns))ds=√nμ0k∫t0ˆξnk(ˆXns)ds+μ0k∫t0Hnk(ˆξn(ˆXns))ds. | (2.33) |
According to the notation introduced in Assumptions 1 and 2, from (2.28), we can obtain
ˆXnk(t)=xnk+σkˆWnk(t)+ˆλnkt−θnk∫t0ˆXnk(s)ds−√n(μ0k−θnk)∫t0ˆξnk(ˆXns)ds−μ0k∫t0Unk(ˆXns)ds−μ0k∫t0Hnk(ˆξn(ˆXns))ds, | (2.34) |
where
σkˆWnk(t)=ˆAnk(t)−ˆSnk(1nEnk(t))−ˆRnk(ˉ˜Enk(t)),σk=(λkC2τ,k+λk)1/2,ˆλnk=n−1/2λnk−n1/2γ0kθnk,xnk=ˆXnk(0). |
Remark 4. Now, we show that (2.34) includes the case of ξnk(Xnt)=0.
First, when ξnk(Xnt)=0, we have Ψnk(t)=0, ˆΨnk(t)=−√nγ0k, ˆξnk(ˆXnt)=0, Enk(t)=0, and ˆSnk(1nEnk(t))=0. Substituting these values into (2.28), we obtain
ˆXnk(t)=xnk+ˆAnk(t)−ˆRnk(ˉ˜Enk(t))+n−1/2λnkt−θnk∫t0ˆΦnk(s)ds=xnk+ˆAnk(t)−ˆRnk(ˉ˜Enk(t))+n−1/2λnkt−θnk∫t0[ˆXnk(s)−ˆΨnk(s)]ds=xnk+ˆAnk(t)−ˆRnk(ˉ˜Enk(t))+n−1/2λnkt−θnk∫t0ˆXnk(s)ds+θnk∫t0√n(0−γ0k)ds=xnk+ˆAnk(t)−ˆRnk(ˉ˜Enk(t))+n−1/2λnkt−θnk∫t0ˆXnk(s)ds−√nθnkγ0kt. | (2.35) |
Second, when ξnk(Xnt)=0, (2.34) can be written as
ˆXnk(t)=xnk+ˆAnk(t)−ˆRnk(ˉ˜Enk(t))+(n−1/2λnk−n1/2γ0kθnk)t−θnk∫t0ˆXnk(s)ds. | (2.36) |
The two equations above have the same form, verifying that (2.34) includes the case of ξnk(Xnt)=0.
The corresponding vector form of (2.34) can be written as
ˆXn(t)=xn+σˆWn(t)+∫t0bn(ˆXns,Uns,ˆξns,Hns)ds, | (2.37) |
where xn=(xn1,⋯,xnm)′, σ=diag(σk,k∈M) (here, "diag" means "diagonal matrix"), ˆWn(t)=(ˆWn1(t),...,ˆWnm(t))′, and
bn(ˆXn,Un,ˆξn,Hn)=ˆλn−θnˆXn−√n(μ0−θn)ˆξn(ˆXn)−μ0Un(ˆXn)−μ0Hn(ˆξn(ˆXn)). | (2.38) |
Furthermore, the terms used in (2.39) have the following expressions:
ˆλn=(ˆλn1,⋯,ˆλnm)′,θn=diag(θnk,k∈M),μ0=diag(μ0k,k∈M),ˆξn(ˆXn)=(ˆξn1(ˆXn),⋯,ˆξnm(ˆXn))′,Un(ˆXn)=(Un1(ˆXn),⋯,Unm(ˆXn))′,Hn(ˆξn(ˆXn))=(Hn1(ˆξn(ˆXn)),⋯,Hnm(ˆξn(ˆXn)))′. |
At this point, our system′s equations are fully developed. In this section, we scale the system equations, present some necessary heavy traffic assumptions, and, most importantly, present the specific form of the control function μnk(ξn(Xnt),Xnt). Next, we provide the cost and value functions so that our original physical model is fully described.
Let (xn,ˆXn,Un,ˆξn,Hn)n≥1 be as described in Section 2.2; the corresponding cost function can be defined as
Jn(xn,ˆXn,Un,ˆξn,Hn)=E∫∞0e−αtC(ˆXnt,Unt,ˆξnt,Hnt)dt, | (2.39) |
where C:Rm×Rm×Rm×Sm↦R+ is a non-negative continuous function as follows:
C(ˆXnt,Unt,ˆξnt,Hnt)=cn+m∑k=1pnkˆXnk(t)+m∑k=1snkˆξnk(t)+m∑k=1qnkUnk(t)+m∑k=1hnkHnk(t), | (2.40) |
where cn,pnk, snk, qnk, and hnk are constants. This equation can be interpreted as follows. At time t, the number of idle servers can be represented by n−m∑k=1Ψnk(t); subsequently, the idle server cost can be represented by −m∑k=1c1kˆΨnk(t), where c1k is a constant. The abandonment cost can be represented by m∑k=1c2kθnkˆΦnk(t), where c2k is a constant and θnk is the rate at which class k customers abandon their queue, as defined in Section 2.1. The delay cost can be represented by m∑k=1c3kˆΦnk(t), where c3k is a constant. These three costs can be unified in the form of m∑k=1pnkˆXnk(t), which we call the queuing cost. m∑k=1snkˆξnk(t) represents the corresponding server scheduling costs, m∑k=1qnkUnk(t) represents the costs of customer admission control, and m∑k=1hnkHnk(t) represents the costs of server rate control. Fixed costs, such as equipment depreciation and the basic salary of personnel, can also be considered; here, they are denoted by cn, which is a constant.
Many cost factors need to be considered in practical applications. We provide a few typical examples here. The cost factors to be considered can be increased or decreased according to the specific situation. We will show later that any cost function can be used as long as it satisfies the following assumption.
Assumption 3. We assume that function C satisfies the following:
(1) There is δ∈(0,1), and for any compact G⊂Rm, there is c0 depending only on G such that
|C(x,U,ˆξ,H)−C(y,U,ˆξ,H)|≤c0‖x−y‖δ, | (2.41) |
holds for (U,ˆξ,H)∈Rm×Rm×Sm and x,y∈G.
(2) Constants c1>0 and kC≥0 exist such that
C(x,U,ˆξ,H)≤c1(1+‖x‖kC), | (2.42) |
where ‖x‖=∑mk=1|xk|, (U,ˆξ,H)∈Rm×Rm×Sm, and x∈Rm.
Definition 2. Let Dn be the collection of all such admissible (ˆXn,Un,ˆξn,Hn); the value function Vn is defined by
Vn=infJn(xn,ˆXn,Un,ˆξn,Hn), | (2.43) |
where the infimum is taken over all available processes (ˆXn,Un,ˆξn,Hn) in Dn.
In this section, we present a diffusion control problem that can be regarded as a limiting form of the stochastic model described in Section 2.2. The use of the Brownian system as a heavy traffic approximation of a queuing system has been a long-standing idea, and the reader is referred to [3] and [28] for a list of comprehensive references.
Let us define a complete probability space, (Ω,F,P), where the expectation with respect to P is denoted by E; in this space, all the stochastic processes below are defined. First, we make two assumptions to facilitate obtaining the corresponding diffusion control problem from the previously established joint scheduling–control problem.
Assumption 4. As n→∞, for each k∈M, there are constants θ0k∈[0,∞), ck∈[0,∞) and λk∈R such that
ˆλnk→λk,θnk→θ0k,√n(μ0k−θnk)→ck. |
Assumption 5. As n→∞, for each k∈M, t≥0, there are constants ϕk∈[0,∞), ψk∈R such that
ˆΦnk(0)→ϕk,ˆΨnk(0)→ψk, |
where
ˆΦnk(0)=Φnk(0)√n,ˆΨnk(0)=Ψnk(0)−γ0kn√n, |
and ∑Mψk≤0.
Next, let xk=ϕk+ψk; thus, we consider a controlled state process Xk(t), which is a weak solution to the following equation:
Xk(t)=xk+σkWk(t)+λkt−θ0k∫t0Xk(s)ds−ck∫t0ξk(Xs)ds−μ0k∫t0Uk(Xs)−μ0k∫t0Hk(ξ(Xs))ds, | (2.44) |
where Uk(⋅):Rm⟶R+, Hk(⋅):Sm⟶R+, ξk(⋅):Rm⟶R+, ∑mk=1ξk(⋅)=1 and
Wk(t)=σ−1kAk(t),Xs=(X1(s),⋯,Xm(s))′,ξ(Xs)=(ξ1(Xs),⋯,ξm(Xs))′. |
Now, we can obtain
X(t)=x+σW(t)+∫t0b(Xs,Us,ξs,Hs)ds, | (2.45) |
where x=(x1,⋯,xm)′, σ=diag(σk,k∈M), W(t)=(W1(t),⋯,Wm(t))′, and
b(X,U,ξ,H)=λ−θ0X−cξ(X)−μ0U(X)−μ0H(ξ(X)). | (2.46) |
Furthermore, the terms used in (2.46) have the following expressions:
λ=(λ1,⋯,λm)′,θ0=diag(θ0k,k∈M),c=diag(ck,k∈M),ξ(X)=(ξ1(X),⋯,ξm(X))′,U(X)=(U1(X),⋯,Um(X))′,H(X)=(H1(ξ(X)),⋯,Hm(ξ(X)))′. |
Assumption 6. To ensure that Xk(⋅) in (2.44) does not explode, we assume that there exists a constant M≥0 such that
0≤Uk(⋅)≤M, | (2.47) |
0≤Hk(⋅)≤M, | (2.48) |
where k∈M.
Now, we introduce the following cost function for the state Eq (2.45):
J(x,X,U,ξ,H)=Eπx∫∞0e−αtC(Xt,Ut,ξt,Ht)dt, | (2.49) |
where C:Rm×Rm×Rm×Sm↦R+ is a non-negative continuous function as follows:
C(Xt,Ut,ξt,Ht)=c+m∑k=1pkXk(t)+m∑k=1skξk(t)+m∑k=1qkUk(t)+m∑k=1hkHk(t), | (2.50) |
where c,pk, sk, qk, and hk are constants.
In Section 3, we present several numerical examples in which the cost functions refer to (2.50). The costs presented here are not used in every example. We provide this form to facilitate some of the numerical simulations given in Section 3. Practical applications can be modified to suit a given situation, and we will show later that any cost function that satisfies Assumption 3 is usable.
Definition 3. Given an initial state x, we call
π=(Ω,F,(Ft),P,U,ξ,H,W) |
an admissible control system if
(1) (Ω,F,(Ft),P) is a complete filtered probability space;
(2) (U,ξ,H)∈Rm×Rm×Sm is an F-measurable, (Ft)-progressively measurable process;
(3) W is standard m-dimensional (Ft)-Brownian motion.
Definition 4. Let D be the set of all such admissible (X,U,ξ,H). The value function for the control problem is defined as follows:
V(x)=infπ∈ΠJ(x,X,U,ξ,H), | (2.51) |
where the infimum is taken over all available processes (X,U,ξ,H) in D.
Theorem 1. Let the system equation be (2.45) and the cost function be (2.49); then, there is a Markov control policy that is optimal for all x∈Rm.
Theorem 1 is an integral part of Theorem 3, which states that the value function V is the unique solution of the corresponding HJB equation. For this reason, Theorem 1 holds without proof as long as Theorem 3 is proved later. We present Theorem 1 here to facilitate the construction of an asymptotically optimal policy for our original joint scheduling–control problem in the next subsection.
In this section, we present an asymptotically optimal control policy for the original problem, which is defined by the optimal Markov control policy described in Theorem 1.
To state this result, we need to introduce a Markov control policy, namely ˜η=(˜Un,˜ξn,˜Hn), where ˜Un=U∗,˜ξn=T(ξ∗),˜Hn=H∗, T is a measurable map that guarantees that the components of Ψn(t) are integer-valued, and η∗=(U∗,ξ∗,H∗) is the optimal Markov control policy described in Theorem 1. Let Ψn,∗=nξ∗ and ˜Ψn=n˜ξn=nT(ξ∗), and define
˜Ψnk={[Ψn,∗k],k=1,2,⋯,m−1,Ψn,∗k+m−1∑i=1(Ψn,∗i−[Ψn,∗i]),k=m. | (2.52) |
Clearly, m∑k=1Ψn,∗i=m∑k=1˜Ψnk=n and ˜Ψnk∈Z+. Moreover, ∥˜Ψn−Ψn,∗∥≤2m.
Now, for a given initial condition xn and a stochastic sequence system, we write
V_n=lim infn→∞Jn(xn,ˆXn,Un,ˆξn,Hn), | (2.53) |
where Jn is defined in (2.39) and represents the cost function for the joint scheduling–control problem.
Theorem 2. Let (xn,ˆXn,Un,ˆξn,Hn)n≥1 be any sequence of the stochastic system described in Section 2.2, let (Jn(xn,ˆXn,Un,ˆξn,Hn))n≥1 be the associated cost function defined in (2.39) that satisfies Assumption 3, and let Xnk≥˜Ψnk(k∈M). We thus obtain
limn→∞E∫∞0e−αtC(ˆXn(t),˜Un(t),˜ξn(t),˜Hn(t))dt≤V_n. |
Theorem 2 is the main theorem of this paper, which states the asymptotic optimality of the joint scheduling–control policy ˜η=(˜Un,˜ξn,˜Hn) by using the optimal Markov policy η∗=(U∗,ξ∗,H∗). Since the proof of this theorem is complicated and requires the assistance of lemmas, it is presented in Section 4.
In this section, we present several comparative experiments to compare the system′s performance and value functions under different control policies. The results show that our designed joint scheduling–control policy has significant advantages in reducing the system′s cost and improving service efficiency. This numerical simulation not only confirms the rationality and practicality of the previous theoretical analysis but also provides a valuable reference for practical applications.
The diffusion control problem is a stochastic differential equation, which can be regarded as the limiting case of our original problem. U, ξ, and H in (2.46) can be regarded as the customer admission control function, the server scheduling control function, and the server rate control function, respectively, in the diffusion control problem. Therefore, in this section, we provide a specific example of the diffusion control problem and some corresponding examples to illustrate the superiority of our model.
Example 1 (Two-class model with our joint scheduling–control policy). For a numerical example, we refer to (2.44) and (2.45) in Section 2.4, with m=2, and consider a system of our joint scheduling–control policy with two customer classes as follows:
{dx1=[λ1−θ01x1−c1ξ1(x1,x2)−μ01U1(x1,x2)−μ01H1(ξ1(x1,x2),ξ2(x1,x2))]dt+σ1dW1,dx2=[λ2−θ02x2−c2ξ2(x1,x2)−μ02U2(x1,x2)−μ02H2(ξ1(x1,x2),ξ2(x1,x2))]dt+σ2dW2, | (3.1) |
as well as the following parameter values:
λ1=1.00,θ01=0.01,c1=0.98,μ01=1.00,σ1=1.00, |
λ2=0.50,θ02=0.02,c2=1.96,μ02=2.00,σ2=0.80, |
where λk,k=1,2 can be regarded as the arrival rate of customers of class k, θk,k=1,2 can be regarded as the abandon rate of customers of class k, ck,k=1,2 can be regarded as the server scheduling control rate of customers of class k, μ0k,k=1,2 can be regarded as the basic service rates defined in Assumption 1, and σk,k=1,2 can be regarded as the diffusion coefficient defined in (2.34).
Now, we refer to Eq (2.50) to define the relevant cost function of this model as follows:
C12(Xt,Ut,ξt,Ht)=1+p1x1+p2x2+s1ξ1(x1,x2)+s2ξ2(x1,x2)+h1H1(ξ1(x1,x2),ξ2(x1,x2))+h2H2(ξ1(x1,x2),ξ2(x1,x2))+q1U1(x1,x2)+q2U2(x1,x2), | (3.2) |
with the parameter values
p1=0.50,s1=0.25,q1=0.25,h1=0.25, |
p2=1.00,s2=1.00,q2=1.00,h2=1.00. |
The constant 1 in (3.2) represents the fixed costs, such as equipment depreciation and the basic salary of personnel; pkxk,k=1,2 represents the queuing costs; skξk,k=1,2 represents the corresponding server scheduling costs; qkUk,k=1,2 represents the costs of customer admission control; and hkHk,k=1,2 represents the costs of server rate control.
To facilitate calculation, we refer to Eq (2.51) in Section 2.4 and define a deformation of the value function as follows:
V12=min∫10C12(Xt,Ut,ξt,Ht)dt. | (3.3) |
Through the HJB equation, we obtain an optimal policy, i.e., the server scheduling control function, the customer admission control function, and the server rate control function, as follows:
ξ1(x1,x2)=x1/(x1+x2),ξ2(x1,x2)=x2/(x1+x2), |
U1(x1,x2)=arctan(x1/x2),U2(x1,x2)=arctan(x2/x1), |
H1(ξ1(x1,x2),ξ2(x1,x2))=ex1−x2/x1+x2,H2(ξ1(x1,x2),ξ2(x1,x2))=ex2−x1/x1+x2. |
Given that x1(0)=2 and x2(0)=3, we use the Euler–Maruyama (EM) method, which is a standard numerical tool for solving SDEs. With MATLAB (cf. [29], [30], [31]), we obtain V12=4.0850. The plot of this two-class model (Figure 3) shows that the sizes of the two customer classes are more balanced because of customer admission control, server scheduling control, and server rate control. Cases in which the number of one type of customer is large and the number of the other type is small are relatively rare. Additionally, the numbers of both classes of customers decline in the later period. The plot of the cost function C12=C12(Xt,Ut,ξt,Ht) (Figure 4) shows that the cost function of this two-class model increases rapidly in the initial stage but then continues to decline. These two figures tentatively illustrate that the joint scheduling–control policy we have devised is effective.
Below, we present five other models (i.e., without control policy, with customer admission control only, with server scheduling control only, with customer admission control and server scheduling control, and with server scheduling control and server rate control) to illustrate the superiority of our joint scheduling–control policy.
Example 2 (Two-class model without control policy). We present a system without any control as follows:
{dx3=[λ1−θ01x3]dt+σ1dW1,dx4=[λ2−θ02x4]dt+σ2dW2. | (3.4) |
We use the following parameter values:
λ1=1.00,θ01=0.01,σ1=1.00, |
λ2=0.50,θ02=0.02,σ2=0.80. |
Given x3(0)=2 and x4(0)=3, we can obtain an operational image of this system (Figure 5), where the total trend of the number of customers in both categories is increasing. This trend shows that without the control function, the system is very inefficient, and customers will increasingly accumulate.
Next, we define the corresponding cost function of this example as follows:
C34(Xt)=1+p1x3+p2x4, | (3.5) |
where p1=0.50 and p2=1.00; the plot of the cost function C34=C34(Xt) (Figure 6) shows that the cost of this model continues to increase, which is not desirable.
The value function is
V34=min∫10C34(Xt)dt=4.9069. | (3.6) |
A comparison of Figures 3 and 5 reveals that the sizes of the two classes of customers in the model of our joint scheduling–control policy are more balanced because of customer admission control, server scheduling control, and server rate control. Cases in which the number of one type of customer is large and the number of the other type is small are relatively rare, as shown in Figure 3. In Figure 3, the numbers of both classes of customers decline in the later period, while in Figure 5, the decline is not obvious. A comparison of Figures 4 and 6 reveals that the cost function of our model increases rapidly in the initial stage but then continues to decline. However, the cost function in Figure 6 does not decline significantly after increasing to a certain extent. The size of the value function is more convincing: V12=4.0850, V34=4.9069. We compare the value functions of the five examples in Figure 2.
Example 3 (Two-class model with customer admission control only). We present a system with customer admission control only as follows:
{dx5=[λ1−θ01x5−μ01U1(x5,x6)]dt+σ1dW1,dx6=[λ2−θ02x6−μ02U2(x5,x6)]dt+σ2dW2, | (3.7) |
with the parameter values
λ1=1.00,θ01=0.01,μ01=1.00,σ1=1.00, |
λ2=0.50,θ02=0.02,μ02=2.00,σ2=0.80. |
We define the customer admission control functions U1 and U2 in the same way as before; we assume that x5(0)=2 and x6(0)=3 are given.
We define the relevant cost function of this example as follows:
C56(Xt,Ut)=1+p1x5+p2x6+q1U1(x5,x6)+q2U2(x5,x6), | (3.8) |
where p1=0.50, p2=1.00, q1=0.25, and q2=1.00.
The value function is
V56=min∫10C56(Xt,Ut)dt=5.7043. | (3.9) |
Moreover, as seen from the system's operating diagram (Figure 7) and the image of the cost function (Figure 8), this model is clearly not as effective as the one that uses our joint scheduling–control policy.
Example 4 (Two-class model with server scheduling control only). We present a system with server scheduling control only as follows:
{dx7=[λ1−θ01x7−c1ξ1(x7,x8)]dt+σ1dW1,dx8=[λ2−θ02x8−c2ξ2(x7,x8)]dt+σ2dW2, | (3.10) |
with the parameter values
λ1=1.00,θ01=0.01,c1=0.98,σ1=1.00, |
λ2=0.50,θ02=0.02,c2=1.96,σ2=0.80. |
We define the server scheduling control functions ξ1 and ξ2 as previously described and assume that x7(0)=2 and x8(0)=3 are given.
Now, we define the relevant cost function of this model as follows:
C78(Xt,ξt)=1+p1x7+p2x8+s1ξ1(x7,x8)+s2ξ2(x7,x8), | (3.11) |
with p1=0.50, p2=1.00, s1=0.25, and s2=1.00.
The value function is
V78=min∫10C78(Xt,ξt)dt=5.5537. | (3.12) |
Moreover, as seen from the system's operating diagram (Figure 9) and the image of the cost function (Figure 10), this model is clearly not as effective as the one that uses our joint scheduling–control policy.
Example 5 (Two-class model with customer admission and server scheduling control). We present a system with customer admission control and server scheduling control as follows:
{dx1=[λ1−θ01x9−c1ξ1(x9,x10)−μ01U1(x9,x10)]dt+σ1dW1,dx2=[λ2−θ02x10−c2ξ2(x9,x10)−μ02U2(x9,x10)]dt+σ2dW2, | (3.13) |
with the parameter values
λ1=1.00,θ01=0.01,c1=0.98,μ01=1.00,σ1=1.00, |
λ2=0.50,θ02=0.02,c2=1.96,μ02=2.00,σ2=0.80. |
We define the server allocation functions ξ1, ξ2, U1, and U2 as previously described and take x9(0)=2 and x10(0)=3 as given.
Now, we define the relevant cost function of this model as follows:
C910(Xt,ξt,Ut)=1+p1x9+p2x10+s1ξ1(x9,x10)+s2ξ2(x9,x10)+q1U1(x9,x10)+q2U2(x9,x10), | (3.14) |
with p1=0.50, p2=1.00, s1=0.25, s2=1.00, q1=0.25, and q2=1.00.
The value function is
V910=min∫10C910(Xt,ξt,Ut)dt=5.1434. | (3.15) |
Moreover, as seen from the system's operating diagram (Figure 11) and the image of the cost function (Figure 12), this model is clearly not as effective as the one that uses our joint scheduling–control policy.
Example 6 (Two-class model with server scheduling and server rate control). We present a system with server scheduling control and server rate control as follows:
{dx1=[λ1−θ01x11−c1ξ1(x11,x12)−μ01H1(ξ1(x11,x12),ξ2(x11,x12))]dt+σ1dW1,dx2=[λ2−θ02x12−c2ξ2(x11,x12)−μ02H2(ξ1(x11,x12),ξ2(x11,x12))]dt+σ2dW2, | (3.16) |
with the parameter values
λ1=1.00,θ01=0.01,c1=0.98,μ01=1.00,σ1=1.00, |
λ2=0.50,θ02=0.02,c2=1.96,μ02=2.00,σ2=0.80. |
We define the server allocation functions ξ1, ξ2, H1, and H2 as previously described and take x11(0)=2 and x12(0)=3 as given.
We define the relevant cost function of this model as follows:
C1112(Xt,ξt,Ht)=1+p1x11+p2x12+s1ξ1(x11,x12)+s2ξ2(x11,x12)+h1H1(ξ1(x11,x12),ξ2(x11,x12))+h2H2(ξ1(x11,x12),ξ2(x11,x12)), | (3.17) |
with p1=0.50, p2=1.00, s1=0.25, s2=1.00, h1=0.25, and h2=1.00.
The value function is
V1112=min∫10C1112(Xt,ξt,Ht)dt=4.9074. | (3.18) |
Moreover, as seen from the system's operating diagram (Figure 13) and the image of the cost function (Figure 14), this model is clearly not as effective as the one that uses our joint scheduling–control policy.
Compared with other policies, our designed joint scheduling–control policy has the lowest cost, as shown by the leftmost blue rectangle in Figure 2. The results show that our designed joint scheduling–control policy has significant advantages in reducing the system′s costs and improving service efficiency. This numerical simulation not only confirms the rationality and practicality of the previous theoretical analysis but also provides a valuable reference for practical applications.
In this section, we analyze in depth the existence and uniqueness of the solution of the HJB equation for the diffusion control problem presented in Section 2.4, which provides a solid theoretical foundation for designing an asymptotically optimal control policy for the original problem. Then, the asymptotic optimality of the joint scheduling control policy for multiclass parallel queuing models is proved in detail through the proof of several lemmas.
We prove the existence of a unique solution to the corresponding Hamilton–Jacobi–Bellman (HJB) equation, which is the optimal Markov control policy for the diffusion control problem. Theorem 3, which is proven in this section, is a complete version of Theorem 1 in Section 2.4. Thus, proving Theorem 3 effectively proves Theorem 1.
First, we introduce H:Rm×Rm↦R as follows:
H(x,h)=inf(U,ξ,H)∈Rm×Rm×Sm{b(x,U,ξ,H)⋅h+C(x,U,ξ,H)}, | (4.1) |
where b(x,U,ξ,H), defined in (2.46), is the corresponding vector form of bk(x,U,ξ,H) and C(x,U,ξ,H) is defined in (2.49). The corresponding HJB equation (cf. [13]) can generally be written in the following form:
12m∑k=1σ2k∂2f∂x2k+H(x,▽f(x))−αf=0, | (4.2) |
where σk is defined in (2.44) and α is defined in (2.49).
Remark 5. The system equation is
X(t)=x+σW(t)+∫t0b(Xs,Us,ξs,Hs)ds. | (4.3) |
The cost function is
J(x,X,U,ξ,H)=Eπx∫∞0e−αtC(Xt,Ut,ξt,Ht)dt. | (4.4) |
We define
(L⋅)(x)=∂⋅∂t(x)+m∑k=1bk(X,U,ξ,H)∂⋅∂xk+12m∑k=1σ2k∂2⋅∂x2k. |
The HJB equation for V(x)=inf(U,ξ,H)∈Rm×Rm×SmJ(x,X,U,ξ,H) is
0=inf(U,ξ,H)∈Rm×Rm×Sm{C(X,U,ξ,H)+(LV)(x)}=inf(U,ξ,H)∈Rm×Rm×Sm{C(X,U,ξ,H)+∂V∂t(x)+m∑k=1bk(X,U,ξ,H)∂V∂xk+12m∑k=1σ2k∂2V∂x2k}=12m∑k=1σ2k∂2V∂x2k+inf(U,ξ,H)∈Rm×Rm×Sm{m∑k=1bk(X,U,ξ,H)∂V∂xk+C(X,U,ξ,H)}−αV. |
Theorem 3. There exists a classical solution f∈C2(Rm) to the HJB equation (4.2), and we have the following:
(i) Constants c>0 and k>0 exist such that
|f|≤c(1+‖x‖k),x∈Rm. |
(ii) The solution f is unique, and
f=V, |
i.e., there is an optimal Markov control policy for all x∈Rm.
Before this theorem is proven, we need to introduce several lemmas.
Lemma 1. The Hamiltonian H(x,h), which is defined in (4.1), is Hölder continuous with the exponent δ, where δ is defined in Assumption 3.
Proof. For ε>0, ∀y∈Rm, ∀h1∈Rm, by the infimum definition of H(x,h), (U0,ξ0,H0)∈Rm×Rm×Sm exists such that
H(y,h1)≥b(y,U0,ξ0,H0)h1+C(y,U0,ξ0,H0)−ε. | (4.5) |
Additionally, with the infimum definition of H(x,h), for ∀x∈Rm, ∀h2∈Rm, and for the same (U0,ξ0,H0)∈Rm×Rm×Sm in (4.5), we can obtain
H(x,h2)≤b(x,U0,ξ0,H0)h2+C(x,U0,ξ0,H0). |
According to the above two inequalities, we can derive the following:
H(x,h2)−H(y,h1)≤b(x,U0,ξ0,H0)h2−b(y,U0,ξ0,H0)h1+C(x,U0,ξ0,H0)−C(y,U0,ξ0,H0)+ε=b(x,U0,ξ0,H0)h2−b(y,U0,ξ0,H0)h2+b(y,U0,ξ0,H0)h2−b(y,U0,ξ0,H0)h1+C(x,U0,ξ0,H0)−C(y,U0,ξ0,H0)+ε≤c‖h2‖‖x−y‖+c‖h2−h1‖+c‖x−y‖δ+ε, |
where the last inequalities use the Lipschitz property of b and the Hölder property of C. Since ε>0 is arbitrary, H(x,h) is Hölder continuous with the exponent δ.
Lemma 2. For any x∈Rm and an admissible system π, with X as the controlled process associated with x and π, we have
Eπx∥X(t)∥N≤cN(1+∥x∥N),t≥0, | (4.6) |
where N∈N and the constants cN are independent of π,x, and t.
Proof. Equation (2.44) can be written as follows:
Xk(t)=xk+σkWk(t)+∫t0[−θ0kXk(s)−ckξk(Xs)−μ0kUk(Xs)−μ0kHk(ξ(Xs))+λk]ds. | (4.7) |
Next, for all t≥0, by Theorem 5.2.5 of [32], we define a process ˜Xk as the unique solution to the equation
˜Xk(t)=xk+σkWk(t)+∫t0[−θ0k˜Xk(s)−ckI[Xk(s)<0]−2μ0kMI[Xk(s)<0]+λk]ds. | (4.8) |
Letting ˜X=(˜X1,⋯,˜Xm)′, we have
˜X(t)=x+σW(t)+∫t0˜b(s,˜X)ds, | (4.9) |
where
˜b(s,˜X)=−θ0˜X(s)−cI[X(s)<0]−2μ0kMI[X(s)<0]+λ |
and M≥0 is as given in Assumption 6. Additionally,
λ=(λ1,⋯,λm)′,θ0=diag(θ0k,k∈M),c=diag(ck,k∈M),σ=diag(σk,k∈M),I[X(s)<0]=(I[X1(s)<0],⋯,I[Xm(s)<0])′. |
In this case, a constant K≥0 exists such that
∥˜b(t,x)−˜b(t,y)∥+∥σ−σ∥≤K∥x−y∥, |
∥˜b(t,x)∥2+∥σ∥2≤K2(1+∥x∥2), |
for every 0≤t<∞ and x∈Rm,y∈Rm. That is, the coefficients ˜b,σ satisfy the global Lipschitz and linear growth conditions.
For every T>0, according to Theorem 5.2.9 of [32], a positive constant cK,T depending only on K and T exists such that
Eπx∥˜X(t)∥2≤C(1+∥x∥2)ecK,Tt,0≤t≤T. | (4.10) |
Since ˜Xk is Gaussian, we have
Eπx∥˜X(t)∥N≤cN(1+∥x∥N),t≥0. | (4.11) |
Next, we aim to show that |Xk(t)|≤2|˜Xk(t)|, so we introduce Yk(t)=Xk(t)−˜Xk(t); we then have
Yk(t)=−∫t0[θ0kYk(t)+ck˜U1k(s)+μ0k˜U2k(s)+μ0k˜U3k(s)]ds,t≥0. | (4.12) |
The process ˜U1k is given by
˜U1k(t)={ξk(Xt),Xk(t)≥0,ξk(Xt)−1,Xk(t)<0. | (4.13) |
The process ˜U2k is given by
˜U2k(t)={Uk(Xt),Xk(t)≥0,Uk(Xt)−M,Xk(t)<0. | (4.14) |
The process ˜U3k is given by
˜U3k(t)={Hk(ξ(Xt)),Xk(t)≥0,Hk(ξ(Xt))−M,Xk(t)<0, | (4.15) |
where M is defined in Assumption 6. Next, we introduce a non-negative, differentiable function F on R such that
{F′(x)<0,x∈(−∞,−K),F(x)=0,x∈(−K,K),F′(x)>0,x∈(K,∞), | (4.16) |
where K=|˜Xk(t)|.
We then have
F(Yk(t))=−∫t0F′(Yk(s))[θ0kYk(t)+ck˜U1k(s)+μ0k˜U2k(s)+μ0k˜U3k(s)]ds,t≥0. |
If Yk(t)<−K, then Xk(t)<˜Xk(t)−|˜Xk(t)|≤0. Using(4.13)–(4.15), we have [θ0kYk(t)+ck˜U1k(s)+μ0k˜U2k(s)+μ0k˜U3k(s)]≤0. Consequently,
F′(Yk(s))[θ0kYk(t)+ck˜U1k(s)+μ0k˜U2k(s)+μ0k˜U3k(s)]≥0. |
If Yk(t)>K, then Xk(t)>˜Xk(t)+|˜Xk(t)|≥0. Using (4.13)–(4.15), we have [θ0kYk(t)+ck˜U1k(s)+μ0k˜U2k(s)+μ0k˜U3k(s)]≥0. Consequently,
F′(Yk(s))[θ0kYk(t)+ck˜U1k(s)+μ0k˜U2k(s)+μ0k˜U3k(s)]≥0. |
If |Yk(t)|≤K, we have F′(Yk(t))=0.
By combining these facts, we obtain F(Yk(t))≤0, so F(x)=0 only on [−K,K], which yields |Yk(t)|≤K. Consequently,
|Xk(t)|≤2|˜Xk(t)|. |
Using (4.11), it follows that Eπx∥X(t)∥N≤cN(1+∥x∥N).
Proof of Theorem 3. We first use a standard truncation idea (cf. [33]), which enables us to study a sequence of quasilinear Partial Differential Equations (PDEs) with a Dirichlet boundary condition.
Fix i∈N, and let B(0,i)={x:∥x∥≤i}. Then, a policy π and an initial condition X(0)=x∈B(0,i) are fixed. In addition, set Tπi=inf{t≥0:X(t)∈∂B(0,i)}. Next, we consider the diffusion process X, which satisfies (2.45) and terminates at the boundary of B(0,i).
Let
Ji(x,X,U,ξ,H)=Eπx∫Tπi0e−αtC(Xt,Ut,ξt,Ht)dt. | (4.17) |
Let Di be the set of all such admissible (X,U,ξ,H); the value function Vi is defined as
Vi=infJi(x,X,U,ξ,H), | (4.18) |
where the infimum is taken over all the available processes (X,U,ξ,H) in Di.
Assuming that ¯f(x)=0,x∈∂B(0,i) and using Theorem 8.3 of [34] and Lemma 1 (for details, see [26]), we find that Vi(x) is the unique bounded solution in C2(Rm) of
12m∑k=1σ2k∂2¯f∂x2k+H(x,▽¯f(x))−α¯f=0, | (4.19) |
where H(x,h) is defined in (4.1).
First, we prove that the function V(x) satisfies the growth condition. In fact, from the polynomial condition on C(Xt,Ut,ξt,Ht) in Assumption 3, using Fubini's theorem, we can obtain
V(x)≤c∫∞0Eπx[∥X(t)∥kC]dt, |
where c is a constant that is independent of i, and kC is defined in Assumption 3. Now, we appeal to Lemma 2; thus, we can obtain V(x)≤c(1+‖x‖kC) for all x∈Rm. Note that Vi≤V(x), which implies a uniform bound on Vi.
We then fix j>0 and set B(0,j)={x:∥x∥≤j}; let Δσ denote the second-order operator in the HJB equation with the weights σ2i. Similar to Step 2 in the proof of Theorem 1 in [26], we can see that the families {Vi}, {∇Vi}, and {ΔσVi} are equicontinuous and bounded. Therefore, according to the Arzela–Ascoli Theorem (cf. [35]), convergent subsequences of {Vi}, {∇Vi}, and {ΔσVi} exist. For brevity, we also denote these convergent subsequences with the subscript i. Then, there is a V(x), which satisfies Vi→V, ∇Vi→∇V, and ΔσVi→ΔσV uniformly on B(0,j).
The improved smoothness of V(x) is then obtained with the standard PDE arguments (cf. [34]). Vi adheres to the Hamilton–Jacobi–Bellman (HJB) equation, complete with its specified boundary conditions, and converges uniformly to V within the ball B(0,j). Utilizing Lemma 1, it is feasible to transfer these limits through the modified version of the HJB equation (4.19), thereby confirming that V(x) is in accordance with the original HJB equation V(x) on B(0,j). Given that the choice of j is unrestricted, it follows that it is compliant with the original HJB equation across the entire space Rm. Furthermore, the definitions of Vi and V(x) indicate that the principle of monotonic convergence dictates
Vi↑V(x)=infEπx∫∞0e−αtC(Xt,Ut,ξt,Ht)dt. | (4.20) |
Thus, the proposed limit V(x) is the value function of the diffusion control problem, implying that the value function V(x) is a classical solution to the HJB equation (4.2).
The uniqueness of V(x) and the existence of optimal Markov control policies for the problem remain to be shown. Let ˜f(x)∈C2(Rm) be any solution to the HJB equation (4.2), and fix a policy π∈Π, assuming that ˜f(x) satisfies the growth condition. Now, by applying the Itô formula to e−αt˜f(Xt), we can obtain the following (for details, see Theorem 5.1 of [13]):
˜f(x)≤J(x,X,U,ξ,H)+lim inft→∞e−αtEπx˜f(Xt). |
Consequently, when ˜f(x)≤c(1+‖x‖k), the last term on the right-hand side of the inequality above converges to zero. Thus, ˜f(x)≤J(x,X,U,ξ,H), and because π is arbitrary, we have ˜f(x)≤V(x).
Let
˜fi(x)={˜f(x),x∈B(0,i),0,x∉B(0,i). | (4.21) |
This result is obviously a solution to Eq (4.19). By repeating our treatment of Vi, we can obtain ˜fi(x)→V(x). Hence, ˜f(x)=V(x) on Rm; i.e., V(x) is the unique solution of the HJB equation (4.2), and there is a Markov control policy that is optimal for all x∈Rm.
In this section, we show that the value function V of the diffusion control problem is an asymptotic lower bound for the value function Vn of the original problem.
Theorem 4. Let (xn,ˆXn,Un,ˆξn,Hn)n≥1 be any sequence of the stochastic system, and let the associated cost function be (Jn(xn,ˆXn,Un,ˆξn,Hn))n≥1, as described in Section 2.3, and let V_n be as defined in (2.53) in Section 2.5. Assuming that xn→x, V is the value function of the diffusion control problem given in Section 2.4.
We then have
V_n≥V(x). |
Before proving this theorem, we need to introduce several lemmas.
Lemma 3. Assuming that for xn→x, the processes ˆXn described in Theorem 4 are applied, and we have
E∥ˆXn(t)∥N≤cN(1+∥x∥N)(1+tN),t≥0, | (4.22) |
where N∈N and the constants cN are independent of n,x, and t.
Proof. Analogous to the proof of Lemma 2, we can obtain
|ˆXnk(t)|≤2|Znk(t)|, |
where
Znk(t)=xnk+σkˆWnk(t)+∫t0(−μ0kZnk(s)+˜I)ds. | (4.23) |
˜I=ˆλnk−√n(μ0k−θnk)I[ˆXnk(s)<0]−2μ0kMI[ˆXnk(s)<0], and M≥0 is the bound of Unk(⋅) and Hnk(⋅) on R. Analogous to proving Proposition 4 in [10], we obtain
∥ˆXn(t)∥≤c[∥xn∥+∥ˆWn(t)∥+∫t0∥Zn(s)∥ds+∥Zn(t)∥+t], |
where Zn(t)=(Zn1,Zn1,⋯,Znm)′ and c is independent of n and t. Solving for Znk(t) in (4.23), we obtain
Znk(t)=xnke−αt+σkˆWnk(t)+˜It−μ0k∫t0(σkˆWnk(t)+˜It)e−μ0k(t−s)ds. | (4.24) |
Therefore,
∥ˆXn(t)∥≤c[1+t2+∥xn∥+∥ˆWn(t)∥+∫t0∥ˆWn(s)∥ds+∫t0∫s0∥ˆWn(v)∥dvds]. | (4.25) |
By
σkˆWnk(t)=ˆAnk(t)−ˆSnk(1nEnk(t))−ˆRnk(ˉ˜Enk(t)), |
and letting an(s)=maxk[n−1θnk(X0,nk+Ank(s))], we can obtain
∥ˆWn(t)∥≤∥ˆAn(t)∥+sups≤t∥ˆSn(s)∥+sups≤an(t)∥ˆRn(s)∥. | (4.26) |
Given Lemma 2 of [10], under the assumption in the definition of {Ank,k∈M} that there is a constant N≥2 such that E(ˆτk(1))N<∞, then by the definition of ˆAn(t), we can obtain
E(∥ˆAn(t)∥)N≤c(1+tN/2), |
where c is independent of n and t.
For the (disjunctive) martingale ˆSn, we apply Burkholder's inequality (see [36], p. 175). Letting αnk(t) be a Poisson random variable with the parameter 2nμ0kt, we can obtain
Esups≤t|ˆSnk(s)|N≤cE([ˆSnk](s))N/2=c(2n)−N/2E(αnk(t))N/2≤c(2n)−N/2(2nμ0kt)N/2≤ctN/2, | (4.27) |
where c is independent of n and t, and [ˆSnk] denotes the quadratic variation processes associated with ˆSnk.
Similarly, we can obtain Esups≤t|ˆRnk(t)|N≤ctN/2. By the independence of An and Rn, we can obtain
Esups≤an(t)|ˆRnk(t)|N=E{E[sups≤an(t)|ˆRnk(t)|N|an(t)]}≤cE(an(t))N/2≤c(1+tN), |
where c is independent of n and t.
By applying Minkowski's inequality to (4.26), we can obtain
E∥ˆWn(t)∥N≤c(1+tN), |
where N and c do not depend on n or t.
According to (4.25), this lemma follows.
Lemma 4. As n→∞, the following results hold for each k∈M:
ˆAnk⟹Ak,ˆSnk⟹Sk,ˆRnk⟹Rk,ˉΦnk⟹0, |
where Ak is a Brownian motion with zero drift, the variance matrices λkC2τ,k, Sk,Rk are standard Brownian motions; and Ak,Sk,Rk are independent of each other.
Proof. Applying Theorem 17.3 in [37] to ˆAnk, it directly follows that ˆAnk⟹Ak, where Ak is a Brownian motion with zero drift and the variance matrices λkC2τ,k. Additionally, applying this theorem to ˆSnk and ˆRnk, we can find that ˆSnk⟹Sk and ˆRnk⟹Rk, where Sk,Rk, are standard Brownian motions. According to Lemma 3, we obtain
∥ˆXn(t)∥≤c[∥xn∥+∥ˆWn(t)∥+∫t0∥Zn(s)∥ds+∥Zn(t)∥+t]. |
Because
σkˆWnk(t)=ˆAnk(t)−ˆSnk(1nEnk(t))−ˆRnk(ˉ˜Enk(t)). |
the previous results, we can obtain ˆWn√n⟹0. It is easy to show that xn√n⟹0. By Gronwall's lemma, for any t, we can obtain sups≤t∥Zn(t)∥√n→0 in the distribution. Therefore, Zn(t)√n⟹0. Consequently, ˆXn√n=ˉXn−γ0⟹0, where ˉXn=Xnn=ˉΦn+ˉΨn. Note that m∑1ˉΦnk=(m∑1ˉXnk−1)+. Then, by Assumption 1, we have ∑mk=1λ0k∖μk0=1. Because γ0k=λ0k/μ0k, we can obtain m∑1ˉΦnk⟹0. Since ˉΦnk≥0, ˉΦnk⟹0.
Lemma 5. Write
ˆBn(t)=∫t0bn(ˆXns,Uns,ˆξns,Hns)ds, | (4.28) |
ˆCn(t)=∫t0e−αsC(ˆXns,Uns,ˆξns,Hns)ds. | (4.29) |
Then, the sequence (ˆXn,ˆWn,ˆBn,ˆCn) is tight in the space (D(Rm))4.
Proof. By Lemma 4 and a time change lemma (cf. [37]), we can see that
ˆSnk(1nEnk(t))⇒0,ˆRnk(ˉ˜Enk(t))⇒0. |
Because
σkˆWnk(t)=ˆAnk(t)−ˆSnk(1nEnk(t))−ˆRnk(ˉ˜Enk(t)), |
it follows that
ˆWn(t)⇒σ−1σA=A=W, |
where A=(A1,⋯,Am) and Ak,k∈M are as described in Lemma 4. They are tight because ˆWn is relatively compact, and from Theorem 16.10 of [37], we can obtain the following:
(i) For each m,
lima→∞lim supnP[‖ˆWn‖m≥a]=0. | (4.30) |
(ii) For 1≤i≤v, let [ti−1,ti) be decompositions of [0,m) such that ti−ti−1>δ and let ω′m(x,δ)=infmax1≤i≤vω(x,[ti−1,ti)); the infimum extends over all decompositions [ti−1,ti). For each m and ε,
limδlim supnP[ω′m(ˆWn,δ)≥ε]=0. | (4.31) |
Since
ˆXn(t)=xn+σˆWn(t)+∫t0bn(ˆXns,Uns,ˆξns,Hns)ds, |
noting that the Lipschitz property holds uniformly for the functions ˆXns→bn(ˆXns,Uns,ˆξns,Hns) in ˆXn,Un,ˆξn,Hn and n, we can obtain
∥ˆXn(t)∥≤∥xn∥+∥ˆWn(t)∥+c∫t0(1+∥ˆXn(s)∥)ds. |
Analogous to the proof of Lemma 4 in [10], we have the tightness of ˆXn. Now, by Theorem 16.10 of [37], it is easy to obtain (cf.[10]) the tightness of ˆBn and ˆCn.
Lemma 6. Denote (X,W,B,C) as a limit point of (ˆXn,ˆWn,ˆBn,ˆCn) along a subsequence. In this case, X,B and C have continuous sample paths, and B has the sample paths of bounded variation over finite time intervals. Denote (Ft) as the filtration generated by (X,W,B); in that case, W is an (Ft)-standard Brownian motion.
Proof. Analogous to the proof of Lemma 6 in [10], this lemma is easy to obtain. Since ˆBn and ˆCn have continuous sample paths, by [25], we can see that processes B and C have continuous sample paths. By Lemma 5, we have ˆWn(t)⇒W, so X=x+σW+B since ˆXn(t)=xn+σˆWn(t)+ˆBn. We write ˆBn=ˆBn,+−ˆBn,−, where ˆBn,+k(t)=∫t0(ˆBnk(s))+ds and ˆBn,−k(t)=∫t0(ˆBnk(s))−ds. By the definition of bn in Section 2.2, we have
ˆBn,+(t)∨ˆBn,−(t)≤c∫t0(1+∥ˆXns∥)ds, |
(ˆBn,+(t)−ˆBn,+(s))∨(ˆBn,−(t)−(ˆBn,−(s))≤c∣t−s∣(1+∥ˆXn∥t)ds, |
where c is independent of t,n. Thus, the tightness of (ˆBn,+,ˆBn,−) follows from the tightness of ˆXn. Since ˆBn,+ and ˆBn,− have continuous sample paths, let (B+,B−) denote any subsequential limit point in (D(Rm))2. We can see that B+ and B− have continuous sample paths; therefore, B=B+−B−. Since B+ and B− have nondecreasing sample paths and t is arbitrary, B has sample paths of bounded variation over finite time intervals.
By Lemma 5, we have ˆWn(t)⇒W, where W is a standard Brownian motion. By the definitions of W and (Ft), W is adapted to (Ft). Next, we show that Ft is independent of σ{Wt+a−Wt:a>0} for each t. In fact, we fix a t≥0 and let
ΔˆSnk=ˆSnk(1nEnk(t))−ˆSnk(1nEnk(t+a)),ΔˆRnk=ˆSnk(1n˜Enk(t))−ˆSnk(1n˜Enk(t+a)), |
where a≥0, Enk(t) and ˜Enk(t) are as defined in Section 2.1. In addition, we have
σk(ˆWnk(t+a)−ˆWnk(t))=ˆAnk(t+a)−ˆAnk(t)−ΔˆSnk−ΔˆRnk. |
We then, let
σkαnk=Ank(τnk(t)+a)−Ank(τnk(t))−ΔˆSnk−ΔˆRnk, |
βnk=ˆAnk(t+a)−ˆAnk(t)−Ank(τnk(t)+a)+Ank(τnk(t)), |
where τnk(t)=inf{a≥t:Ank(a)−Ank(a−)>0}. We then have
σk(ˆWnk(t+a)−ˆWnk(t))=σkαnk+βnk. |
Let Yn=(ˆXns,ˆWns,ˆBns), Y=(Xs,Ws,Bs), 0≤s≤t, F:(Rm)3→Rm, and G:Rm→Rm. By the definitions of Fnt and Gnt, we can conclude that Yn and αn are measurable on Fnt and Gnt, respectively. By Definition (1), we can obtain
E[F(Yn)G(αn)]=E[F(Yn)]E[G(αn)]. |
According to the random change of the time lemma (cf.[37]), τnk(t) converges in distribution to 0, and thus, ˆAnk converges in distribution to a continuous process. Thus, we can see that βnk converges in distribution to 0. Therefore, αnk converges in distribution to Wt+a−Wt. According to the continuous mapping theorem, Yn=(ˆXn,ˆWn,ˆBn)⇒Y=(X,W,B), we can obtain
E[F(Y)G(Wt+a−Wt)]=E[F(Y)]E[G(Wt+a−Wt)]. |
Since a,s are arbitrary, according to the Dynkin class theorem, Ft is independent of σ{Wt+a−Wt:a>0}. Thus, W is a (Ft)- Brownian motion because t is arbitrary.
Proof of Theorem 4. Let f be the solution of the HJB equation (4.2) described in Theorem 3. From (4.1), we can obtain
Γnt=b(ˆXnt,Unt,ˆξnt,Hnt)⋅∇f(ˆXnt)+C(ˆXnt,Unt,ˆξnt,Hnt)−H(ˆXnt,▽f(ˆXnt))≥0. | (4.32) |
Moreover, we have
∫t0e−αsΓnsds=∫t0e−αs∇f(ˆXns)dˆBn(s)−∫t0e−αsH(ˆXns,▽f(ˆXns))ds+ˆCn(t)+∫t0e−αs{b(ˆXns,Uns,ˆξns,Hns)−bn(ˆXns,Uns,ˆξns,Hns)}∇f(ˆXns)ds≥0. | (4.33) |
By the definitions of b(X,U,ξ,H) and bn(ˆXn,Un,ˆξn,Hn) and Assumption 6, we have
‖b(ˆXns,Uns,ˆξns,Hns)−bn(ˆXns,Uns,ˆξns,Hns)‖≤εn(1+‖ˆXns‖), |
where εn→0. Therefore, by Lemma 3 and the continuous mapping theorem, we have
∫t0e−αs{b(ˆXns,Uns,ˆξns,Hns)−bn(ˆXns,Uns,ˆξns,Hns)}∇f(ˆXns)ds⇒0. | (4.34) |
We set (X,W,B,C) as a weak limit point of (ˆXn,ˆWn,ˆBn,ˆCn), and set (Ft) as the filtration generated by (X,W,B). Since Xt=x+σWt+Bt, by Lemma 6, W is an (Ft)-standard Brownian motion and B has the sample paths of bounded variation over finite time intervals. Therefore, for n→∞, we apply Lemma 5 of [10] with Un=e−αs∇f(ˆXns), Vn=ˆBn, and decompose ˆBn=Mn+Nn into Mn=0 and Nn=ˆBn. We thus obtain
∫t0e−αs∇f(ˆXns)dˆBn(s)⇒∫t0e−αs∇f(Xs)dB(s). | (4.35) |
Using the continuity of Xs↦H(Xs,▽f(Xs)), we can obtain
∫t0e−αs∇f(Xs)dB(s)−∫t0e−αsH(Xs,▽f(Xs))ds+C(t)≥0. | (4.36) |
Applying Itˆo's formula to f, we can obtain
e−αtf(Xt)=f(x)+∫t0e−αs∇f(Xs)⋅σdW(s)+∫t0e−αs∇f(Xs)dB(s)−∫t0e−αsH(Xs,▽f(Xs))ds. | (4.37) |
Combining (4.36) and (4.37), we have
e−αtf(Xt)≥f(x)+∫t0e−αs∇f(Xs)⋅σdW(s)−C(t). | (4.38) |
For t≥0 and N∈N, by Theorem 3(i) and Lemma 3, we can obtain
Ef(ˆXnt)≤c(1+tN). |
Because for each t, f(ˆXnt) converges in distribution to f(Xt), and f(Xt) is uniformly integrable, we can obtain
Ef(Xt)≤c(1+tN). |
We then have
EC(t)≥f(x)−ε(t), |
where ε(t)→0 as t→∞. By Lemma 3 and Assumption 3, for a fixed δ>0, there is a T large enough to satisfy
∫∞Te−αsC(ˆXn(s),Un(s),ˆξn(s),Hn(s))ds≤δ. |
Hence,
V_n≥EC(T)−δ≥f(x)−ε(T)−δ. |
When T→∞ and δ→0, we have
V_n≥V(x). |
First, for a given initial condition xn and a sequence stochastic system, write
¯Vn=lim supn→∞Jn(xn,ˆXn,Un,ˆξn,Hn). |
Before proving Theorem 2, we need to introduce a lemma.
Lemma 7. Let (xn,ˆXn,Un,ˆξn,Hn)n≥1 be any sequence of the stochastic system; the associated cost functions (Jn(xn,ˆXn,Un,ˆξn,Hn))n≥1 are described in Section 2.3. Assume that xn→x, V is the value function given in Section 2.4, and ∫t0e−αsΓnsds⇒0 holds, where Γnt is described as in (4.32). We then have
¯Vn≥V(x). |
Proof. First, as in the proof of Theorem 4, (4.32)–(4.35), and (4.37) hold for this case. By ∫t0e−αsΓnsds⇒0, we can obtain the following equation instead of (4.36):
∫t0e−αs∇f(Xs)dB(s)−∫t0e−αsH(Xs,▽f(Xs))ds+C(t)=0. | (4.39) |
Combining (4.37) and (4.39), we have
0≤e−αtf(Xt)=f(x)+∫t0e−αs∇f(Xs)⋅σdW(s)−C(t). | (4.40) |
We then have
EC(t)≤f(x),∀t. | (4.41) |
By Assumption 3 and Lemma 3, for all n and ∀δ>0, there is a T such that
E∫∞Te−αsC(ˆXns,Uns,ˆξns,Hns)ds≤δ. | (4.42) |
Since ˆCn⇒C and C have continuous sample paths, ˆCnT converges in distribution to CT. We thus obtain the following equation on the basis of Jensen's inequality, Assumption 3 and Lemma 3.
E(ˆCnT)1+εkC≤cE∫T0e−α(1+εkC)s(1+∥ˆXns∥kC+ε)ds≤c, |
where c is independent of n, and kC is defined in Assumption 3. For n∈N, ˆCnT is uniformly integrable, and as n→∞, we have EˆCnT→ECT. By (4.41) and (4.42), we have
lim supn→∞E∫∞0e−αsC(ˆXns,Uns,ˆξns,Hns)ds≤f(x)+δ. | (4.43) |
As δ→0, we have
¯Vn≤V(x). |
Proof of Theorem 2. First, (xn,ˆXn,˜Un,˜ξn,˜Hn)n≥1 satisfy the conditions in Theorem 4; i.e.,
lim infn→∞Jn(xn,ˆXn,˜Un,˜ξn,˜Hn)≥V(x). | (4.44) |
Next, we show that (xn,ˆXn,˜Un,˜ξn,˜Hn)n≥1 satisfy the conditions of Lemma 7. Let Ωn denote the event in which Xnk≥˜Ψnk holds. According to Assumption 3 of C(X,U,ξ,H) in Section 2.3, C(X,U,ξ,H) is uniformly continuous on compact spaces. Let gM(ϵ) be satisfied with the limit
limϵ↓0gM(ϵ)=0∀M, |
and let ∣C(ˆXnt,˜Unt,˜ξnt,˜Hnt)−C(ˆXnt,U∗t,ξ∗t,H∗t)∣≤gM(ϵ) when ∥˜Unt∥,∥U∗t∥,∥˜ξnt∥,∥ξ∗t∥,∥˜Hnt∥,∥H∗t∥≤M, and ∥˜ξnt−ξ∗t∥≤ϵ. Following holds for the event Ωn,M=Ωn∩{∥ˆXnt∥∗T+∥ˆΦnt∥∗T+∥ˆΨnt∥∗T≤M}, where ∥Xt∥∗T=sup0≤t≤T∥Xt∥:
∣Γnt∣=∣(b(ˆXnt,˜Unt,˜ξnt,˜Hnt)−b(ˆXnt,U∗t,ξ∗t,H∗t))⋅∇f(ˆXnt)+C(ˆXnt,˜Unt,˜ξnt,˜Hnt)−C(ˆXnt,U∗t,ξ∗t,H∗t)∣≤c∥˜ξnt−ξ∗t∥⋅∥∇f(ˆXnt)∥+gM(∥˜ξnt−ξ∗t∥). |
By ∥˜Ψn−Ψn,∗∥≤2m, we can obtain ∥˜ξnt−ξ∗t∥≤2mn. Therefore, for Ωn,M and for some δn→0, we have ∣Γnt∣∗T≤δn. limn→∞P(Ωn)=1, which follows from the convergence ˆXn√n=ˉXn−γ0⟹0 in Lemma 3. By the tightness of ˆXn and ˆΦnt∈Rm+, we have
limM→∞lim infn→∞P(Ωn,M)=1. | (4.45) |
Therefore, ∣Γnt∣∗T converges to zero in the distribution. Since T is arbitrary, Γnt⇒0, so (xn,ˆXn,˜Un,˜ξn,˜Hn)n≥1 satisfies the conditions of Lemma 7. We then
lim supn→∞Jn(xn,ˆXn,˜Un,˜ξn,˜Hn)≤V(x). | (4.46) |
Combining (4.44) and (4.46), we have
limn→∞Jn(xn,ˆXn,˜Un,˜ξn,˜Hn)=V(x)≤V_n(x). | (4.47) |
The present study introduces and evaluates a novel joint scheduling–control policy for managing multiclass customer flows in parallel server queues under heavy traffic conditions. Our approach, which is based on in the principles of customer admission control, service scheduling control, and service rate control, provides a comprehensive strategy to optimize queuing systems where the traffic intensity parameter is close to unity.
Our research makes the following contributions to the literature:
(1) Innovative policy formulation: This paper presents a unique joint scheduling–control policy that considers the overall system state rather than isolated customer or server conditions, leading to more holistic and efficient queuing management.
(2) Theoretical validation: This paper demonstrates the asymptotic optimality of the proposed policy within the Halfin–Whitt heavy traffic regime, ensuring that the additional control components are not only justified but also reduce the total system cost.
(3) Numerical simulations: This paper provides empirical evidence from comparative experiments that illustrate the superior performance of our policy over other control strategies in terms of reducing system′s costs and improving service efficiency.
The results of our study have practical implications for various real-world applications, including (but not limited to) modern call centers, high-performance computing systems, cloud computing services, and emerging quantum computing platforms. The newly developed policy can serve as a valuable tool for operators and managers aiming to enhance service quality while maintaining high server utilization rates.
Our work also opens avenues for further research. The joint scheduling–control policy could be extended to more complex systems with additional constraints or different customer behavior patterns. Moreover, the integration of machine learning techniques to dynamically adjust control parameters in real time could be an exciting direction for future exploration.
In conclusion, our study presents a significant advancement in the field of queuing theory and control, offering a robust policy that promises to improve operational efficiency in multiclass server environments. We are confident that our findings will not only benefit academic researchers but also inform the decision-making of practitioners in related industries.
The author declares that he has not used Artificial Intelligence (AI) tools in the creation of this article.
The authors would like to thank the editor and reviewers for their help and suggestions for revisions to this paper. Thanks are also due to those who have provided help and advice in the writing, revision, and publication of this paper.
The author has no relevant financial or non-financial interests to disclose.
[1] |
S. L. Bell, R. J. Williams, Dynamic scheduling of a system with two parallel servers in heavy traffic with resource pooling: asymptotic optimality of a threshold polic, Ann. Appl. Probab., 11 (2001), 608–649. https://doi.org/10.1214/aoap/1015345343 doi: 10.1214/aoap/1015345343
![]() |
[2] | J. M. Harrison, Brownian models of queueing networks with heterogeneous customer populations, In: W. Fleming, P. L. Lions, Stochastic differential systems, stochastic control theory and applications, The IMA Volumes in Mathematics and Its Applications, 10 (1988), 147–186. https://doi.org/10.1007/978-1-4613-8762-6_11 |
[3] |
E. V. Krichagina, M. I. Taksar, Diffusion approximation for GI/G/1 controlled queues, Queueing Syst., 12 (1992), 333–367. https://doi.org/10.1007/bf01158808 doi: 10.1007/bf01158808
![]() |
[4] |
S. Kumar, Two-server closed networks in heavy traffic: diffusion limits and asymptotic optimality, Ann. Appl. Probab., 10 (2000), 930–961. https://doi.org/10.1214/aoap/1019487514 doi: 10.1214/aoap/1019487514
![]() |
[5] |
H. J. Kushner, Y. N. Chen, Optimal control of assignment of jobs to processors under heavy traffic, Stochastics Stochastic Rep., 68 (2000), 177–228. https://doi.org/10.1080/17442500008834223 doi: 10.1080/17442500008834223
![]() |
[6] |
L. F. Martins, S. E. Shreve, H. M. Soner, Heavy traffic convergence of a controlled, multiclass queueing system, SIAM J. Control Optim., 34 (1996), 2133–2171. https://doi.org/10.1137/s0363012994265882 doi: 10.1137/s0363012994265882
![]() |
[7] |
E. Plambeck, S. Kumar, J. M. Harrison, A multiclass queue in heavy traffic with throughput time constraints: asymptotically optimal dynamic controls, Queueing Syst. Theory Appl., 39 (2001), 23–54. https://doi.org/10.1023/a:1017983532376 doi: 10.1023/a:1017983532376
![]() |
[8] |
J. A. Van Mieghem, Dynamic scheduling with convex delay costs: the generalized cμ rule, Ann. Appl. Probab., 5 (1995), 809–833. https://doi.org/10.1214/aoap/1177004706 doi: 10.1214/aoap/1177004706
![]() |
[9] |
A. Weerasinghe, Optimal service rate perturbations of many server queues in heavy traffic, Queueing Syst., 79 (2015), 321–363. https://doi.org/10.1007/s11134-014-9423-9 doi: 10.1007/s11134-014-9423-9
![]() |
[10] |
R. Atar, A. Mandelbaum, M. I. Reiman, Scheduling a multi class queue with many exponential servers: asymptotic optimality in heavy traffic, Ann. Appl. Probab., 14 (2004), 1084–1134. https://doi.org/10.1214/105051604000000233 doi: 10.1214/105051604000000233
![]() |
[11] |
S. Halfin, W. Whitt, Heavy-traffic limits for queues with many exponential servers, Oper. Res., 29 (1981), 567–588. https://doi.org/10.1287/opre.29.3.567 doi: 10.1287/opre.29.3.567
![]() |
[12] |
D. L. Jagerman, Some properties of the Erlang loss function, Bell Syst. Tech. J., 53 (1974), 525–551. https://doi.org/10.1002/j.1538-7305.1974.tb02756.x doi: 10.1002/j.1538-7305.1974.tb02756.x
![]() |
[13] | W. H. Fleming, H. M. Soner, Controlled Markov processes and viscosity solutions, 2 Eds., New York: Springer, 2006. https://doi.org/10.1007/0-387-31071-1 |
[14] |
A. Arapostathis, H. Hmedi, G. Pang, On uniform exponential ergodicity of markovian multiclass many-server queues in the halfin-whitt regime, Math. Oper. Res., 46 (2021), 772–796. https://doi.org/10.1287/moor.2020.1087 doi: 10.1287/moor.2020.1087
![]() |
[15] |
A. Arapostathis, G. Pang, Y. Zheng, Optimal scheduling of critically loaded multiclass GI/M/n+M queues in an alternating renewal environment, Appl. Math. Opt., 84 (2021), 1857–1901. https://doi.org/10.1007/s00245-020-09698-9 doi: 10.1007/s00245-020-09698-9
![]() |
[16] |
Y. L. Kocağa, An approximating diffusion control problem for dynamic admission and service rate control in a G/M/N+G queue, Oper. Res. Lett., 45 (2017), 538–542. https://doi.org/10.1016/j.orl.2017.08.005 doi: 10.1016/j.orl.2017.08.005
![]() |
[17] |
D. Y. Sze, A queueing model for telephone operator staffing, Oper. Res., 32 (1984), 229–249. https://doi.org/10.1287/opre.32.2.229 doi: 10.1287/opre.32.2.229
![]() |
[18] |
M. Armony, C. Maglaras, On customer contact centers with a call-back option: customer decisions, routing rules, and system design, Oper. Res., 52 (2004), 271–292. https://doi.org/10.1287/opre.1030.0088 doi: 10.1287/opre.1030.0088
![]() |
[19] |
W. Dai, n-qubit operations on sphere and queueing scaling limits for programmable quantum computer, Quantum Inf. Process., 22 (2023), 122–164. https://doi.org/10.1007/s11128-023-03851-3 doi: 10.1007/s11128-023-03851-3
![]() |
[20] |
O. Garnett, A. Mandelbaum, M. I. Reiman, Designing a telephone call center with impatient customers, Manuf. Serv. Oper. Manage., 4 (2002), 208–227. https://doi.org/10.1287/msom.4.3.208.7753 doi: 10.1287/msom.4.3.208.7753
![]() |
[21] |
A. Weerasinghe, Controlling the running maximum of a diffusion process and an application to queueing systems, SIAM J. Control Optim., 56 (2018), 1412–1440. https://doi.org/10.1137/17m1135967 doi: 10.1137/17m1135967
![]() |
[22] |
W. Dai, Optimal rate scheduling via utility-maximization for j-user MIMO Markov fading wireless channels with cooperation, Oper. Res., 61 (2013), 1450–1462. https://doi.org/10.1287/opre.2013.1224 doi: 10.1287/opre.2013.1224
![]() |
[23] |
W. Dai, Platform modeling and scheduling game with multiple intelligent cloud-computing pools for big data, Math. Comput. Model. Dyn. Syst., 24 (2018), 506–552. https://doi.org/10.1080/13873954.2018.1516677 doi: 10.1080/13873954.2018.1516677
![]() |
[24] |
W. Dai, Quantum-computing with AI & blockchain: modelling, fault tolerance and capacity scheduling, Math. Comput. Model. Dyn. Syst., 25 (2019), 523–559. https://doi.org/10.1080/13873954.2019.1677725 doi: 10.1080/13873954.2019.1677725
![]() |
[25] | S. N. Ethier, T. G. Kurtz, Markov processes: characterization and convergence, New York: Wiley, 1986. https://doi.org/10.1002/9780470316658 |
[26] |
J. M. Harrison, A. Zeevi, Dynamic scheduling of a multiclass queue in the Halfin-Whitt heavy traffic regime, Oper. Res., 52 (2004), 243–257. https://doi.org/10.1287/opre.1030.0084 doi: 10.1287/opre.1030.0084
![]() |
[27] |
A. A. Puhalskii, M. I. Reiman, The multiclass GI/PH/N queue in the Halfin–Whitt regime, Adv. Appl. Probab., 32 (2000), 564–595. https://doi.org/10.1017/s0001867800010089 doi: 10.1017/s0001867800010089
![]() |
[28] | W. Whitt, Stochastic–process limits, New York: Springer, 2002. https://doi.org/10.1007/b97479 |
[29] |
D. J. Higham, An algorithmic introduction to numerical simulation of stochastic differential equations, SIAM Rev., 43 (2001), 525–546. https://doi.org/10.1137/s0036144500378302 doi: 10.1137/s0036144500378302
![]() |
[30] |
W. Dai, Convolutional neural network based simulation and analysis for backward stochastic partial differential equations, Comput. Math. Appl., 119 (2022), 21–58. https://doi.org/10.1016/j.camwa.2022.05.019 doi: 10.1016/j.camwa.2022.05.019
![]() |
[31] | W. Fang, M. B. Giles, Adaptive Euler-Maruyama method for SDEs with non-globally Lipschitz drift: Part I, finite time interval, arXiv Preprint, 2016. https://doi.org/10.48550/arXiv.1609.08101 |
[32] | I. Karatzas, S. E. Shreve, Brownian motion and stochastic calculus, 2 Eds., New York: Springer, 1998. https://doi.org/10.1007/978-1-4612-0949-2 |
[33] |
H. J. Kushner, Optimal discounted stochastic control for diffusion processes, SIAM J. Control, 5 (1967), 520–531. https://doi.org/10.1137/0305032 doi: 10.1137/0305032
![]() |
[34] | O. A. Ladyzhenskaya, N. N. Uraltseva, Linear and quasilinear elliptic equations, New York: Academic Press, 1968. |
[35] | W. Rudin, Real and complex analysis, New York: Wiley, 1987. |
[36] | P. Protter, Stochastic integration and differential equations: a new approach, Berlin: Springer, 1990. https://doi.org/10.1007/978-3-662-02619-9_6 |
[37] | P. Billingsley, Convergence of probability measures, New York: John Wiley & Sons, 1999. https://doi.org/10.1002/9780470316962 |