Research article

An actor-critic framework based on deep reinforcement learning for addressing flexible job shop scheduling problems

  • Received: 11 October 2023 · Revised: 30 November 2023 · Accepted: 10 December 2023 · Published: 28 December 2023
  • With the rise of Industry 4.0, manufacturing is shifting toward customization and flexibility, creating new challenges in meeting rapidly evolving market and customer needs. To address these challenges, this paper proposes a novel reinforcement learning (RL) approach to flexible job shop scheduling problems (FJSPs). The method uses an actor-critic architecture that combines value-based and policy-based approaches: the actor generates deterministic policies, while the critic evaluates them and guides the actor toward an optimal policy. To construct the Markov decision process, a comprehensive feature set was used to accurately represent the system state, and eight actions were designed, inspired by traditional dispatching rules. The reward formulation indirectly measures the effectiveness of actions, favoring strategies that minimize job completion times and improve adherence to scheduling constraints. The proposed framework was evaluated through simulations on standard FJSP benchmarks, comparing it against several well-known heuristic dispatching rules, related RL algorithms, and intelligent algorithms. The results indicate that the proposed method consistently outperforms traditional approaches and exhibits strong adaptability and efficiency, particularly on large-scale datasets.
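    The actor-critic loop the abstract outlines — a policy network choosing among dispatching-rule actions, with a critic's value estimate steering the policy update — can be sketched as follows. This is a minimal, self-contained illustration, not the paper's implementation: the feature count, the toy surrogate environment, and the linear actor/critic are all placeholder assumptions standing in for the paper's state features, shop-floor simulator, and deep networks.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    N_FEATURES = 6   # hypothetical state features (e.g., machine load, job progress)
    N_ACTIONS = 8    # one action per dispatching rule, as in the abstract

    # Linear actor (softmax policy over the 8 rule-actions) and linear critic.
    theta = np.zeros((N_ACTIONS, N_FEATURES))  # actor weights
    w = np.zeros(N_FEATURES)                   # critic weights

    def policy(state):
        logits = theta @ state
        logits -= logits.max()                 # numerical stability
        p = np.exp(logits)
        return p / p.sum()

    def step(state, action):
        """Toy surrogate environment: reward is a (negative) simulated
        makespan increment; a real setup would query an FJSP simulator."""
        reward = -rng.random() * (1 + 0.1 * action)
        next_state = rng.random(N_FEATURES)
        return next_state, reward

    alpha_actor, alpha_critic, gamma = 0.01, 0.05, 0.99
    state = rng.random(N_FEATURES)
    for t in range(500):
        probs = policy(state)
        action = rng.choice(N_ACTIONS, p=probs)
        next_state, reward = step(state, action)
        # TD error serves as the advantage estimate the critic feeds the actor.
        td_error = reward + gamma * (w @ next_state) - (w @ state)
        w += alpha_critic * td_error * state        # critic update
        grad_log = -np.outer(probs, state)          # d log pi / d theta
        grad_log[action] += state
        theta += alpha_actor * td_error * grad_log  # actor update
        state = next_state

    print(policy(state).round(3))
    ```

    Because the toy rewards penalize higher-indexed actions more, training tends to shift probability mass toward low-indexed rules; with the paper's learned state features and a simulator-derived reward, the same update structure instead learns which dispatching rule suits each scheduling state.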

    Citation: Cong Zhao, Na Deng. An actor-critic framework based on deep reinforcement learning for addressing flexible job shop scheduling problems[J]. Mathematical Biosciences and Engineering, 2024, 21(1): 1445-1471. doi: 10.3934/mbe.2024062






  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
