In this study, a car transfer planning system for parking lots was designed based on reinforcement learning. The system is an intelligent parking management system featuring autonomous decision-making, intelligent path planning and efficient resource utilization. The transfer problem is formulated as a Markov decision process and solved with a dynamic programming-based reinforcement learning algorithm. Unlike manual transfer planning, which relies on conventional rules of thumb, the system plans ahead, using reinforcement learning to maximize its expected return. In the parking-lot setting considered in this paper, the states of the two locations form a finite set. The system ultimately seeks a policy that benefits the long-term operation of the business, prioritizing strategies with positive future impact over those focused solely on short-term gains. To evaluate policies, the system relies on the expected return of a state from the present onward; this allows a more comprehensive assessment of potential outcomes and ensures that the selected strategies align with long-term goals. Experimental results show that the system achieves high performance and robustness in car transfer planning for parking lots. By using reinforcement learning techniques, parking lot management systems can make autonomous decisions and plan optimal paths, achieving efficient resource utilization and reduced parking time.
Citation: Feng Guo, Haiyu Xu, Peng Xu, Zhiwei Guo. Design of a reinforcement learning-based intelligent car transfer planning system for parking lots[J]. Mathematical Biosciences and Engineering, 2024, 21(1): 1058-1081. doi: 10.3934/mbe.2024044
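To make the evaluation criterion concrete: the "expected return of a state from the present onward" corresponds to the standard discounted return of reinforcement learning, which dynamic programming evaluates through the Bellman expectation equation. A minimal statement in the usual notation follows; the paper's exact formulation may differ:

```latex
% Expected return of state s under policy \pi, with discount factor \gamma:
v_\pi(s) = \mathbb{E}_\pi\!\left[\sum_{k=0}^{\infty} \gamma^{k} R_{t+k+1} \;\middle|\; S_t = s\right]
% Dynamic programming computes it via the Bellman expectation equation:
v_\pi(s) = \sum_{a} \pi(a \mid s) \sum_{s', r} p(s', r \mid s, a)\left[r + \gamma\, v_\pi(s')\right]
```

The sketch below illustrates how a dynamic programming-based method can compute such values over a finite two-location state set, as the abstract describes. It is a simplified toy under stated assumptions (capacity, demand, rewards and the deterministic dynamics are all invented for illustration), not the paper's actual model:

```python
# Minimal sketch: dynamic-programming value iteration for a
# two-location car-transfer MDP. All parameters below (capacity,
# demand, rewards, deterministic dynamics) are illustrative
# assumptions, not the paper's actual model.

MAX_CARS = 5        # hypothetical capacity of each location
GAMMA = 0.9         # discount factor
MOVE_COST = 2.0     # hypothetical cost per car transferred
RENT_REWARD = 10.0  # hypothetical reward per car rented out
DEMAND = 2          # hypothetical fixed demand at each location

# The states of the two locations form a finite set.
states = [(a, b) for a in range(MAX_CARS + 1) for b in range(MAX_CARS + 1)]

def actions_for(state):
    """Transfers of up to 2 cars; positive moves A -> B, negative B -> A."""
    a, b = state
    return [k for k in range(-2, 3)
            if 0 <= a - k <= MAX_CARS and 0 <= b + k <= MAX_CARS]

def reward(state, action):
    """Rentals earn revenue; each transferred car costs MOVE_COST."""
    a, b = state
    rentals = min(a - action, DEMAND) + min(b + action, DEMAND)
    return RENT_REWARD * rentals - MOVE_COST * abs(action)

def next_state(state, action):
    """Deterministic simplification: rent out up to DEMAND cars per lot."""
    a, b = state
    return (max(a - action - DEMAND, 0), max(b + action - DEMAND, 0))

# Value iteration: repeatedly back up the expected return of every
# state until the value function converges.
V = {s: 0.0 for s in states}
for _ in range(1000):
    delta = 0.0
    for s in states:
        best = max(reward(s, k) + GAMMA * V[next_state(s, k)]
                   for k in actions_for(s))
        delta = max(delta, abs(best - V[s]))
        V[s] = best
    if delta < 1e-8:
        break

# Greedy policy w.r.t. V: prioritizes long-run return rather than
# the immediate reward of a single transfer decision.
policy = {s: max(actions_for(s),
                 key=lambda k, s=s: reward(s, k) + GAMMA * V[next_state(s, k)])
          for s in states}
print(policy[(MAX_CARS, 0)])  # transfer suggested when A is full, B is empty
```

In the full problem the demand and vehicle-return flows would be stochastic, so each backup would sum over the transition probabilities p(s', r | s, a) as in the Bellman equation above; the structure of the iteration is unchanged.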