Recent research has showen that deep reinforcement learning (DRL) can be used to design better heuristics for the traveling salesman problem (TSP) on the small scale, but does not do well when generalized to large instances. In order to improve the generalization ability of the model when the nodes change from small to large, we propose a dynamic graph Conv-LSTM model (DGCM) to the solve large-scale TSP. The noted feature of our model is the use of a dynamic encoder-decoder architecture and a convolution long short-term memory network, which enable the model to capture the topological structure of the graph dynamically, as well as the potential relationships between nodes. In addition, we propose a dynamic positional encoding layer in the DGCM, which can improve the quality of solutions by providing more location information. The experimental results show that the performance of the DGCM on the large-scale TSP surpasses the state-of-the-art DRL-based methods and yields good performance when generalized to real-world datasets. Moreover, our model compares favorably to heuristic algorithms and professional solvers in terms of computational time.
Citation: Yang Wang, Zhibin Chen. Dynamic graph Conv-LSTM model with dynamic positional encoding for the large-scale traveling salesman problem[J]. Mathematical Biosciences and Engineering, 2022, 19(10): 9730-9748. doi: 10.3934/mbe.2022452
Recent research has showen that deep reinforcement learning (DRL) can be used to design better heuristics for the traveling salesman problem (TSP) on the small scale, but does not do well when generalized to large instances. In order to improve the generalization ability of the model when the nodes change from small to large, we propose a dynamic graph Conv-LSTM model (DGCM) to the solve large-scale TSP. The noted feature of our model is the use of a dynamic encoder-decoder architecture and a convolution long short-term memory network, which enable the model to capture the topological structure of the graph dynamically, as well as the potential relationships between nodes. In addition, we propose a dynamic positional encoding layer in the DGCM, which can improve the quality of solutions by providing more location information. The experimental results show that the performance of the DGCM on the large-scale TSP surpasses the state-of-the-art DRL-based methods and yields good performance when generalized to real-world datasets. Moreover, our model compares favorably to heuristic algorithms and professional solvers in terms of computational time.
[1] | M. Bellmore, G. L. Nemhauser, The traveling salesman problem: A survey, Oper. Res., 16 (1968), 538–558. https://doi.org/10.1007/978-3-642-51565-1 doi: 10.1007/978-3-642-51565-1 |
[2] | C. H. Papadimitriou, The euclidean travelling salesman problem is np-complete, Oper. Res., 4 (1977), 237–244. https://doi.org/10.1016/0304-3975(77)90012-3 doi: 10.1016/0304-3975(77)90012-3 |
[3] | C. William, World TSP, 2021. Available from: http://www.sars-expertcom.gov.hk/english/reports/reports.html. |
[4] | R. Bellman, Dynamic programming treatment of the travelling salesman problem, J. ACM, 9 (1962), 61–63. https://doi.org/10.1145/321105.321111 doi: 10.1145/321105.321111 |
[5] | V. V. Vazirani, Approximation Algorithms, Springer Science & Business Media Press, 2013. https://doi.org/10.1007/978-3-662-04565-7 |
[6] | Y. Hu, Q. Duan, Solving the TSP by the AALHNN algorithm, Math. Biosci. Eng., 19 (2022), 3427–3488. https://doi.org/10.3934/mbe.2022158 doi: 10.3934/mbe.2022158 |
[7] | J. J. Hopfield, D. W. Tank, "Neural" computation of decisions in optimization problems, Biol. Cyber., 52 (1985), 141–152. https://doi.org/10.1007/BF00339943 doi: 10.1007/BF00339943 |
[8] | K. Panwar, K. Deep, Transformation operators based grey wolf optimizer for travelling salesman problem, J. Comput. Sci., 55 (2021), 101454. https://doi.org/10.1016/j.jocs.2021.101454 doi: 10.1016/j.jocs.2021.101454 |
[9] | Z. Wu, S. Pan, F. Chen, G. Long, C. Zhang, S. Y. Philip, A comprehensive survey on graph neural networks, IEEE Trans. Neural Networks Learn. Syst., 32 (2020), 4–24. https://doi.org/10.1109/TNNLS.2020.2978386 doi: 10.1109/TNNLS.2020.2978386 |
[10] | Q. Wang, C. Tang, Deep reinforcement learning for transportation network combinatorial optimization: A survey, Knowl. Based Syst., 233 (2021), 107526. https://doi.org/10.1016/j.knosys.2021.107526 doi: 10.1016/j.knosys.2021.107526 |
[11] | Y. Bengio, A. Lodi, A. Prouvost, Machine learning for combinatorial optimization: A methodological tour d'horizon, Eur. J. Oper. Res., 290 (2021), 405–421. https://doi.org/10.1016/j.ejor.2020.07.063 doi: 10.1016/j.ejor.2020.07.063 |
[12] | Q. Ma, S. Ge, D. He, D. Thaker, I. Drori, Combinatorial optimization by graph pointer networks and hierarchical reinforcement learning, preprint, arXiv: 1911.04936. https://doi.org/10.48550/arXiv.1911.04936 |
[13] | O. Vinyals, M. Fortunato, N. Jaitly, Pointer networks, in Proceedings of the 29th Concerence on Neural Information Processing System (NIPS), 28 (2015), 2692–2700. |
[14] | I. Sutskever, O. Vinyals, Q. V. Le, Sequence to sequence learning with neural networks, in Proceedings of the 28th Concerence on Neural Information Processing System (NIPS), 27 (2014), 3104–3112. |
[15] | I. Bello, H. Pham, Q. V. Le, M. Norouzi, S. Bengio, Neural combinatorial optimization with reinforcement learning, in Proceedings of the 5th International Conference on Learning Representations, 2017. |
[16] | H. Dai, E. B. Khalil, Y. Zhang, B. Dilkina, L. Song, Learning combinatorial optimization algorithms over graphs, in Proceedings of the 31th Concerence on Neural Information Processing System (NIPS), 30 (2017), 6351–6361. |
[17] | C. K. Joshi, Q. Cappart, L. M. Rousseau, T. Laurent, X. Bresson, Learning tsp requires rethinking generalization, preprint, arXiv: 2006.07054. https://doi.org/10.48550/arXiv.2006.07054 |
[18] | W. Kool, H. van Hoof, M. Welling, Attention, learn to solve routing problems, in Proceedings of the 7th International Conference on Learning Representations (ICLR), 2019. |
[19] | Y. Wu, W. Song, Z. Cao, J. Zhang, A. Lim, Learning improvement heuristics for solving routing problems, in IEEE Transactions on Neural Networks and Learning Systems, (2021), 1–13. https: //doi.org/10.1109/TNNLS.2021.3068828 |
[20] | L. Xin, W. Song, Z. Cao, J. Zhang, Multi-decoder attention model with embedding glimpse for solving vehicle routing problems, in Proceedings of the 35th Conference on Artificial Intelligence (AAAI), (2021), 12042–12049. |
[21] | Y. D. Kwon, J. Choo, B. Kim, I. Yoon, Y. Gwon, S. Min, Pomo: Policy optimization with multiple optima for reinforcement learning, in Proceedings of the 34th Concerence on Neural Information Processing System (NIPS), 33 (2020), 21188–21198. |
[22] | Y. Ma, J. Li, Z. Cao, W. Song, L. Zhang, Z. Chen, J. Tang, Learning to iteratively solve routing problems with dual-aspect collaborative transformer, in Proceedings of the 35th Concerence on Neural Information Processing System (NIPS), 34 (2021), 11096–11107. |
[23] | W. Kool, H. van Hoof, J. Gromicho, M. Welling, Deep policy dynamic programming for vehicle routing problems, in International Conference on Integration of Constraint Programming, Artificial Intelligence, and Operations Research, Springer, (2022), 190–213. https://doi.org/10.1007/978-3-031-08011-1_14 |
[24] | X. Bresson, T. Laurent, The transformer network for the traveling salesman problem, preprint, arXiv: 2103.03012. |
[25] | B. Hudson, Q. Li, M. Malencia, A. Prorok, Graph neural network guided local search for the traveling salesperson problem, preprint, arXiv: 2110.05291. |
[26] | L. Xin, W. Song, Z. Cao, J. Zhang, NeuroLKH: Combining deep learning model with lin-kernighan-helsgaun heuristic for solving the traveling salesman problem, in Proceedings of the 35th Concerence on Neural Information Processing System (NIPS), 34 (2021), 7472–7483. |
[27] | W. Chen, Z. Li, C. Liu, Y. Ai, A deep learning model with conv-LSTM networks for subway passenger congestion delay prediction, J. Adv. Trans., 2021 (2021). https://doi.org/10.1155/2021/6645214 doi: 10.1155/2021/6645214 |
[28] | T. N. Kipf, M. Welling, Semi-supervised classification with graph convolutional networks, in Proceedings of the 4th International Conference on Learning Representations (ICLR), 2016. |
[29] | R. J. Williams, Simple statistical gradient-following algorithms for connectionist reinforcement learning, Mach. Learn., 8 (1992), 229–256. https://doi.org/10.1007/978-1-4615-3618-5_2 doi: 10.1007/978-1-4615-3618-5_2 |
[30] | D. L. Applegate, R. E. Bixby, V. Chvátal, W. Cook, D. G. Espinoza, M. Goycoolea, t al., Certification of an optimal TSP tour through 85,900 cities, Oper. Res. Lett., 37 (2009), 11–15. https://doi.org/10.1016/j.orl.2008.09.006 doi: 10.1016/j.orl.2008.09.006 |
[31] | K. Helsgaun, An extension of the Lin-Kernighan-Helsgaun TSP solver for constrained traveling salesman and vehicle routing problems, Roskilde Univ., 2017 (2017), 24–50. |
[32] | Gurobi Optimization, Gurobi optimizer reference manual, 2016. Available from: http://www.gurobi.com. |
[33] | Google, OR-Tools, 2018. Available from: https://developers.google.com. |
[34] | G. Reinelt, Tspliba traveling salesman problem library, ORSA J. Comput., 3 (1991), 376–384. https://doi.org/10.1287/ijoc.3.4.376 doi: 10.1287/ijoc.3.4.376 |