Analyzing transport mode choice is crucial for transportation planning and optimization. Traditionally, individuals' transport modes are detected with discrete choice models (DCMs), which rely on individual and household attribute data. Using such attribute data raises privacy concerns and limits the applicability of the models. Moreover, although DCMs provide insight into the impact of variables, their detection results may be biased. Machine learning models are more effective for mode detection, but most lack interpretability. In this study, an interpretable machine learning model is developed to detect the transport modes of individuals. Mobility features of individuals, based on the velocity and acceleration of the center of mass (COM), are newly incorporated into the detection model. These mobility features are combined with multi-source data, including land use mix, GDP, population and online map service data, to form the detection features. Using travel survey data collected in Nanjing, China, in 2015, the fine-grained detection performance of different machine learning models is compared. The results indicate that the deep forest model performs best, achieving an accuracy of 0.82 on the test dataset, which demonstrates the effectiveness of the proposed detection model. Furthermore, t-distributed stochastic neighbor embedding (t-SNE) and ablation experiments are conducted to address the limited interpretability of the machine learning models. The results show that the mobility features of individuals are the most critical features for improving detection performance. This study is valuable for improving the structure of transport modes and maintaining low-carbon and sustainable development in urban traffic systems.
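The ablation experiments mentioned in the abstract can be illustrated with a minimal sketch: a classifier is trained on all features, then retrained with one feature group removed at a time, and the drop in test accuracy indicates how much that group contributes. Everything in the sketch is an assumption made for illustration: the data are synthetic (not the Nanjing survey), the feature group names and column indices are hypothetical, and a simple nearest-centroid classifier stands in for the deep forest model used in the study.

```python
import random

# Hypothetical feature groups; the column indices are illustrative only.
FEATURE_GROUPS = {
    "mobility": [0, 1],       # stand-ins for COM velocity and acceleration
    "land_use": [2],          # stand-in for land use mix
    "socioeconomic": [3, 4],  # stand-ins for GDP and population
}

def make_synthetic_trips(n, seed=0):
    """Generate toy (features, label) pairs; NOT the survey data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randrange(2)  # 0 and 1 are two toy transport modes
        # Only the two mobility columns depend on the label (by construction).
        features = [
            rng.gauss(3.0 * label, 1.0),
            rng.gauss(2.0 * label, 1.0),
            rng.gauss(0.0, 1.0),
            rng.gauss(0.0, 1.0),
            rng.gauss(0.0, 1.0),
        ]
        rows.append((features, label))
    return rows

def fit_centroids(rows, cols):
    """Nearest-centroid stand-in model: mean feature vector per class."""
    sums, counts = {}, {}
    for x, y in rows:
        acc = sums.setdefault(y, [0.0] * len(cols))
        for i, c in enumerate(cols):
            acc[i] += x[c]
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def predict(centroids, x, cols):
    """Return the class whose centroid is closest in squared distance."""
    def dist2(center):
        return sum((x[c] - center[i]) ** 2 for i, c in enumerate(cols))
    return min(centroids, key=lambda y: dist2(centroids[y]))

def accuracy(train, test, cols):
    """Train on the selected columns and score on the test set."""
    centroids = fit_centroids(train, cols)
    hits = sum(predict(centroids, x, cols) == y for x, y in test)
    return hits / len(test)

def ablation(train, test, groups):
    """Accuracy with all features, then with each group removed in turn."""
    all_cols = sorted(c for cols in groups.values() for c in cols)
    results = {"full": accuracy(train, test, all_cols)}
    for name, dropped in groups.items():
        kept = [c for c in all_cols if c not in dropped]
        results["without_" + name] = accuracy(train, test, kept)
    return results

train = make_synthetic_trips(400, seed=1)
test = make_synthetic_trips(200, seed=2)
results = ablation(train, test, FEATURE_GROUPS)
for name, acc in results.items():
    print(f"{name}: {acc:.3f}")
```

Because the toy data make only the mobility columns informative, removing that group should reduce accuracy the most here. That outcome is built into the synthetic data and says nothing about the study's actual results.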
Citation: Yuhang Liu, Jun Chen, Yuchen Wang, Wei Wang. Interpretable machine learning models for detecting fine-grained transport modes by multi-source data[J]. Electronic Research Archive, 2023, 31(11): 6844-6865. doi: 10.3934/era.2023346