Identifying the factors that affect crash severity is essential for pedestrian and traffic safety on urban roads. Urban road crashes in particular result from the complex interaction of many factors, so high-quality data from which these factors can be derived is needed. Accordingly, this study collected crash data containing detailed crash factor information for large urban and mid-level roads. From these data, crash factors covering driver, vehicle, road, environment, and crash characteristics were constructed to develop a crash severity prediction model, which allowed the study to identify the factors affecting the severity of urban road crashes in greater detail. The crash severity model was developed with both machine learning and statistical models because recent techniques and traditional methods yield different insights. Specifically, a binary logit model, a support vector machine, and extreme gradient boosting were developed using key variables derived from multiple correspondence analysis and Boruta-SHapley Additive exPlanations. The main result shows that crash severity decreased at four-street intersections and where traffic segregation facilities were installed. These findings can be used to establish traffic safety management strategies that reduce the severity of crashes on urban roads.
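As a rough illustration of how a binary logit model links a crash factor to severity, the following minimal sketch fits one by gradient ascent on synthetic data. Everything in it is an assumption made for illustration: the two binary features (`four_way`, `segregation`), the effect sizes used to generate the data, and the helper names are hypothetical and are not taken from the study's dataset or code. It only shows how a negative coefficient, or equivalently an odds ratio below 1, corresponds to lower odds of a severe crash.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_binary_logit(X, y, lr=1.0, epochs=500):
    """Fit intercept and coefficients by batch gradient ascent on the log-likelihood."""
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)  # beta[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], xi)))
            err = yi - p  # gradient of the log-likelihood w.r.t. the linear predictor
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def make_synthetic_crashes(n=2000, seed=0):
    """Generate toy crash records; y = 1 marks a severe crash (hypothetical data)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        four_way = rng.randint(0, 1)     # hypothetical: crash at a four-way intersection
        segregation = rng.randint(0, 1)  # hypothetical: segregation facility present
        # Assumed effects, chosen only for illustration (not estimates from the paper).
        logit = 0.5 - 0.8 * four_way - 1.0 * segregation
        y.append(1 if rng.random() < sigmoid(logit) else 0)
        X.append([four_way, segregation])
    return X, y


X, y = make_synthetic_crashes()
beta = fit_binary_logit(X, y)
# exp(coefficient) is the odds ratio of a severe crash when the factor is present.
odds_ratios = [math.exp(b) for b in beta[1:]]
print("coefficients:", beta)
print("odds ratios:", odds_ratios)
```

In practice a statistics package would also report standard errors and significance tests for each coefficient, which this sketch omits.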
Citation: Nuri Park, Junhan Cho, Juneyoung Park. Assessing crash severity of urban roads with data mining techniques using big data from in-vehicle dashcam[J]. Electronic Research Archive, 2024, 32(1): 584-607. doi: 10.3934/era.2024029