The working space in a vertical elevator shaft is small and its components are complex. Occlusion is frequent and different behaviors have similar spatial characteristics, so unsafe behavior is hard to detect, which seriously threatens the safety of maintenance personnel working in the elevator. This paper proposes a behavior detection algorithm for elevator maintenance personnel based on a first-order deep network architecture (FOA-BDNet). First, a lightweight backbone feature extraction network is designed to meet the online, real-time requirements of detection on monitoring video streams from the elevator maintenance environment. Then, a "far intersection and close connection" feature fusion network structure is proposed to fuse fine-grained information with coarse-grained information and to enhance the expressive power of deep semantic features. Finally, a first-order deep object detection algorithm adapted to the elevator scene is designed to identify and localize the behavior of maintenance personnel and to detect unsafe behaviors correctly. Experiments show that the detection accuracy on the self-built dataset is 98.68%, which is 4.41% higher than that of the latest object detection model YOLOv8-s, and that the inference speed reaches 69.51 fps, so the model can be readily deployed on common edge devices and meets the real-time detection requirements for unsafe behaviors of maintenance personnel in elevator scenes.
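As a rough check on the real-time claim, the reported 69.51 fps can be converted into an average per-frame time budget and compared with a camera frame rate. The sketch below is illustrative arithmetic only and is not part of FOA-BDNet: the helper names are hypothetical, and the 25 fps and 30 fps stream rates are assumptions, since the abstract does not state the frame rate of the monitoring video.

```python
def frame_budget_ms(fps: float) -> float:
    """Average time available per frame, in milliseconds, at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def keeps_up(detector_fps: float, stream_fps: float) -> bool:
    """True if the detector processes frames at least as fast as they arrive."""
    return detector_fps >= stream_fps


REPORTED_FPS = 69.51                # inference speed reported in the abstract
ASSUMED_STREAM_FPS = (25.0, 30.0)   # assumption: typical camera rates, not from the paper

if __name__ == "__main__":
    print(f"per-frame time at {REPORTED_FPS} fps: {frame_budget_ms(REPORTED_FPS):.2f} ms")
    for stream_fps in ASSUMED_STREAM_FPS:
        ok = keeps_up(REPORTED_FPS, stream_fps)
        headroom = REPORTED_FPS / stream_fps
        print(f"stream at {stream_fps:.0f} fps: keeps up = {ok}, headroom = {headroom:.2f}x")
```

Under these assumptions, one frame takes about 14.4 ms on average, which leaves more than twice the throughput needed for a 25 fps or 30 fps stream. Throughput on a specific edge device would still have to be measured on that device.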
Citation: Zengming Feng, Tingwen Cao. FOA-BDNet: A behavior detection algorithm for elevator maintenance personnel based on first-order deep network architecture[J]. AIMS Mathematics, 2024, 9(11): 31295-31316. doi: 10.3934/math.20241509