Wearable on-device deep learning system for hand gesture recognition based on FPGA accelerator

Weibin Jiang; Xuelin Ye; Ruiqi Chen; Feng Su; Mengru Lin; Yuhanxiao Ma; Yanxiang Zhu; Shizhen Huang

doi:10.3934/mbe.2021007

Mathematical Biosciences and Engineering

2021, Volume 18, Issue 1: 132-153. doi: 10.3934/mbe.2021007


Research article


1. College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, China
2. Department of Statistics, University of Warwick, CV4 7AL, United Kingdom
3. VeriMake Research, Nanjing Qujike Info-tech Co., Ltd., Nanjing 210088, China
4. Tsinghua-Berkeley Shenzhen Institute, Tsinghua University, Shenzhen 518055, China
5. Gallatin School of Individualized Study, New York University, NY 10012, United States

Received: 28 August 2020 Accepted: 12 November 2020 Published: 23 November 2020

Gesture recognition is critical in the field of human-computer interaction, especially in healthcare, rehabilitation, and sign language translation. Conventionally, the gesture data collected by inertial measurement unit (IMU) sensors is relayed to the cloud or to a remote device with higher computing power to train models. However, this arrangement is inconvenient for remote follow-up treatment in movement rehabilitation training. In this paper, based on a field-programmable gate array (FPGA) accelerator and the Cortex-M0 IP core, we propose a wearable deep learning system that is capable of processing data locally on the end device. With a pre-stage processing module and a serial-parallel hybrid method, the device offers low power consumption and low latency at the micro-control unit (MCU) level, yet it meets or exceeds the performance of single-board computers (SBCs). For example, its performance is more than twice that of the Cortex-A53 (which is commonly used in the Raspberry Pi). Moreover, a convolutional neural network (CNN) and a multilayer perceptron neural network (NN) are used in the recognition model to extract features and classify gestures, which helps achieve a high recognition accuracy of 97%. Finally, this paper offers a software-hardware co-design method that can serve as a reference for the design of edge devices in other scenarios.
Keywords:
- micro-control unit (MCU)
- accelerator
- inertial measurement unit (IMU)
- field-programmable gate array (FPGA)
- gesture recognition
- convolutional neural network (CNN)
Citation: Weibin Jiang, Xuelin Ye, Ruiqi Chen, Feng Su, Mengru Lin, Yuhanxiao Ma, Yanxiang Zhu, Shizhen Huang. Wearable on-device deep learning system for hand gesture recognition based on FPGA accelerator[J]. Mathematical Biosciences and Engineering, 2021, 18(1): 132-153. doi: 10.3934/mbe.2021007
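
The abstract describes a recognition model in which a CNN stage extracts features from IMU data and a multilayer perceptron classifies the gesture. The following is a minimal software sketch of that general pipeline shape, not the authors' model or their FPGA implementation. Every concrete choice here is an assumption made for illustration: the 6 input channels, the 64-sample window, the kernel size, the filter and hidden-layer counts, the 4 gesture classes, the global max pooling step, and the random (untrained) weights.

```python
import math
import random


def conv1d(x, w, b):
    """Valid 1-D convolution over time.
    x: T rows of C_in values; w: K x C_in x C_out weights; b: C_out biases.
    Returns T-K+1 rows of C_out values."""
    k, c_in, c_out = len(w), len(w[0]), len(b)
    out = []
    for t in range(len(x) - k + 1):
        row = []
        for o in range(c_out):
            acc = b[o]
            for i in range(k):
                for c in range(c_in):
                    acc += x[t + i][c] * w[i][c][o]
            row.append(acc)
        out.append(row)
    return out


def relu(v):
    return [max(a, 0.0) for a in v]


def dense(v, w, b):
    """Fully connected layer: v (n_in) times w (n_in x n_out) plus b."""
    return [sum(v[i] * w[i][j] for i in range(len(v))) + b[j]
            for j in range(len(b))]


def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    e = [math.exp(a - m) for a in z]
    s = sum(e)
    return [a / s for a in e]


def rand_matrix(rng, rows, cols):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def init_params(rng, c_in=6, kernel=5, filters=8, hidden=16, classes=4):
    # All sizes are hypothetical; the source does not state them.
    return {
        "conv_w": [rand_matrix(rng, c_in, filters) for _ in range(kernel)],
        "conv_b": [0.0] * filters,
        "fc1_w": rand_matrix(rng, filters, hidden),
        "fc1_b": [0.0] * hidden,
        "fc2_w": rand_matrix(rng, hidden, classes),
        "fc2_b": [0.0] * classes,
    }


def classify(window, p):
    # CNN stage: convolution + ReLU extracts per-time-step features.
    feat = [relu(row) for row in conv1d(window, p["conv_w"], p["conv_b"])]
    # Global max pooling over time (an assumed choice) gives one vector.
    pooled = [max(col) for col in zip(*feat)]
    # MLP stage: one hidden layer, then class probabilities.
    hidden = relu(dense(pooled, p["fc1_w"], p["fc1_b"]))
    return softmax(dense(hidden, p["fc2_w"], p["fc2_b"]))


rng = random.Random(0)
params = init_params(rng)
# Synthetic stand-in for one IMU window: 64 samples x 6 channels.
window = [[rng.gauss(0.0, 1.0) for _ in range(6)] for _ in range(64)]
probs = classify(window, params)
print(len(probs), max(range(len(probs)), key=probs.__getitem__))
```

Because the weights are random, the predicted class is meaningless; the sketch only shows how a window of sensor samples flows through a convolutional feature extractor and an MLP classifier to produce one probability per gesture class.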



© 2021 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)