
Trusted emotion recognition based on multiple signals captured from video and its application in intelligent education

  • Received: 07 February 2024 Revised: 19 March 2024 Accepted: 15 May 2024 Published: 29 May 2024
  • Emotional variation can reflect shifts in mental and emotional states, and it plays an important role in intelligent education. Emotion recognition can give teachers cues to evaluate the learning state and to analyze learning motivation, interest, and efficiency. Although emotion recognition has been studied for a long time, little emphasis has been placed on analyzing the credibility of the recognized emotions. In this paper, the origin, development, and application of emotion recognition are introduced. Then, multiple signals captured from video that can reflect emotion changes are described in detail, and their advantages and disadvantages are discussed. Moreover, a comprehensive summary of the applications of, and research on, emotion recognition technology in education is provided. Finally, the trend of emotion recognition in the field of education is outlined.

    Citation: Junjie Zhang, Cheng Fei, Yaqian Zheng, Kun Zheng, Mazhar Sarah, Yu Li. Trusted emotion recognition based on multiple signals captured from video and its application in intelligent education[J]. Electronic Research Archive, 2024, 32(5): 3477-3521. doi: 10.3934/era.2024161



    [146] Z. Jia, Y. Lin, J. Wang, Z. Feng, X. Xie, C. Chen, HetEmotionNet: Two-stream heterogeneous graph recurrent neural network for multi-modal emotion recognition, in Proceedings of the 29th ACM International Conference on Multimedia, (2021), 1047−1056.
    [147] M. Soleymani, M. Pantic, T. Pun, Multimodal emotion recognition in response to videos, IEEE Trans. Affect. Comput., 3 (2011), 211−223. doi: 10.1109/t-affc.2011.37
    [148] W. L. Zheng, W. Liu, Y. Lu, B. L. Lu, A. Cichocki, Emotionmeter: A multimodal framework for recognizing human emotions, IEEE Trans. Cybern., 49 (2018), 1110−1122. doi: 10.1109/tcyb.2018.2797176
    [149] Q. Wang, M. Wang, Y. Yang, X. Zhang, Multi-modal emotion recognition using EEG and speech signals, Comput. Biol. Med., 149 (2022), 105907. doi: 10.1016/j.compbiomed.2022.105907
    [150] S. Scrimin, U. Moscardino, L. Finos, L. Mason, Effects of psychophysiological reactivity to a school-related stressor and temperament on early adolescents' academic performance, J. Early Adolesc., 39 (2019), 904−931. doi: 10.1177/0272431618797008
    [151] B. Cowley, N. Ravaja, T. Heikura, Cardiovascular physiology predicts learning effects in a serious game activity, Comput. Educ., 60 (2013), 299−309. doi: 10.1016/j.compedu.2012.07.014
    [152] K. N. Cranford, J. M. Tiettmeyer, B. C. Chuprinko, S. Jordan, N. P. Grove, Measuring load on working memory: The use of heart rate as a means of measuring chemistry students' cognitive load, J. Chem. Educ., 91 (2014), 641−647. doi: 10.1021/ed400576n
    [153] N. Thompson, T. J. McGill, Genetics with Jean: The design, development and evaluation of an affective tutoring system, Educ. Technol. Res. Dev., 65 (2017), 279−299. doi: 10.1007/s11423-016-9470-5
    [154] A. Versluis, B. Verkuil, P. Spinhoven, J. F. Brosschot, Feasibility and effectiveness of a worry-reduction training using the smartphone: A pilot randomised controlled trial, Br. J. Guid. Couns., 48 (2020), 227−239. doi: 10.1080/03069885.2017.1421310
    [155] K. Fromel, Z. Svozil, F. Chmelik, L. Jakubec, D. Groffik, The role of physical education lessons and recesses in school lifestyle of adolescents, J. School Health, 86 (2016), 143−151. doi: 10.1111/josh.12362
    [156] M. Slingerland, L. Haerens, G. Cardon, L. Borghouts, Differences in perceived competence and physical activity levels during single-gender modified basketball game play in middle school physical education, Eur. Phys. Educ. Rev., 20 (2014), 20−35. doi: 10.1177/1356336x13496000
    [157] P. Klein, J. Viiri, S. Mozaffari, A. Dengel, J. Kuhn, Instruction-based clinical eye-tracking study on the visual interpretation of divergence: How do students look at vector field plots?, Phys. Rev. Phys. Educ. Res., 14 (2018), 010116. doi: 10.1103/physrevphyseducres.14.010116
    [158] A. I. Molina, O. Navarro, M. Ortega, M. Lacruz, Evaluating multimedia learning materials in primary education using eye tracking, Comput. Stand. Int., 59 (2018), 45−60. doi: 10.1016/j.csi.2018.02.004
    [159] L. Mason, P. Pluchino, M. C. Tornatora, Using eye-tracking technology as an indirect instruction tool to improve text and picture processing and learning, Br. J. Educ. Technol., 47 (2016), 1083−1095. doi: 10.1111/bjet.12271
    [160] M. Van Wermeskerken, T. Van Gog, Seeing the instructor's face and gaze in demonstration video examples affects attention allocation but not learning, Comput. Educ., 113 (2017), 98−107. doi: 10.1016/j.compedu.2017.05.013
    [161] V. Clinton, J. L. Cooper, J. E. Michaelis, M. W. Alibali, M. J. Nathan, How revisions to mathematical visuals affect cognition: Evidence from eye tracking, in Eye-Tracking Technology Applications in Educational Research, (2017), 195−218.
    [162] Y. C. Jian, Eye-movement patterns and reader characteristics of students with good and poor performance when reading scientific text with diagrams, Read. Writ., 30 (2017), 1447−1472. doi: 10.1007/s11145-017-9732-6
    [163] J. M. Karch, J. C. Garcia Valles, H. Sevian, Looking into the black box: Using gaze and pupillometric data to probe how cognitive load changes with mental tasks, J. Chem. Educ., 96 (2019), 830−840. doi: 10.1021/acs.jchemed.9b00014
    [164] K. Krstic, A. Soskic, V. Kovic, K. Holmqvist, All good readers are the same, but every low-skilled reader is different: an eye-tracking study using PISA data, Eur. J. Psychol. Educ., 33 (2018), 521−541. doi: 10.1007/s10212-018-0382-0
    [165] X. Zhu, Z. Chen, Dual-modality spatiotemporal feature learning for spontaneous facial expression recognition in e-learning using hybrid deep neural network, Vis. Comput., 36 (2020), 743−755. doi: 10.1007/s00371-019-01660-3
    [166] B. T. Shobana, G. A. Kumar, I-Quiz: An intelligent assessment tool for non-verbal behaviour detection, Comput. Syst. Sci. Eng., 40 (2022), 1007−1021. doi: 10.32604/csse.2022.019523
    [167] T. S. Ashwin, R. M. R. Guddeti, Impact of inquiry interventions on students in e-learning and classroom environments using affective computing framework, User Model. User-Adap. Int., 30 (2020), 759−801. doi: 10.1007/s11257-019-09254-3
    [168] I. Alkabbany, A. Ali, A. Farag, I. Bennett, M. Ghanoum, A. Farag, Measuring student engagement level using facial information, in 2019 IEEE International Conference on Image Processing (ICIP), (2019), 3337−3341.
  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (