Facial expression is a crucial means by which human beings convey their mental state, and facial expression recognition has become one of the prominent areas of research in computer vision. However, the task becomes challenging when the given facial image is non-frontal. The influence of pose on facial images is alleviated using the encoder of a generative adversarial network capable of learning pose-invariant representations. State-of-the-art results for image generation are achieved using the StyleGAN architecture. An efficient model is proposed to embed a given image into the latent vector space of StyleGAN. The encoder extracts high-level features of the facial image and encodes them into the latent space. A rigorous analysis of the semantics hidden in the latent space of StyleGAN is performed. Based on this analysis, the facial image is synthesized, and facial expressions are recognized using an expression recognition neural network. The original image is recovered from the features encoded in the latent space. Semantic editing operations such as face rotation, style transfer, face aging, image morphing and expression transfer can be performed on the image generated from the features encoded in the latent space of StyleGAN. An L2 feature-wise loss is applied to ensure the quality of the reconstructed image. The facial image is then fed into an attribute classifier to extract high-level features, and the features are concatenated to perform facial expression classification. Evaluations of the generated results demonstrate that state-of-the-art performance is achieved using the proposed method.
Citation: R Nandhini Abiram, P M Durai Raj Vincent. Identity preserving multi-pose facial expression recognition using fine tuned VGG on the latent space vector of generative adversarial network[J]. Mathematical Biosciences and Engineering, 2021, 18(4): 3699-3717. doi: 10.3934/mbe.2021186
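The semantic editing operations listed in the abstract (image morphing, face rotation, aging) are typically realized as vector arithmetic in the generator's latent space. The sketch below is a minimal, illustrative stand-in using NumPy: the latent codes and the semantic direction vector are toy placeholders, not the actual StyleGAN codes or directions used in the paper.

```python
import numpy as np

def interpolate_latents(w_a, w_b, alpha):
    """Linearly interpolate between two latent codes (image morphing);
    alpha in [0, 1] blends from code A (alpha=0) to code B (alpha=1)."""
    return (1.0 - alpha) * np.asarray(w_a, dtype=np.float64) \
        + alpha * np.asarray(w_b, dtype=np.float64)

def edit_latent(w, direction, strength):
    """Shift a latent code along a semantic direction (e.g. pose or age)."""
    return np.asarray(w, dtype=np.float64) \
        + strength * np.asarray(direction, dtype=np.float64)

# Toy 512-dimensional latent codes standing in for StyleGAN W-space vectors.
w_a = np.zeros(512)
w_b = np.ones(512)
w_mid = interpolate_latents(w_a, w_b, 0.5)   # halfway morph between A and B
print(w_mid[0])                              # 0.5
```

Decoding `w_mid` (or an edited code) with the StyleGAN generator would then yield the morphed or semantically edited image.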
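The L2 feature-wise loss used to ensure reconstruction quality can be sketched as a mean squared error between feature maps of the original and rebuilt images. This is an illustrative NumPy version under the assumption that features have already been extracted (e.g. by the encoder); the function name and toy inputs are hypothetical, not from the paper's code.

```python
import numpy as np

def l2_feature_loss(feat_real, feat_recon):
    """Mean squared error between the feature representation of the
    original image and that of the image reconstructed from the latent
    space (illustrative feature-wise L2 loss)."""
    feat_real = np.asarray(feat_real, dtype=np.float64)
    feat_recon = np.asarray(feat_recon, dtype=np.float64)
    return float(np.mean((feat_real - feat_recon) ** 2))

# Toy 4-dimensional "feature vectors" standing in for encoder activations.
loss = l2_feature_loss([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
print(loss)  # 1.0  (only the last component differs, by 2; 2**2 / 4 = 1)
```

During training, minimizing this loss pushes the latent code toward one whose decoded image matches the input in feature space, which is what preserves identity in the reconstruction.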