Proposed big data architecture for facial recognition using machine learning

Suriya Priya R Asaithambi; Sitalakshmi Venkatraman; Ramanathan Venkatraman

doi:10.3934/electreng.2021005

AIMS Electronics and Electrical Engineering

2021, Volume 5, Issue 1: 68-92. doi: 10.3934/electreng.2021005

1. Institute of Systems Science, National University of Singapore, Singapore
2. Department of Information Technology, Melbourne Polytechnic, VIC, Australia

Received: 03 November 2020 Accepted: 11 January 2021 Published: 03 February 2021

Abstract: With the abundance of raw data generated from various sources including social networks, big data has become essential in acquiring, processing, and analyzing heterogeneous data from multiple sources for real-time applications. In this paper, we propose a big data framework suitable for pre-processing and classification of images as well as text analytics by employing two key workflows, called the big data (BD) pipeline and the machine learning (ML) pipeline. Our unique end-to-end workflow integrates data cleansing, data integration, data transformation and data reduction processes, followed by various analytics using suitable machine learning techniques. Further, our model is the first of its kind to augment facial recognition with sentiment analysis in a distributed big data framework. The implementation of our model uses state-of-the-art distributed technologies to ingest, prepare, process and analyze big data for generating actionable data insights by employing relevant ML algorithms such as k-NN, logistic regression and decision tree. In addition, we demonstrate the application of our big data framework to a facial recognition system using open sources by developing a prototype as a use case. We also employ sentiment analysis on non-repetitive semi-structured public data (text) such as user comments, image tagging, and other information associated with the facial images. We believe our work provides a novel approach to intersect big data, ML and face recognition, and that it would stimulate new research to alleviate some of the challenges associated with big data processing in real-world applications.

Citation: Suriya Priya R Asaithambi, Sitalakshmi Venkatraman, Ramanathan Venkatraman. Proposed big data architecture for facial recognition using machine learning[J]. AIMS Electronics and Electrical Engineering, 2021, 5(1): 68-92. doi: 10.3934/electreng.2021005

1. Introduction

In the age of innovation and digital transformation, data is generated in huge volumes and at an increasing velocity, giving rise to the now popular term 'big data'. Recently, big data (BD) related technologies have become a hotspot that attracts great attention from academia, industry and even governments around the world. However, three of the key features of big data (the 3V's), namely multiple sources (Variety), huge volume (Volume) and fast change (Velocity), make it difficult for traditional data processing methods such as data mining to effectively support the processing of heterogeneous big data. To address the computational complexity of big data applications, there is a need to explore new approaches for building scalable big data processing architectures. Apache Spark, along with tools from the Hadoop ecosystem, enables complex analytics processing using in-memory computational techniques. The principal advantage of such big data technologies ^[1,2,3] is their ability to perform computation-intensive operations upon massive data sets in real time with significant accuracy and performance. Therefore, big data technologies could be considered ideal for facial recognition applications that warrant resource-intensive image analysis using machine learning (ML) algorithms on a large corpus of image data collected from multiple internal and external big data sources. However, big data ^[4,5] and facial recognition ^[6] are two disparate technological advancements that have only recently begun to converge.

Facial recognition with machine learning capabilities is a hot research topic due to its various applications in social media, surveillance systems, online shopping, banking, law enforcement, personalized marketing and access control for the Internet of Things within various radical real-world scenarios. For instance, the recent popularity of social networking hubs will require security and law enforcement agencies to apply Artificial Intelligence (AI) to big data streams of facial data in real time. AI-enabled facial recognition is required by the retail and banking industries to understand consumer behavior patterns and improve their personalized products and services. While there is a plethora of facial recognition technologies, they need to be adapted to the massive upsurge of social networks and the Internet of Things, which is accompanied by big data of facial images stored in and retrieved from several intertwined and disparate real-world application domains. Further, the process of facial recognition from a large set of images or videos is complex. Classical ML approaches require domain knowledge of the data to create features, and such techniques are not applicable to the radical applications of the future. Apart from the practical challenges of real-time processing, there are interpersonal variations due to similarities between two persons, such as twins, and intrapersonal variations due to differences between two images of the same person caused by factors such as pose, obstruction, age, expression, quality and noise. Modern ML approaches require automatic extraction, from large image data sets, of features that remain invariant to such variations by adopting novel deep learning techniques. This forms the main motivation of this research work, which proposes a big data framework as an effective solution to the said problem.

In this paper, we leverage recent developments in large public datasets, public social networking media and relevant ML algorithms to transform the conventional view of addressing facial recognition issues with a contemporary perspective. The key contributions of this work are three-fold, forming a modest initial step towards advancing important research in this direction. These are given below:

1. An approach, the first of its kind, to intersect big data, machine learning and facial recognition through the proposal of a big data architecture that employs two main workflow processes, namely the BD pipeline and the ML pipeline.

2. Application of the proposed BD architecture for developing a novel facial recognition prototype as a use case. We develop the prototype as an amalgamation of the BD pipeline, which includes image data pre-processing and real-time data streaming, and the ML pipeline, which processes stored facial images against real-time streams.

3. A unique method that augments image analysis with text analytics on selected attributes, such as social media tweets and face tagging associated with social networks, towards improving facial recognition.

The rest of the paper is organized as follows. In Section 2, we provide a review of related work and the unique contribution of our work. Section 3 describes our proposed big data architecture with the details of the solution model using relevant big data technologies. In Section 4, we demonstrate the application of the proposed model for an effective facial recognition solution using machine learning. Finally, we provide the conclusion of our study in Section 5.

2. Related work

The global market for software that takes advantage of facial recognition is expected to grow from USD 3.85 billion in 2017 to USD 9.78 billion by 2023. The Asia Pacific region, which holds around 16% of the market share, is the fastest-growing region ^[7]. This section is divided into three parts, in which we provide an overview of some existing work from three perspectives: i) feature engineering and hyper parameter tuning for image or video processing, ii) facial recognition and learning algorithms, and iii) big data architecture and technologies.

2.1. Feature engineering and hyper parameter tuning for images and videos

Feature selection reduces the dataset by removing irrelevant or redundant features. Depending on the technique used, face recognition relies on either global or local feature extraction methods. Global feature extraction algorithms used by researchers include Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Canonical Correlation Analysis (CCA), and Two Dimensional PCA (2DPCA) ^[8]. Local feature extraction includes PCA, Support Vector Machines (SVM), Local Binary Pattern (LBP), and Local Binary Pattern Histogram Wavelet Feature (LBPWT) ^[9]. Global features can capture pervasive characteristics of the image or video such as texture, shape and other background information. Local feature methods can use these for guidance and focus processing on a smaller subset of the data.
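To make the idea of a local feature concrete, the following is a minimal sketch of the basic 3x3 Local Binary Pattern operator in plain Python. It is an illustration only and not the implementation used in this work; the function names, the neighbour ordering and the small grayscale patch are assumptions chosen for the example.

```python
from collections import Counter

# Offsets of the 8 neighbours, clockwise starting from the top-left pixel.
# The ordering is a convention chosen for this sketch.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """Return the 8-bit LBP code of the interior pixel at (row, col)."""
    centre = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        # A neighbour at least as bright as the centre sets its bit to 1.
        if image[row + dr][col + dc] >= centre:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Count LBP codes over all interior pixels (a simple local texture descriptor)."""
    rows, cols = len(image), len(image[0])
    codes = [lbp_code(image, r, c)
             for r in range(1, rows - 1)
             for c in range(1, cols - 1)]
    return Counter(codes)


# Hypothetical 4x4 grayscale patch (values 0-255), for illustration only.
patch = [
    [10, 20, 30, 40],
    [20, 50, 60, 30],
    [30, 70, 80, 20],
    [40, 30, 20, 10],
]
histogram = lbp_histogram(patch)
```

In practice the histogram would be computed per image region and the regional histograms concatenated into one feature vector, which is what makes the descriptor local rather than global.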

Recently, a k-NN algorithmic variant was employed for expression mining using facial image tagging and classification in a cloud-hosted Hadoop MapReduce environment ^[10]. The experiment used 3120 images of 120 persons (65 male and 55 female candidates) from the AR public face database, using a combination of PCA, CCA and LDA. This is in contrast to earlier studies, in which PCA has been considered the popular and performant choice for facial image processing ^[11,12].

The combination of multiple local features can also improve the accuracy of face recognition. However, one observed shortcoming is that local features tend to be sensitive to local lighting, expression, posture and other factors, and thus lack robustness. Therefore, the best feature engineering models that deal with facial recognition effectively try to combine the advantages of both local features and global features. Our work attempts to combine selected global and local feature algorithms as appropriate.

2.2. Facial recognition using learning algorithms

Facial recognition is a technology for identifying or verifying a person in images or videos. The first face recognition ^[13] algorithm, published in 1991, used eigenfaces ^[12]. In the past, Parkhi et al. used a convolutional neural network (CNN) ^[14], and He et al. used Laplacian faces ^[15]. However, in recent works, very deep neural networks ^[16] are used to achieve lightweight performance and high accuracy rates through loss correction and hyper parameter tuning ^[17,18,19,20,21,22].

Generally, the process of facial recognition is performed in two key steps: (1) feature extraction and selection, and (2) classification of objects. Recent deep learning developments have introduced several other methods, such as the use of facial recognition algorithms, three-dimensional recognition, skin texture analysis, and thermal cameras. While deep learning for facial recognition is well researched, its influence on feature engineering and hyper parameter tuning has not received the required attention and exploration. This paper attempts to fill that gap by combining deep learning algorithms with enhanced feature tuning.
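The classification step of this two-step process can be sketched with a plain k-nearest-neighbour vote over feature vectors that have already been extracted. This is a minimal illustration under stated assumptions: the two-dimensional feature vectors, the labels and the function names are hypothetical, and a real system would operate on much higher-dimensional features.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest neighbours.

    `train` is a list of (feature_vector, label) pairs produced by step (1).
    """
    nearest = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical, already-extracted features for two identities.
train = [
    ([0.10, 0.20], "alice"), ([0.20, 0.10], "alice"), ([0.15, 0.15], "alice"),
    ([0.90, 0.80], "bob"), ([0.80, 0.90], "bob"), ([0.85, 0.85], "bob"),
]
prediction = knn_predict(train, [0.2, 0.2], k=3)
```

Keeping the classifier independent of how the features were produced means the same voting code works whether the vectors come from PCA, LBP histograms or a deep network embedding.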

2.3. Big data architecture and technologies

Big data architecture and technologies have been well researched in certain domains such as healthcare ^[23], smart initiatives ^[24,25,26], and climate change ^[27]. Researchers have published technical stacks as big data architecture templates using combinations of tools from the Hadoop ecosystem and distributed computing frameworks such as Spark, Flink and Beam. However, there are few studies in the literature on hybrid architectural models that use both batch and real-time processing for facial recognition. Our work is a humble step towards bridging this gap.

3. Proposed big data architecture

Training complex face recognition models on images and videos can take hours, days, or even weeks. In most cases, a single multi-GPU machine is enough to train large models in a reasonable amount of time. However, for more demanding real-time face recognition workloads, spreading computational loads across multiple machines can dramatically reduce training time, enable rapid iterative experimentation, and accelerate deep learning deployments. Thus, big data processing architectures and parallel processing frameworks like MapReduce and Spark play a key role in such frontiers.

3.1. Big data technologies

There are two main approaches to enhance facial recognition, namely model parallelism and data parallelism. The big data architecture relies on distributed cluster computing for the pre-processing and classification of the heterogeneous and disparate big data streams emanating from various data sources. In this context, data parallelism can be achieved with the regular Hadoop MapReduce stack, as proposed in our technical stack. However, when it comes to model parallelism, in-memory frameworks play a key role. Thus, we propose to use Spark in-memory processing to achieve learning model parallelism.
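As a rough illustration of data parallelism only (model parallelism is not shown), the sketch below splits a training set into shards, lets each worker compute a partial gradient for the same linear model, and then combines the partial results. Threads stand in for cluster nodes here; the squared-error model, the toy data and all function names are assumptions made for the example rather than part of the proposed stack.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_gradient(shard, weights):
    """Sum of squared-error gradients over one shard of (features, target) pairs."""
    grad = [0.0] * len(weights)
    for features, target in shard:
        error = sum(w * x for w, x in zip(weights, features)) - target
        for i, x in enumerate(features):
            grad[i] += 2.0 * error * x
    return grad


def data_parallel_gradient(data, weights, num_workers=2):
    """Average gradient computed shard by shard, with every worker using the same weights."""
    # Round-robin split: each worker receives its own slice of the data.
    shards = [data[i::num_workers] for i in range(num_workers)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(lambda s: partial_gradient(s, weights), shards))
    # Combine step: add the per-shard sums, then divide by the total sample count.
    total = [sum(column) for column in zip(*partials)]
    return [g / len(data) for g in total]


# Hypothetical toy data: (features, target) pairs.
data = [([1.0, 2.0], 5.0), ([2.0, 1.0], 4.0), ([3.0, 3.0], 9.0), ([0.0, 1.0], 2.0)]
gradient = data_parallel_gradient(data, [0.0, 0.0], num_workers=2)
```

Because the per-shard sums are simply added, the result matches the gradient computed on a single machine, which is the property that lets the data be spread over many nodes without changing the model.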

To support both data and model parallelism, we adopt the common open-source choice of technologies, namely the Hadoop distributed file system and HBase, for collecting data from heterogeneous data sources in our proposed big data architecture. Our approach is to use a hybrid of both data parallelism and model parallelism, as illustrated in Figure 1, where the parallelism can be achieved using the MapReduce framework of Hadoop. MapReduce is a programming model for processing large datasets that is employed in a variety of real-world tasks. Developers can specify the computation in terms of a map and a reduce function, and the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks ^[10,28,29]. However, past research has focused only on the feature-based batch implementation of facial recognition using MapReduce ^[29,30,31]. For performance reasons, we have chosen Spark over MapReduce for the facial recognition algorithms implemented in this work.

Figure 1. Data vs model parallelism.
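The map and reduce functions described above can be sketched in a single process as follows. This is a toy simulation of the programming model, not Hadoop or Spark code: the shuffle step that a real runtime performs across machines is emulated with a dictionary, and the job (computing one mean feature vector per person), the records and the function names are assumptions made for illustration.

```python
from collections import defaultdict
from functools import reduce


def map_record(record):
    """Map phase: emit (person_id, (feature_vector, 1)) for one image record."""
    person_id, features = record
    yield person_id, (features, 1)


def shuffle(pairs):
    """Group emitted values by key, as the runtime would do between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_values(key, values):
    """Reduce phase: sum vectors and counts for one key, then return the mean vector."""
    def combine(a, b):
        (vec_a, count_a), (vec_b, count_b) = a, b
        return [x + y for x, y in zip(vec_a, vec_b)], count_a + count_b

    total, count = reduce(combine, values)
    return key, [x / count for x in total]


def run_job(records):
    """Run map, shuffle and reduce sequentially over an in-memory list of records."""
    pairs = (pair for record in records for pair in map_record(record))
    return dict(reduce_values(key, values) for key, values in shuffle(pairs).items())


# Hypothetical (person_id, feature_vector) records.
records = [
    ("alice", [1.0, 2.0]), ("bob", [4.0, 4.0]),
    ("alice", [3.0, 4.0]), ("bob", [6.0, 8.0]),
]
centroids = run_job(records)
```

The combine function is associative, so a distributed runtime would be free to apply it to partial groups on different machines before the final reduce.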

Reference	Category	# Features	# Instances	Size (GB)	Framework
^[35]	Missing values	73	2,534	0.0016	-
^[36]	Missing values	9	1,473	0.0000	-
^[38]	Outliers	8	615	0.0000	-
^[39]	Outliers	11	1,025,010	0.0236	-
^[40]	Outliers	11	19,020	0.0014	-
^[41]	Outliers	2000	500,000	3.6000	Apache Spark
^[46]	Data reduction	2000	500,000	3.6000	Apache Spark
^[45]	Data reduction	41	4,856,151	1.4834	Hadoop MapReduce
^[29]	Data reduction	631	65,000,000	7.4460	Hadoop MapReduce
^[44]	Data reduction	631	65,000,000	7.4460	Apache Spark
^[47]	Imbalance data	41	4,000,000	0.7430	Hadoop MapReduce
^[48]	Imbalance data	41	4,000,000	0.7430	Apache Spark
^[49]	Discretization	631	65,000,000	7.4460	Apache Spark

Object	Description
Tweet	The Tweet object has a long list of root-level fields, such as id (unique ID for each tweet), text (tweet content), and created_at (date timestamp of the tweet). Tweet objects are also the parent object of several child objects, including user and media objects.
User	The user object contains Twitter user account metadata describing the referenced user. Fields include name, username, account creation date, number of followers, tweet counts and more.
Media	If a Tweet contains media (such as images), then the media object can be requested using the media.fields parameter and includes fields such as media_key, type and url.
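To show how the three objects in the table relate to each other, the sketch below flattens one tweet into a single record that pairs its text with the URLs of any attached photos. The sample payload is hypothetical and is only assumed to follow the general shape of a Twitter API v2 response (a data object plus an includes object holding users and media); the field names used beyond those listed in the table, such as author_id and attachments, are assumptions that should be checked against the API documentation.

```python
import json

# Hypothetical payload, assumed to resemble a Twitter API v2 tweet lookup response.
SAMPLE = """
{
  "data": {
    "id": "1",
    "text": "Great photo from the conference",
    "created_at": "2021-01-11T08:00:00.000Z",
    "author_id": "42",
    "attachments": {"media_keys": ["3_1"]}
  },
  "includes": {
    "users": [{"id": "42", "name": "Example User", "username": "example_user"}],
    "media": [{"media_key": "3_1", "type": "photo", "url": "https://example.com/face.jpg"}]
  }
}
"""


def flatten_tweet(payload):
    """Join a tweet with its user and media child objects into one flat record."""
    tweet = payload["data"]
    includes = payload.get("includes", {})
    users = {user["id"]: user for user in includes.get("users", [])}
    media = {item["media_key"]: item for item in includes.get("media", [])}
    keys = tweet.get("attachments", {}).get("media_keys", [])
    return {
        "id": tweet["id"],
        "text": tweet["text"],
        "created_at": tweet.get("created_at"),
        "username": users.get(tweet.get("author_id"), {}).get("username"),
        # Keep only photos that actually carry a URL, since these feed image analysis.
        "image_urls": [
            media[key]["url"]
            for key in keys
            if key in media and media[key].get("type") == "photo" and "url" in media[key]
        ],
    }


record = flatten_tweet(json.loads(SAMPLE))
```

A flat record of this kind keeps the text needed for sentiment analysis next to the image references needed for facial recognition, so both pipelines can be driven from the same ingested item.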

[1]	Chen M, Mao S, Liu Y (2014) Big data: A survey. Mobile networks and applications 19: 171-209. doi: 10.1007/s11036-013-0489-0
[2]	McAfee A, Brynjolfsson E, Davenport TH, et al. (2012) Big data: the management revolution. Harvard business review 90: 60-68.
[3]	Venkatraman R, Venkatraman S (2019) Big Data Infrastructure, Data Visualisation and Challenges. Proceedings of the 3rd International Conference on Big Data and Internet of Things, 13-17.
[4]	Labrinidis A, Jagadish HV (2012) Challenges and opportunities with big data. Proceedings of the VLDB Endowment 5: 2032-2033. doi: 10.14778/2367502.2367572
[5]	Venkatraman S, Venkatraman R (2019) Big data security challenges and strategies. AIMS Mathematics 4: 860-879. doi: 10.3934/math.2019.3.860
[6]	Masi I, Wu Y, Hassner T, et al. (2018) Deep face recognition: A survey. 2018 31st SIBGRAPI conference on graphics, patterns and images (SIBGRAPI), 471-478.
[7]	Singh A, Bhadani R (2020) Mobile Deep Learning with TensorFlow Lite, ML Kit and Flutter. Packt Publishing.
[8]	Zhu Y, Jiang Y (2020) Optimization of face recognition algorithm based on deep learning multi feature fusion driven by big data. Image Vision Comput 104: 104023. doi: 10.1016/j.imavis.2020.104023
[9]	Reddy KS, Krishna VV, Kumar VV (2016) A Method for Facial Recognition Based On Local Features. International Journal of Mathematics and Computation 27: 98-109.
[10]	Qateef JS, Kazm AA (2016) Facial expression recognition via mapreduce assisted k-nearest neighbor algorithm. International Journal of Computer Science and Information Security 14: 170.
[11]	Sirovich L, Kirby M (1987) Low-dimensional procedure for the characterization of human faces. JOSA A 4: 519-524. doi: 10.1364/JOSAA.4.000519
[12]	Turk MA, Pentland AP (1991) Face recognition using eigenfaces. Proceedings. 1991 IEEE computer society conference on computer vision and pattern recognition, 586-587. IEEE Computer Society.
[13]	Bruce V, Young A (1986) Understanding face recognition. British journal of psychology 77: 305-327. doi: 10.1111/j.2044-8295.1986.tb02199.x
[14]	Parkhi OM, Vedaldi A, Zisserman A (2015) Deep face recognition.
[15]	He X, Yan S, Hu Y, et al. (2005) Face recognition using Laplacianfaces. IEEE T Pattern Anal 27: 328-340. doi: 10.1109/TPAMI.2005.55
[16]	Deng J, Guo J, Xue N, et al. (2019) Arcface: Additive angular margin loss for deep face recognition. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 4690-4699.
[17]	Zhou E, Cao Z, Yin Q (2015) Naive-deep face recognition: Touching the limit of LFW benchmark or not? arXiv preprint arXiv:1501.04690.
[18]	Wang H, Wang Y, Zhou Z, et al. (2018) Cosface: Large margin cosine loss for deep face recognition. Proceedings of the IEEE conference on computer vision and pattern recognition, 5265-5274.
[19]	Wen Y, Zhang K, Li Z, et al. (2016) A discriminative feature learning approach for deep face recognition. European conference on computer vision, 499-515.
[20]	Deng J, Zhou Y, Zafeiriou S (2017) Marginal loss for deep face recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 60-68.
[21]	Ding H, Zhou SK, Chellappa R (2017) Facenet2expnet: Regularizing a deep face recognition net for expression recognition. 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), 118-126.
[22]	Wang F, Chen L, Li C, et al. (2018) The devil of face recognition is in the noise. Proceedings of the European Conference on Computer Vision (ECCV), 765-780.
[23]	Benhlima L (2018) Big data management for healthcare systems: architecture, requirements, and implementation. Advances in bioinformatics 2018.
[24]	Asaithambi SPR, Venkatraman R, Venkatraman S (2020) MOBDA: Microservice-Oriented Big Data Architecture for Smart City Transport Systems. Big Data and Cognitive Computing 4: 17. doi: 10.3390/bdcc4030017
[25]	Costa C, Santos MY (2016) BASIS: A big data architecture for smart cities. 2016 SAI Computing Conference (SAI), 1247-1256.
[26]	He X, Wang K, Huang H, et al. (2018) QoE-driven big data architecture for smart city. IEEE Commun Mag 56: 88-93. doi: 10.1109/MCOM.2018.1700231
[27]	Lopez D, Manogaran G (2016) Big data architecture for climate change and disease dynamics. The human element of big data: issues, analytics, and performance, 301-331.
[28]	Dean J, Ghemawat S (2008) MapReduce: simplified data processing on large clusters. Communications of the ACM 51: 107-113. doi: 10.1145/1327452.1327492
[29]	Peralta D, Del Río S, Ramírez-Gallego S, et al. (2015) Evolutionary feature selection for big data classification: A mapreduce approach. Math Probl Eng 2015.
[30]	Gao W, Zhao X, Gao Z, et al. (2019) 3D Face Reconstruction From Volumes of Videos Using a Mapreduce Framework. IEEE Access 7: 165559-165570. doi: 10.1109/ACCESS.2019.2938671
[31]	Mahmoud SM, Habeeb RS (2019) Analysis of Large Set of Images Using MapReduce Framework. International Journal of Modern Education and Computer Science 11: 47. doi: 10.5815/ijmecs.2019.12.05
[32]	Apache Spark™. A unified analytics engine for large-scale data processing.
[33]	Hazarika AV, Ram GJSR, Jain E (2017) Performance comparision of Hadoop and spark engine. 2017 International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud)(I-SMAC), 671-674.
[34]	Zaharia M, Chowdhury M, Das T, et al. (2012) Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. 9th USENIX Symposium on Networked Systems Design and Implementation (NSDI 12), 15-28.
[35]	Luengo J, García S, Herrera F (2012) On the choice of the best imputation methods for missing values considering three groups of classification methods. Knowl Inf Syst 32: 77-108. doi: 10.1007/s10115-011-0424-2
[36]	Batista GE, Monard MC (2003) An analysis of four missing data treatment methods for supervised learning. Appl Artif Intell 17: 519-533. doi: 10.1080/713827181
[37]	Wilson DL (1972) Asymptotic properties of nearest neighbor rules using edited data. IEEE T Syst Man Cy B, 408-421.
[38]	Sánchez JS, Barandela R, Marqués AI, et al. (2003) Analysis of new techniques to obtain quality training sets. Pattern Recogn Lett 24: 1015-1022. doi: 10.1016/S0167-8655(02)00225-8
[39]	Garcia S, Derrac J, Cano J, et al. (2012) Prototype selection for nearest neighbor classification: Taxonomy and empirical study. IEEE T Pattern Anal 34: 417-435. doi: 10.1109/TPAMI.2011.142
[40]	Triguero I, Derrac J, Garcia S, et al. (2011) A taxonomy and experimental study on prototype generation for nearest neighbor classification. IEEE T Syst Man Cy C 42: 86-100. doi: 10.1109/TSMCC.2010.2103939
[41]	García-Gil D, Luengo J, García S, et al. (2019) Enabling smart data: noise filtering in big data classification. Inform Sciences 479: 135-152. doi: 10.1016/j.ins.2018.12.002
[42]	Xue B, Zhang M, Browne WN, et al. (2015) A survey on evolutionary computation approaches to feature selection. IEEE T Evolut Comput 20: 606-626. doi: 10.1109/TEVC.2015.2504420
[43]	Navot A, Shpigelman L, Tishby N, et al. (2005) Nearest neighbor based feature selection for regression and its application to neural activity. Advances in neural information processing systems 18: 996-1002.
[44]	Ramírez-Gallego S, García S, Xiong N, et al. (2018) BELIEF: A distance-based redundancy-proof feature selection method for Big Data. arXiv preprint arXiv:1804.05774.
[45]	Triguero I, Peralta D, Bacardit J, et al. (2015) MRPR: a MapReduce solution for prototype reduction in big data classification. Neurocomputing 150: 331-345. doi: 10.1016/j.neucom.2014.04.078
[46]	García-Gil D, Ramírez-Gallego S, García S, et al. (2018) On the Use of Random Discretization and Dimensionality Reduction in Ensembles for Big Data. International Conference on Hybrid Artificial Intelligence Systems, 15-26.
[47]	Triguero I, Galar M, Vluymans S, et al. (2015) Evolutionary undersampling for imbalanced big data classification. 2015 IEEE Congress on Evolutionary Computation (CEC), 715-722.
[48]	Triguero I, Galar M, Merino D, et al. (2016) Evolutionary undersampling for extremely imbalanced big data classification under apache spark. 2016 IEEE Congress on Evolutionary Computation (CEC), 640-647.
[49]	Ramírez-Gallego S, García S, Benítez JM, et al. (2018) A distributed evolutionary multivariate discretizer for big data processing on apache spark. Swarm Evol Comput 38: 240-250. doi: 10.1016/j.swevo.2017.08.005
[50]	Maillo J, Triguero I, Herrera F (2015) A mapreduce-based k-nearest neighbor approach for big data classification. 2015 IEEE Trustcom/BigDataSE/ISPA 2: 167-172. doi: 10.1109/Trustcom.2015.577
[51]	Maillo J, Ramírez S, Triguero I, et al. (2017) kNN-IS: An Iterative Spark-based design of the k-Nearest Neighbors classifier for big data. Knowl-Based Syst 117: 3-15. doi: 10.1016/j.knosys.2016.06.012
[52]	Deng Z, Zhu X, Cheng D, et al. (2016) Efficient kNN classification algorithm for big data. Neurocomputing 195: 143-148. doi: 10.1016/j.neucom.2015.08.112
[53]	Gallego A-J, Calvo-Zaragoza J, Valero-Mas JJ, et al. (2018) Clustering-based k-nearest neighbor classification for large-scale data with neural codes representation. Pattern Recogn 74: 531-543. doi: 10.1016/j.patcog.2017.09.038
[54]	Wang F, Wang Q, Nie F, et al. (2018) Efficient tree classifiers for large scale datasets. Neurocomputing 284: 70-79. doi: 10.1016/j.neucom.2017.12.061
