
Motor Imagery EEG (MI-EEG) classification plays an important role in various Brain-Computer Interface (BCI) systems. Recently, deep learning has been widely used for MI-EEG classification; however, it requires a large number of labeled training samples, which are difficult to obtain, and insufficient labeled training samples degrade classification performance. To address this degradation problem, we investigate a Self-Supervised Learning (SSL) based MI-EEG classification method that reduces the dependence on large numbers of labeled training samples. The proposed method consists of a pretext task and a downstream classification task. In the pretext task, each MI-EEG is rearranged according to its temporal characteristics, and a network is pre-trained on the original and rearranged MI-EEGs. In the downstream task, an MI-EEG classification network is first initialized with the network learned in the pretext task and then trained on a small number of labeled training samples. A series of experiments is conducted on Data sets 1 and 2b of BCI competition IV and Data set IVa of BCI competition III. With one third of the labeled training samples, the proposed method achieves an obvious improvement over the baseline network trained without SSL. Experiments under different percentages of labeled training samples show that the designed SSL strategy is effective and improves classification performance.
Citation: Yanghan Ou, Siqin Sun, Haitao Gan, Ran Zhou, Zhi Yang. An improved self-supervised learning for EEG classification[J]. Mathematical Biosciences and Engineering, 2022, 19(7): 6907-6922. doi: 10.3934/mbe.2022325
Electroencephalography (EEG), as a non-invasive and cost-effective approach, is commonly used in various fields such as rehabilitation [1] and disease diagnosis [2]. Motor Imagery EEG (MI-EEG) classification plays an important role in different Brain-Computer Interface (BCI) systems [3]. The goal of an MI-EEG based BCI system is to control external devices [4]. Due to the non-stationarity, nonlinearity, and randomness of MI-EEGs [5], accurately classifying MI-EEGs is a crucial step in any MI-EEG based BCI system.
Up to now, many MI-EEG classification methods have been proposed. Among traditional methods, Müller-Gerking et al. [6] proposed the Common Spatial Pattern (CSP) method for single-trial MI-EEG classification. Huang et al. [7] employed the Surface Laplace Transform (SLT) and Power Spectral Density (PSD) to extract MI-EEG features. Chatterjee et al. [8] used wavelet energy, root mean square error, and average power for feature extraction, and a Support Vector Machine (SVM) to classify left- and right-hand MI-EEGs. Recently, deep learning has developed rapidly in the computer vision and signal processing fields [9,10] and provides an effective tool for MI-EEG classification [11,12]. Schirrmeister et al. [13] found that Batch Normalisation (BN) [14] and Exponential Linear Units (ELU) [15] could effectively improve the representation capability of a Convolutional Neural Network (CNN). In [16], Augmented CSP (ACSP) was first used to extract features, and a CNN was then used to classify the MI-EEGs. Besides CNNs, a deep belief network has been used for MI-EEG classification [17]. Li et al. [18] utilized spatial location and time-frequency information to build 2D images, which were then used as inputs to train a network. Lawhern et al. [19] proposed EEGNet to identify the paradigm of EEGs. All of the above MI-EEG classification methods are supervised, and the performance of supervised learning relies on a large number of labeled training samples.
However, accurate MI-EEG labeling is time-consuming and expensive. For example, labeling sleep EEG requires professional technicians to inspect hours of recordings and mark 30-second windows one by one [20]. With only a small number of labeled training samples, the performance of supervised learning may be poor and unstable. In recent years, Self-Supervised Learning (SSL) has proven to be an effective strategy for improving performance when labeled training samples are scarce [21]. SSL generally consists of two parts: a pretext task and a downstream task. A network is learned through the pretext task and transferred to the downstream task. Pretext tasks fall mainly into two categories: generative and contrastive. Typical generative methods include colorization and auto-encoders. In typical contrastive methods, training samples are generated according to some strategy and used to train the pretext network. Defining a suitable pretext task is the core of SSL.
Recently, SSL has been widely used in computer vision, natural language processing, and other fields. Noroozi et al. [22] designed a pretext task based on solving jigsaw puzzles. Li et al. [23] proposed a Twin-Cycle Autoencoder (TCAE) to learn a pretext network from videos without manual annotation. Banville et al. [24] constructed a time-contrastive SSL method for classifying EEGs. In the MI-EEG classification field, few SSL-based studies have been reported. Therefore, it is worthwhile to design an SSL strategy to improve MI-EEG classification performance.
This paper proposes a novel SSL approach to alleviate the performance degradation caused by a small number of labeled training samples. A Temporal-Rearrange based MI-EEG network (TRMINet) is designed. A pretext task based on the temporal characteristics of MI-EEGs is first built to learn a pretext network from all MI-EEGs without using labels. The pretext network is then used to initialize the downstream MI-EEG classification network. Finally, the downstream network is trained using a small number of labeled training samples. The main contributions are outlined as follows:
● A pretext task is designed according to the temporal characteristics of MI-EEGs.
● The differences in MI-EEG representation ability among different CNNs are investigated.
● The impact of the pretext task on the classification performance of the downstream task is analyzed.
The rest of the paper is organized as follows. Some of the networks used in this paper are described in Section 2. Section 3 describes the pretext and downstream tasks. The details of the datasets are described in Section 4. The experimental configurations and results are described in Section 5. The conclusion and future directions are given in Section 6.
This section describes the networks used in the experiments. Many classic networks have been proposed in deep learning, such as AlexNet [25], GoogLeNet [26] and ResNet [9]. AlexNet used ReLU [27] as the activation function and was trained in parallel on two GPUs; the ReLU function reduced the computational complexity of the network and the interdependence of its parameters. GoogLeNet introduced the inception architecture, which further reduced computational complexity, and used BN [14] to standardize the output mean and variance of each layer, alleviating the vanishing-gradient problem. ResNet introduced residual connections to ease the training of deep networks. EfficientNet [10], proposed by the Google team in 2019, balances network width, depth, and resolution to improve performance and can attend to more image detail than other scaling methods. Its convolutional layers use the MobileNet Convolution (MBConv) block, except for the first layer, which uses a normal convolution; MBConv comprises a 1×1 pointwise convolution, a depthwise convolution [28], and a Squeeze-and-Excitation (SE) module [29]. The CNNs commonly used for EEG classification include DeepConvNet, ShallowConvNet [13] and EEGNet [19]. In particular, EEGNet attempts to classify EEGs from several paradigms with a single network; it uses depthwise and separable convolutions instead of traditional ones, with ELU [15] as the activation function. The architecture of EEGNet is shown in Figure 1, and the parameters of each layer are given in Table 1, followed by a code sketch of this architecture.
Table 1. The parameters of each layer of EEGNet.

| Layer | Operator | Output Channels | Kernel Size | Stride |
|-------|----------|-----------------|-------------|--------|
| 1 | Conv & BN | 8 | (1, 64) | 1 |
| 2 | Depthwise Conv & BN & ELU | 16 | (3, 1) | 1 |
| 3 | AvgPool & Dropout | 16 | (1, 4) | 4 |
| 4 | Depthwise Conv | 16 | (1, 16) | 1 |
| 5 | Separable Conv & BN & ELU | 16 | (1, 1) | 1 |
| 6 | AvgPool & Dropout | 16 | (1, 8) | 8 |
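To make the architecture in Table 1 concrete, a minimal PyTorch sketch is given below. The input layout (batch, 1, channels, time), the padding choices, the dropout rate, and the single-logit head used for the ±1 label formulation of Section 3 are assumptions, since Table 1 specifies only the convolutional backbone.

```python
import torch
import torch.nn as nn

class EEGNetSketch(nn.Module):
    """Sketch of the EEGNet variant in Table 1 (three EEG channels)."""

    def __init__(self, n_channels=3, dropout=0.5):
        super().__init__()
        self.features = nn.Sequential(
            # Layer 1: temporal convolution + BN
            nn.Conv2d(1, 8, kernel_size=(1, 64), padding=(0, 32), bias=False),
            nn.BatchNorm2d(8),
            # Layer 2: depthwise convolution across EEG channels + BN + ELU
            nn.Conv2d(8, 16, kernel_size=(n_channels, 1), groups=8, bias=False),
            nn.BatchNorm2d(16),
            nn.ELU(),
            # Layer 3: average pooling + dropout
            nn.AvgPool2d(kernel_size=(1, 4), stride=(1, 4)),
            nn.Dropout(dropout),
            # Layers 4-5: separable conv = depthwise (1, 16) + pointwise (1, 1)
            nn.Conv2d(16, 16, kernel_size=(1, 16), groups=16,
                      padding=(0, 8), bias=False),
            nn.Conv2d(16, 16, kernel_size=(1, 1), bias=False),
            nn.BatchNorm2d(16),
            nn.ELU(),
            # Layer 6: average pooling + dropout
            nn.AvgPool2d(kernel_size=(1, 8), stride=(1, 8)),
            nn.Dropout(dropout),
        )
        self.classifier = nn.LazyLinear(1)  # single logit for +/-1 labels

    def forward(self, x):  # x: (batch, 1, n_channels, T)
        return self.classifier(self.features(x).flatten(start_dim=1))
```

For a trial of 3 channels and 800 sampling points, `EEGNetSketch()(torch.randn(4, 1, 3, 800))` returns a (4, 1) tensor of logits.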
Meanwhile, a glossary of all abbreviations used in this paper is given in Table 2.
Table 2. Glossary of abbreviations.

| Abbreviation | Explanation |
|--------------|-------------|
| MI-EEG | Motor Imagery EEG |
| SSL | Self-supervised Learning |
| BCI | Brain-Computer Interface |
| CSP | Common Spatial Pattern |
| SLT | Surface Laplace Transform |
| PSD | Power Spectral Density |
| SVM | Support Vector Machine |
| BN | Batch Normalisation |
| ELU | Exponential Linear Units |
| CNN | Convolutional Neural Network |
| ACSP | Augmented Common Spatial Pattern |
| TCAE | Twin-Cycle Autoencoder |
| TRMINet | Temporal Rearrange Motor Imagery Network |
| MBConv | MobileNet Convolution |
| SE | Squeeze-and-Excitation |
| TR | Temporal Rearrange |
| ERS | Event-Related Synchronization |
| ERD | Event-Related Desynchronization |
| ICA | Independent Component Analysis |
| ACC | Accuracy |
| AUC | Area Under the Curve |
| CI | Confidence Interval |
To address the performance degradation under a small number of the labeled training samples, we design a self-supervised network framework, as shown in Figure 2. Firstly, MI-EEGs are pre-processed by denoising and filtering. In the pretext task, the Temporal Rearrange (TR) approach generates the rearranged MI-EEGs. The network of the pretext task is then trained to learn the MI-EEG representation using the original and rearranged MI-EEGs. The network of the downstream task is initialized by the pretext one and trained using the labeled MI-EEGs.
Formally, $X = \{(x_i, y_i) \mid i \in [1, n],\ x_i \in \mathbb{R}^{C \times T},\ y_i \in \{-1, 1\}\}$ denotes the MI-EEG dataset, where $x_i$ is an MI-EEG trial, $y_i \in \{-1, 1\}$ is the label of $x_i$, $C$ is the number of MI-EEG channels, and $T$ is the number of sampling points of each MI-EEG.
The framework of the pretext task is shown in Figure 3. To generate labeled samples from multichannel MI-EEGs, we propose a Temporal Rearrange (TR) method. A fixed-length sliding window $H$ is used to divide each original MI-EEG into multiple time-window blocks, and a new MI-EEG is obtained by rearranging these blocks. The label of each original MI-EEG is set to $1$, and that of each rearranged MI-EEG is set to $-1$. The samples and their labels are denoted by $x_i'$ and $y_i'$, respectively, and the number of generated samples is $n'$, so the training set of the pretext task is $Z = \{(x_i', y_i') \mid i \in [1, n'],\ x_i' \in \mathbb{R}^{C \times T},\ y_i' \in \{-1, 1\}\}$. A sketch of this generation step follows.
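The following NumPy sketch illustrates the TR generation step. The window length and the uniformly random permutation are assumptions: the paper fixes the window length $H$ but does not state its value or the exact rearrangement strategy.

```python
import numpy as np

def temporal_rearrange(x, window_len, rng=None):
    """Split one trial x of shape (C, T) into fixed-length time-window
    blocks and shuffle their order. T is assumed divisible by window_len
    (e.g., 800 points with a hypothetical window of 100)."""
    rng = np.random.default_rng() if rng is None else rng
    blocks = np.split(x, x.shape[1] // window_len, axis=1)
    order = rng.permutation(len(blocks))
    return np.concatenate([blocks[i] for i in order], axis=1)

def build_pretext_set(trials, window_len=100):
    """Assign label +1 to each original trial and -1 to its rearranged
    version, yielding the pretext training set Z."""
    xs, ys = [], []
    for x in trials:
        xs.extend([x, temporal_rearrange(x, window_len)])
        ys.extend([1, -1])
    return np.stack(xs), np.array(ys)
```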
In this paper, EEGNet is used as the baseline network. The prediction function is denoted as $\hat{Y} = w^T F_\Theta(Z) + w_0$, where $F_\Theta$ is the convolutional function of EEGNet with parameters $\Theta$, and $w$ and $w_0$ are the weight and bias of the fully connected layer, respectively. The full set of weight parameters is denoted as $W = [\Theta, w, w_0]$. Cross entropy is used as the loss function of the pretext task, calculated by Eq (1).
$$L(\Theta, w, w_0) = \sum_{i=1}^{n'} \log\left(1 + \exp\left(-y_i' \left[w^T F_\Theta(x_i') + w_0\right]\right)\right) \qquad (1)$$
The optimal network parameters $W^* = [\Theta^*, w^*, w_0^*]$ are obtained by training the pretext network.
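Eq (1) is the logistic (soft-margin) loss over ±1 labels, so the pretext loss can be sketched in one line of PyTorch; here `logits` stands for the network output $w^T F_\Theta(x_i') + w_0$.

```python
import torch
import torch.nn.functional as F

def pretext_loss(logits, labels):
    """Eq (1): sum_i log(1 + exp(-y_i' * logits_i)), labels in {-1, +1}.
    softplus(z) = log(1 + exp(z)), so this is a numerically stable form;
    it matches torch.nn.SoftMarginLoss(reduction='sum')."""
    return F.softplus(-labels * logits).sum()
```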
The downstream task aims to classify MI-EEGs by training an optimal network. The downstream network has the same architecture as the pretext network. Its convolutional function is denoted as $F_{\Theta_d}$ with parameters $\Theta_d$, and the weight and bias of its fully connected layer are denoted as $w_d$ and $w_{0d}$, respectively, so the weight parameters are $W_d = [\Theta_d, w_d, w_{0d}]$. The downstream network is initialized with the pretext one, i.e., $W_d = [\Theta^*, w^*, w_0^*]$, and then fine-tuned using the labeled MI-EEGs. Cross entropy is used as the loss function of the downstream task, calculated by Eq (2).
$$L(\Theta_d, w_d, w_{0d}) = \sum_{i=1}^{t} \log\left(1 + \exp\left(-y_i \left[(w^*)^T F_{\Theta^*}(x_i) + w_0^*\right]\right)\right) \qquad (2)$$
where $t$ is the number of labeled training samples. After fine-tuning, the optimal network parameters $W''$ are obtained and used to predict the labels of MI-EEGs. A sketch of this initialize-then-fine-tune step follows.
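A sketch of the downstream step, under stated assumptions: the weight transfer is done with `copy.deepcopy` and the loader yields ±1 float labels, while the optimizer, epoch count, and learning rate follow the configuration reported in Section 5.

```python
import copy
import torch
import torch.nn.functional as F

def fine_tune(pretext_model, labeled_loader, epochs=100, lr=1e-3):
    """Initialize the downstream network with the pretext weights
    (W_d = [Theta*, w*, w_0*]) and fine-tune with the loss of Eq (2)."""
    model = copy.deepcopy(pretext_model)  # copy all pretext parameters
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in labeled_loader:       # y: float tensor of +/-1 labels
            optimizer.zero_grad()
            logits = model(x).squeeze(1)
            F.softplus(-y * logits).sum().backward()
            optimizer.step()
    return model
```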
This section describes the data sets used. Data set 1 of BCI competition IV contains 7 subjects, each of whom chose two classes from left hand, right hand, and foot for motor imagery. In Data set 1, each MI-EEG trial contains 8 s of data: a cross mark is displayed on the screen from 0 to 2 s; from 2 to 6 s, an arrow indicator superimposed on the cross mark appears in the center of the screen, and the subject performs motor imagery according to the direction of the arrow; the screen is then blank from 6 to 8 s, indicating the end of the trial. The data are collected from 59 EEG channels at a sampling frequency of 100 Hz, so one MI-EEG trial has 800 sampling points.
Data set IVa of BCI competition III contains 5 subjects; only cues for right hand and foot are provided in the competition. In Data set IVa, visual cues are displayed for 3.5 s on the screen, and the presentations of target cues are separated by rest periods of random length between 1.75 and 2.25 s. The data are collected from 118 EEG channels at a sampling frequency of 100 Hz. In this data set, the sample numbers of subjects aa, al, av, aw, and ay are 168, 224, 84, 56, and 28, respectively. Since the sample numbers of subjects av, aw, and ay are very small, only subjects aa and al are used in the experiments. The experimental stimulus paradigms for Data set 1 of BCI competition IV and Data set IVa of BCI competition III are shown in Figure 4.
Data set 2b of BCI competition IV contains 9 subjects. The cue-based screening paradigm consists of two MI-EEG classes, namely left hand and right hand. In total, 120 repetitions of each MI-EEG class are available per subject. More details of the data set can be found at https://www.bbci.de/competition/iv.
Event-Related Desynchronization (ERD) and Event-Related Synchronization (ERS) are most evident in EEG channels C3, C4, and Cz [30]; thus, these three channels are selected in the experiments. The electrode positions [31] are illustrated in Figure 5.
Since MI-EEG recordings can be contaminated by electrooculogram (EOG) artifacts and noise, the MI-EEGs are first preprocessed. Generally speaking, the 8–30 Hz frequency band reflects the characteristics of MI-EEGs [32]. Therefore, a third-order Butterworth band-pass filter at 8–30 Hz is applied to eliminate the effects of baseline drift and noise, and Independent Component Analysis (ICA) is then used to reduce the influence of EOG.
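A short SciPy sketch of the band-pass step follows; the zero-phase `filtfilt` choice is an assumption, and the ICA-based EOG removal (for which a library such as MNE-Python could be used) is omitted.

```python
from scipy.signal import butter, filtfilt

def bandpass_8_30(x, fs=100.0, order=3):
    """Third-order Butterworth band-pass at 8-30 Hz (Section 4).
    x has shape (..., T) sampled at fs Hz; filtering is applied
    along the time axis."""
    b, a = butter(order, [8.0, 30.0], btype="bandpass", fs=fs)
    return filtfilt(b, a, x, axis=-1)
```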
All networks are implemented in PyTorch and run on an NVIDIA GTX 3090 GPU and Intel(R) Core(TM) i9-10940X CPUs. The convolutional layers are initialized with the kaiming_normal function, and the fully connected layer is initialized from a normal distribution ($N(0, 0.1)$). Adam is used for optimization. In the pretext task, the number of epochs, batch size, and learning rate are set to 200, 8, and 0.0001, respectively; there is no validation or test set, and the loss value is used to measure the performance of the trained pretext model. In the downstream task, the number of epochs, batch size, and learning rate are set to 100, 50, and 0.001, respectively. The data of each subject are randomly split 3:1 into training and test sets, and this process is repeated 10 times per subject. No validation set is used because the sample number per subject is small.
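The weight initialization described above can be sketched as follows; whether 0.1 in $N(0, 0.1)$ denotes the standard deviation or the variance is not stated, so the standard deviation is assumed here.

```python
import torch.nn as nn

def init_weights(module):
    """kaiming_normal for convolutions, N(0, 0.1) for the FC layer
    (std = 0.1 assumed)."""
    if isinstance(module, nn.Conv2d):
        nn.init.kaiming_normal_(module.weight)
    elif isinstance(module, nn.Linear):
        nn.init.normal_(module.weight, mean=0.0, std=0.1)

# model.apply(init_weights)  # applied once before pretext training
```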
Accuracy (ACC) and Area Under the Curve (AUC) are used to measure the performance of the networks. ACC reflects the classification ability of a network, and AUC is an aggregate measure of performance across all possible classification thresholds. In addition, 95% Confidence Intervals (CI) of the accuracy are reported in the ablation study.
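A small evaluation sketch follows; thresholding the ±1 network outputs at zero is an assumption, and the paper does not state its exact CI construction, so a normal-approximation interval over the 10 repeated splits is shown as one option.

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

def evaluate(y_true, scores):
    """ACC (%) and AUC for one train/test split; y_true in {-1, +1}."""
    acc = 100.0 * accuracy_score(y_true, np.sign(scores))
    auc = roc_auc_score(y_true, scores)
    return acc, auc

def ci95(accs):
    """Normal-approximation 95% CI over repeated splits (an assumption)."""
    accs = np.asarray(accs)
    half = 1.96 * accs.std(ddof=1) / np.sqrt(len(accs))
    return accs.mean() - half, accs.mean() + half
```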
In this section, to determine a suitable network for the MI-EEG task, six popular networks are evaluated on all subjects: ResNet [9], MobileNet [28], EfficientNet [10], DeepConvNet, ShallowConvNet [13] and EEGNet [19]. The experimental results are shown in Table 3.
Table 3. ACC (%) and AUC of the six networks on all subjects.

| Data set | Subject | ResNet ACC | ResNet AUC | MobileNet ACC | MobileNet AUC | EfficientNet ACC | EfficientNet AUC | DeepConvNet ACC | DeepConvNet AUC | ShallowConvNet ACC | ShallowConvNet AUC | EEGNet ACC | EEGNet AUC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Data set 1 (BCIC IV) | a | 52.80 | 0.550 | 62.40 | 0.651 | 62.25 | 0.663 | 70.40 | 0.819 | 68.80 | 0.764 | 75.60 | 0.817 |
| Data set 1 (BCIC IV) | b | 55.20 | 0.574 | 50.40 | 0.521 | 56.00 | 0.583 | 67.00 | 0.763 | 73.20 | 0.820 | 71.60 | 0.786 |
| Data set 1 (BCIC IV) | c | 53.60 | 0.574 | 52.80 | 0.540 | 60.80 | 0.622 | 51.40 | 0.647 | 64.40 | 0.732 | 70.60 | 0.765 |
| Data set 1 (BCIC IV) | d | 63.60 | 0.724 | 64.80 | 0.728 | 78.00 | 0.844 | 63.40 | 0.870 | 88.80 | 0.945 | 91.80 | 0.977 |
| Data set 1 (BCIC IV) | e | 67.34 | 0.628 | 53.06 | 0.520 | 79.59 | 0.884 | 92.20 | 0.994 | 95.40 | 0.995 | 96.40 | 0.993 |
| Data set 1 (BCIC IV) | f | 66.50 | 0.699 | 58.50 | 0.663 | 64.40 | 0.714 | 63.00 | 0.786 | 72.60 | 0.793 | 80.20 | 0.874 |
| Data set 1 (BCIC IV) | g | 69.50 | 0.733 | 64.80 | 0.736 | 74.00 | 0.826 | 78.80 | 0.904 | 82.60 | 0.896 | 85.20 | 0.910 |
| Data set IVa (BCIC III) | aa | 73.80 | 0.793 | 60.50 | 0.707 | 64.30 | 0.661 | 58.10 | 0.785 | 78.80 | 0.877 | 80.00 | 0.893 |
| Data set IVa (BCIC III) | al | 91.70 | 0.966 | 82.10 | 0.944 | 83.90 | 0.952 | 75.70 | 0.964 | 95.40 | 0.986 | 95.70 | 0.987 |
| Data set 2b (BCIC IV) | 1 | 50.30 | 0.510 | 50.00 | 0.508 | 50.80 | 0.507 | 50.30 | 0.502 | 51.80 | 0.528 | 71.20 | 0.748 |
| Data set 2b (BCIC IV) | 2 | 69.50 | 0.775 | 60.50 | 0.664 | 70.30 | 0.802 | 74.30 | 0.872 | 76.70 | 0.860 | 77.30 | 0.871 |
| Data set 2b (BCIC IV) | 3 | 58.00 | 0.613 | 55.70 | 0.580 | 59.30 | 0.629 | 57.50 | 0.601 | 61.80 | 0.649 | 63.00 | 0.682 |
| Data set 2b (BCIC IV) | 4 | 59.80 | 0.636 | 55.20 | 0.567 | 69.20 | 0.759 | 70.30 | 0.793 | 70.80 | 0.783 | 84.30 | 0.901 |
| Data set 2b (BCIC IV) | 5 | 95.20 | 0.993 | 75.50 | 0.837 | 96.20 | 0.994 | 93.70 | 0.957 | 99.10 | 0.999 | 98.90 | 0.999 |
| Data set 2b (BCIC IV) | 6 | 59.30 | 0.645 | 54.70 | 0.586 | 58.70 | 0.630 | 61.20 | 0.653 | 58.80 | 0.636 | 63.70 | 0.657 |
| Data set 2b (BCIC IV) | 7 | 80.00 | 0.881 | 62.00 | 0.667 | 80.20 | 0.872 | 85.00 | 0.920 | 86.00 | 0.912 | 88.20 | 0.924 |
| Data set 2b (BCIC IV) | 8 | 62.10 | 0.672 | 53.90 | 0.529 | 68.40 | 0.754 | 62.00 | 0.701 | 65.00 | 0.715 | 64.90 | 0.690 |
| Data set 2b (BCIC IV) | 9 | 63.70 | 0.705 | 51.50 | 0.549 | 66.30 | 0.701 | 70.00 | 0.762 | 76.00 | 0.824 | 73.80 | 0.793 |
Table 3 shows that EEGNet performs better than the other networks on most subjects, and ShallowConvNet yields performance comparable to EEGNet on most subjects. This indicates that EEGNet and ShallowConvNet learn better MI-EEG representations than the other four networks; therefore, they are chosen as the base classifiers.
An ablation experiment is conducted to verify the effectiveness of the proposed pretext network under a small number of labeled training samples. The base classifiers and one third of the training set are used. The MI-EEG classification method without SSL, which initializes the network weights randomly, is denoted as Base. The proposed SSL method, which uses the pretext task, is denoted as TRMINet; its pretext network is used to initialize the weights of the downstream classification network. The effectiveness of the pretext network is evaluated by the performance of the downstream classification network. To visualize the accuracies of the Base and TRMINet methods, bar charts are plotted in Figures 6 and 7.
As shown in Figures 6 and 7, the accuracies of the TRMINet method are higher than those of the Base method on most subjects, showing that the pretext task learns a good representation that benefits network training under a small number of labeled training samples.
AUC is used to further compare the TRMINet and Base methods; the results are shown in Table 4. The AUC of the TRMINet method is comparable to, if not higher than, that of the Base method on most subjects. This shows that SSL can improve the classification performance of the downstream network under a small number of labeled training samples, and that the proposed pretext task can effectively learn MI-EEG representations.
Table 4. ACC (%), 95% CI of ACC, and AUC of the Base and TRMINet methods with one third of the labeled training samples.

| Data set | Subject | Base (EEGNet) ACC | CI | AUC | TRMINet (EEGNet) ACC | CI | AUC | Base (ShallowConvNet) ACC | CI | AUC | TRMINet (ShallowConvNet) ACC | CI | AUC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Data set 1 (BCIC IV) | a | 53.20 | [46.10, 60.30] | 0.725 | 66.40 | [60.00, 72.80] | 0.741 | 64.20 | [57.30, 71.10] | 0.708 | 65.80 | [54.90, 76.70] | 0.739 |
| Data set 1 (BCIC IV) | b | 56.90 | [52.60, 61.20] | 0.649 | 60.40 | [54.50, 66.50] | 0.675 | 63.60 | [50.90, 76.30] | 0.712 | 65.00 | [52.00, 78.00] | 0.740 |
| Data set 1 (BCIC IV) | c | 49.20 | [43.80, 54.50] | 0.561 | 55.20 | [50.30, 60.10] | 0.605 | 60.20 | [48.60, 71.80] | 0.651 | 61.00 | [50.30, 71.70] | 0.703 |
| Data set 1 (BCIC IV) | d | 62.20 | [54.70, 69.70] | 0.740 | 75.80 | [64.30, 87.20] | 0.826 | 86.20 | [77.00, 95.40] | 0.938 | 84.40 | [77.20, 91.60] | 0.901 |
| Data set 1 (BCIC IV) | e | 86.70 | [78.90, 94.40] | 0.946 | 91.20 | [84.30, 98.10] | 0.923 | 94.90 | [87.90, 102.0] | 0.994 | 94.80 | [87.70, 102.0] | 0.991 |
| Data set 1 (BCIC IV) | f | 53.00 | [45.40, 60.50] | 0.648 | 68.40 | [61.30, 75.40] | 0.752 | 65.40 | [55.20, 75.60] | 0.732 | 66.00 | [58.20, 73.80] | 0.754 |
| Data set 1 (BCIC IV) | g | 65.20 | [57.10, 73.30] | 0.814 | 75.80 | [67.10, 84.50] | 0.856 | 71.20 | [55.90, 86.50] | 0.820 | 75.00 | [59.60, 90.40] | 0.821 |
| Data set IVa (BCIC III) | aa | 61.40 | [55.20, 67.60] | 0.675 | 64.30 | [60.50, 68.10] | 0.755 | 57.90 | [46.20, 69.50] | 0.732 | 68.50 | [60.10, 76.90] | 0.785 |
| Data set IVa (BCIC III) | al | 62.00 | [59.20, 64.70] | 0.728 | 71.10 | [65.70, 76.40] | 0.837 | 91.10 | [84.10, 98.10] | 0.975 | 91.30 | [87.90, 94.60] | 0.964 |
| Data set 2b (BCIC IV) | 1 | 54.20 | [46.50, 61.80] | 0.525 | 57.00 | [47.40, 66.60] | 0.630 | 53.30 | [48.70, 58.00] | 0.555 | 54.80 | [49.10, 60.60] | 0.560 |
| Data set 2b (BCIC IV) | 2 | 61.30 | [55.70, 66.80] | 0.663 | 66.10 | [58.40, 73.70] | 0.695 | 71.70 | [63.10, 80.20] | 0.796 | 72.20 | [61.60, 82.70] | 0.813 |
| Data set 2b (BCIC IV) | 3 | 57.50 | [50.90, 64.10] | 0.629 | 58.70 | [49.00, 68.30] | 0.634 | 55.00 | [43.90, 66.10] | 0.572 | 55.70 | [47.40, 64.00] | 0.599 |
| Data set 2b (BCIC IV) | 4 | 62.30 | [55.30, 69.30] | 0.689 | 66.30 | [59.60, 73.00] | 0.739 | 58.90 | [52.50, 65.30] | 0.636 | 64.60 | [57.80, 71.50] | 0.688 |
| Data set 2b (BCIC IV) | 5 | 92.00 | [83.00, 101.0] | 0.980 | 90.80 | [84.20, 97.40] | 0.977 | 93.80 | [90.00, 97.70] | 0.993 | 87.20 | [80.50, 93.80] | 0.965 |
| Data set 2b (BCIC IV) | 6 | 55.60 | [44.60, 66.60] | 0.582 | 57.90 | [47.80, 68.00] | 0.605 | 54.00 | [45.60, 62.40] | 0.542 | 56.20 | [46.80, 65.50] | 0.550 |
| Data set 2b (BCIC IV) | 7 | 63.20 | [52.00, 74.30] | 0.676 | 68.30 | [60.70, 75.90] | 0.771 | 76.20 | [66.40, 86.00] | 0.845 | 82.20 | [76.50, 87.80] | 0.898 |
| Data set 2b (BCIC IV) | 8 | 57.30 | [44.50, 70.00] | 0.583 | 55.90 | [45.20, 66.60] | 0.563 | 61.30 | [52.00, 70.60] | 0.647 | 62.30 | [51.60, 73.00] | 0.680 |
| Data set 2b (BCIC IV) | 9 | 55.60 | [44.00, 67.20] | 0.600 | 61.80 | [53.50, 70.10] | 0.661 | 71.80 | [65.90, 77.80] | 0.788 | 74.70 | [69.60, 79.70] | 0.810 |
To further verify that SSL can effectively improve classification performance, experiments are conducted under different percentages of the labeled training samples. The representation learning ability of the pretext task is evaluated by the classification results of the downstream task.
Figures 8 and 9 show the overall accuracy trends of the nine subjects as the percentage of labeled training samples increases. On most subjects, the TRMINet method achieves higher accuracies than the Base method, and its accuracy increases more markedly as the number of labeled training samples grows. This shows that the proposed pretext task can effectively learn MI-EEG representations, and that SSL can effectively improve classification performance under a small number of labeled training samples.
In this study, we propose an SSL method, called Temporal Rearrange, which exploits the temporal characteristics of MI-EEGs to learn a pretext network by discriminating original from rearranged MI-EEGs. The network learned in the pretext task is used to initialize the downstream network. The experimental results show that the proposed method improves classification performance under a small number of labeled training samples, indicating that it can alleviate the performance degradation caused by insufficient labels. It is therefore possible to improve the classification accuracy of MI-based BCI systems while reducing the burden of labeling MI-EEGs. In the future, transfer learning and subject-wise strategies will be employed to further improve classification performance across multiple subjects.
The work was supported by the High-level Talents Fund of Hubei University of Technology under grant No. GCRC2020016, Key Laboratory of Brain Machine Collaborative Intelligence of Zhejiang Province under grant No. 2020E10010-02, Natural Science Foundation of Hubei Province under grant No. 2021CFB282, and Open Funding Project of the State Key Laboratory of Biocatalysis and Enzyme Engineering No. SKLBEE2021020 and SKLBEE2020020.
The authors declare there is no conflict of interest.
[1] P. Sandheep, S. Vineeth, M. Poulose, D. P. Subha, Performance analysis of deep learning CNN in classification of depression EEG signals, in TENCON 2019 - 2019 IEEE Region 10 Conference (TENCON), IEEE, (2019), 1339–1344. https://doi.org/10.1109/TENCON.2019.8929254
[2] S. M. Usman, S. Khalid, R. Akhtar, Z. Bortolotto, Z. Bashir, H. Qiu, Using scalp EEG and intracranial EEG signals for predicting epileptic seizures: Review of available methodologies, Seizure, 71 (2019), 258–269. https://doi.org/10.1016/j.seizure.2019.08.006
[3] F. Lotte, L. Bougrain, A. Cichocki, M. Clerc, M. Congedo, A. Rakotomamonjy, et al., A review of classification algorithms for EEG-based brain-computer interfaces: a 10 year update, J. Neural Eng., 15 (2018), 031005. https://doi.org/10.1088/1741-2552/aab2f2
[4] B. J. Edelman, J. Meng, D. Suma, C. A. Zurn, E. Nagarajan, B. Baxter, et al., Noninvasive neuroimaging enhances continuous neural tracking for robotic device control, Sci. Rob., 4 (2019). https://doi.org/10.1126/scirobotics.aaw6844
[5] Q. He, S. Du, Y. Zhang, G. Jiang, P. Xie, Classification of motor imagery based on single-channel frame and multi-channel frame, Yi Qi Yi Biao Xue Bao/Chin. J. Sci. Instrum., 39 (2018), 20–29. https://doi.org/10.19650/j.cnki.cjsi.J1803816
[6] J. Müller-Gerking, G. Pfurtscheller, H. Flyvbjerg, Designing optimal spatial filters for single-trial EEG classification in a movement task, Clin. Neurophysiol., 110 (1999), 787–798. https://doi.org/10.1016/S1388-2457(98)00038-8
[7] D. Huang, P. Lin, D. Fei, X. Chen, O. Bai, Decoding human motor activity from EEG single trials for a discrete two-dimensional cursor control, J. Neural Eng., 6 (2009). https://doi.org/10.1088/1741-2560/6/4/046005
[8] R. Chatterjee, T. Bandyopadhyay, EEG based motor imagery classification using SVM and MLP, in 2016 2nd International Conference on Computational Intelligence and Networks (CINE), (2016), 84–89. https://doi.org/10.1109/CINE.2016.22
[9] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, (2016), 770–778. https://doi.org/10.1109/CVPR.2016.90
[10] M. Tan, Q. V. Le, EfficientNet: Rethinking model scaling for convolutional neural networks, in Proceedings of the 36th International Conference on Machine Learning (ICML) (eds. K. Chaudhuri and R. Salakhutdinov), 97 (2019), 6105–6114. Available from: http://proceedings.mlr.press/v97/tan19a/tan19a.pdf.
[11] X. Liu, Y. Shen, J. Liu, J. Yang, P. Xiong, F. Lin, Parallel spatial-temporal self-attention CNN-based motor imagery classification for BCI, Front. Neurosci., 14 (2020). https://doi.org/10.3389/fnins.2020.587520
[12] P. Autthasan, R. Chaisaen, T. Sudhawiyangkul, S. Kiatthaveephong, P. Rangpong, N. Dilokthanakul, et al., MIN2Net: End-to-end multi-task learning for subject-independent motor imagery EEG classification, IEEE Trans. Biomed. Eng., 2021. https://doi.org/10.1109/TBME.2021.3137184
[13] R. T. Schirrmeister, J. T. Springenberg, L. D. J. Fiederer, M. Glasstetter, K. Eggensperger, M. Tangermann, et al., Deep learning with convolutional neural networks for EEG decoding and visualization, Hum. Brain Mapp., 38 (2017), 5391–5420. https://doi.org/10.1002/hbm.23730
[14] S. Ioffe, C. Szegedy, Batch normalization: Accelerating deep network training by reducing internal covariate shift, in Proceedings of the 32nd International Conference on Machine Learning (ICML) (eds. F. R. Bach and D. M. Blei), 37 (2015), 448–456. https://doi.org/10.48550/arXiv.1502.03167
[15] D. Clevert, T. Unterthiner, S. Hochreiter, Fast and accurate deep network learning by exponential linear units (ELUs), preprint, arXiv:1511.07289.
[16] H. Yang, S. Sakhavi, K. K. Ang, C. Guan, On the use of convolutional neural networks and augmented CSP features for multi-class motor imagery of EEG signals classification, in 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), IEEE, (2015), 2620–2623. https://doi.org/10.1109/EMBC.2015.7318929
[17] X. An, D. Kuang, X. Guo, Y. Zhao, L. He, A deep learning method for classification of EEG data based on motor imagery, in Intelligent Computing in Bioinformatics (eds. D. S. Huang, K. Han, M. Gromiha), ICIC 2014, Lecture Notes in Computer Science, Springer, 8590 (2014), 203–210. https://doi.org/10.1007/978-3-319-09330-7_25
[18] M. Li, J. Han, J. Yang, Automatic feature extraction and fusion recognition of motor imagery EEG using multilevel multiscale CNN, Med. Biol. Eng. Comput., 59 (2021), 2037–2050. https://doi.org/10.1007/s11517-021-02396-w
[19] V. J. Lawhern, A. J. Solon, N. R. Waytowich, S. M. Gordon, C. P. Hung, B. J. Lance, EEGNet: A compact convolutional neural network for EEG-based brain-computer interfaces, J. Neural Eng., 15 (2018), 056013. https://doi.org/10.1088/1741-2552/aace8c
[20] R. K. Malhotra, A. Y. Avidan, Sleep stages and scoring technique, in Atlas of Sleep Medicine, (2013), 77–99. https://doi.org/10.1016/B978-1-4557-1267-0.00003-5
[21] D. Hendrycks, M. Mazeika, S. Kadavath, D. Song, Using self-supervised learning can improve model robustness and uncertainty, in 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada, (2019), 15637–15648. Available from: https://papers.nips.cc/paper/2019/file/a2b15837edac15df90721968986f7f8e-Paper.pdf.
[22] M. Noroozi, P. Favaro, Unsupervised learning of visual representations by solving jigsaw puzzles, in Computer Vision - ECCV 2016 - 14th European Conference, Lecture Notes in Computer Science, Springer, (2016), 69–84. https://doi.org/10.1007/978-3-319-46466-4_5
[23] Y. Li, J. Zeng, S. Shan, X. Chen, Self-supervised representation learning from videos for facial action unit detection, in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, (2019), 10924–10933. https://doi.org/10.1109/CVPR.2019.01118
[24] H. J. Banville, G. Moffat, I. Albuquerque, D. Engemann, A. Hyvärinen, A. Gramfort, Self-supervised representation learning from electroencephalography signals, in 2019 IEEE 29th International Workshop on Machine Learning for Signal Processing (MLSP), IEEE, (2019), 1–6. https://doi.org/10.1109/MLSP.2019.8918693
[25] A. Krizhevsky, I. Sutskever, G. E. Hinton, ImageNet classification with deep convolutional neural networks, in Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012 (eds. P. L. Bartlett, F. C. N. Pereira, C. J. C. Burges, L. Bottou, K. Q. Weinberger), (2012), 1106–1114. https://doi.org/10.1145/3065386
[26] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. E. Reed, D. Anguelov, et al., Going deeper with convolutions, in 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, (2015), 1–9. https://doi.org/10.1109/CVPR.2015.7298594
[27] V. Nair, G. E. Hinton, Rectified linear units improve restricted Boltzmann machines, in Proceedings of the 27th International Conference on Machine Learning (eds. J. Fürnkranz and T. Joachims), (2010), 807–814. Available from: https://icml.cc/Conferences/2010/papers/432.pdf.
[28] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, et al., MobileNets: Efficient convolutional neural networks for mobile vision applications, preprint, arXiv:1704.04861.
[29] J. Hu, L. Shen, G. Sun, Squeeze-and-excitation networks, in 2018 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2018), 7132–7141. Available from: https://openaccess.thecvf.com/content_cvpr_2018/papers/Hu_Squeeze-and-Excitation_Networks_CVPR_2018_paper.pdf.
[30] G. Pfurtscheller, F. H. Lopes da Silva, Event-related EEG/MEG synchronization and desynchronization: basic principles, Clin. Neurophysiol., 110 (1999), 1842–1857. https://doi.org/10.1016/s1388-2457(99)00141-8
[31] E. Dong, G. Zhu, C. Chen, Classification of four categories of EEG signals based on relevance vector machine, in 2017 IEEE International Conference on Mechatronics and Automation (ICMA), (2017), 1024–1029. https://doi.org/10.1109/ICMA.2017.8015957
[32] Y. Meirovitch, H. Harris, E. Dayan, A. Arieli, T. Flash, Alpha and beta band event-related desynchronization reflects kinematic regularities, J. Neurosci., 35 (2015), 1627–1637. https://doi.org/10.1523/jneurosci.5371-13.2015