Research article

3D human pose detection using nano sensor and multi-agent deep reinforcement learning

  • Received: 29 October 2022 Revised: 16 December 2022 Accepted: 19 December 2022 Published: 05 January 2023
  • Due to the complexity of the three-dimensional (3D) human pose, it is difficult for ordinary sensors to capture subtle changes in pose, which reduces the accuracy of 3D human pose detection. A novel 3D human motion pose detection method is designed by combining Nano sensors and multi-agent deep reinforcement learning. First, Nano sensors are placed at key parts of the human body to collect human electromyogram (EMG) signals. Second, after the EMG signals are de-noised by blind source separation, the time-domain and frequency-domain features of the surface EMG signals are extracted. Finally, in the multi-agent environment, a deep reinforcement learning network is introduced to build the multi-agent deep reinforcement learning pose detection model, which outputs the local 3D pose of the human body from the features of the EMG signals. The multi-sensor pose detection results are then fused and the pose solution is computed to obtain the 3D human pose detection results. The results show that the proposed method detects various human poses with high accuracy; the accuracy, precision, recall and specificity of the 3D human pose detection results are 0.97, 0.98, 0.95 and 0.98, respectively. Compared with other methods, the detection results of this paper are more accurate, and the method can be widely used in medicine, film, sports and other fields.

    Citation: Yangjie Sun, Xiaoxi Che, Nan Zhang. 3D human pose detection using nano sensor and multi-agent deep reinforcement learning[J]. Mathematical Biosciences and Engineering, 2023, 20(3): 4970-4987. doi: 10.3934/mbe.2023230




    With the in-depth research in the field of computer vision, human pose detection has received more and more attention, especially in medical, film, sports and other fields [1,2]. Three-dimensional (3D) human pose detection refers to solving the human motion pose with the aid of intelligent technology and generating motion pose models in 3D space to enhance people's understanding of motion behavior [3,4]. Because the 3D human body has a complex structure and diverse poses, data acquisition is difficult; at the same time, the wide application and high-precision data demands in fields such as video surveillance, behavior recognition and human-computer interaction make 3D human pose estimation a major hotspot and difficult problem in computer vision. Therefore, it is important to study 3D human pose detection methods. In 3D human pose detection research, acquiring human EMG signals with sensors is the basic step, and with the continuous development and advancement of sensor technology, Nano sensors have emerged. Compared with traditional sensors, Nano sensors are smaller and more accurate, and are better suited to human EMG signal acquisition. Among the many technical approaches to 3D human pose detection, multi-agent deep reinforcement learning is currently a popular learning technique. It uses deep learning to perceive environmental features and reinforcement learning to find the optimal strategy in an environment shared by multiple agents. By integrating the advantages of deep learning and reinforcement learning, it solves complex tasks through the cooperation of multiple agents, and it is widely used in many fields, including human pose research.

    At present, there are many human pose detection methods; however, they show many shortcomings in practical applications. For example, X. Song and L. Fan [5] proposed a human pose detection method based on a 3D multi-view basketball movement data set. The convolutional neural network framework used is VGG11. After the RGB basketball motion image passes through a semantic segmentation network, an image containing the target object is obtained and input into the constructed feature fusion network model. After feature extraction from the RGB image and the depth image, respectively, the RGB feature and the local and global features of the point cloud are spliced and fused into a feature vector, from which the human pose is detected. In practical applications, however, this method has low accuracy. W. Ren et al. [6] proposed a human pose detection method based on fuzzy logic and machine learning: human pose images are acquired and classified with a hybrid of fuzzy logic and machine learning methods. The pose detector is trained on relatively small data sets, all data are input to the pose detector, and the detection results of the human pose are obtained. This method suffers from a low recognition rate of the human pose and falls short of the ideal application effect. W. Ding et al. [7] proposed a human pose detection method based on multiple features and rule learning. First, a 219-dimensional vector containing angle features and distance features is defined. Then, in the human pose classification process, the rule learning method is combined with bagging and random subspace methods to create different samples and features, so as to improve the classification performance of the sub-classifiers for different samples. Finally, according to the results of feature extraction and classification, the human pose is detected to obtain more accurate detection results. However, the accuracy of this method is low and the practical application effect is poor. J. Wang and X. H. Liu [8] proposed a human pose detection method based on depth sensors. The Kinect depth sensor is used to collect human skeleton information, the direction cosine method is used for feature extraction, and the feature vectors are sent to a BP neural network for training and recognition to obtain the human pose detection results. However, this method has a low recall rate, which falls short of the ideal application effect. D. He and L. Li [9] proposed a human pose detection method based on Kinect. First, Kinect is used to obtain the spatial coordinates of the human joints; then, the joint angles are calculated through the two-point method and a pose library is defined; finally, the pose detection results are obtained by angle matching against the pose library. However, the specificity of this method is low and the actual application effect is not good.

    In practical application, the traditional 3D human pose detection methods therefore show low accuracy, precision, recall and specificity, and the actual application effect is not good. To solve the problems of the above methods, a new 3D human pose detection method based on Nano sensors and multi-agent deep reinforcement learning is proposed. Unlike traditional methods, this paper uses an advanced Nano sensor to acquire human EMG signals and introduces a deep reinforcement learning network to build a multi-agent deep reinforcement learning detection model for detection and analysis. With the help of advanced instrumentation and popular technical means, more accurate 3D human pose detection results are expected. The contributions of this paper are as follows:

    (1) The Nano sensor is used to collect the human EMG signal; with the advantages of higher accuracy, easier use and longer lifetime of Nano sensors, the accuracy and efficiency of EMG signal acquisition are improved, providing an important data guarantee for subsequent 3D human pose detection. (2) A deep reinforcement learning network is introduced to build the multi-agent deep reinforcement learning pose detection model, which combines the advantages of deep learning and reinforcement learning while enhancing the interaction between the agents and the environment, and helps to improve the pose detection accuracy and efficiency. (3) Based on the feature extraction and de-noising of the EMG signal, the multi-agent deep reinforcement learning structure extracts features again to further increase the feature extraction accuracy.

    The Nano gold flexible sensor [10], built from Nano gold flexible material on a PDMS substrate, is used as the basis for EMG signal acquisition. Compared with conventional sensors, Nano sensors have the advantages of higher accuracy, easier use and longer lifetime. The composition of the Nano gold flexible sensor is shown in Figure 1.

    Figure 1.  Composition of Nano gold flexible sensor.

    As the structure in Figure 1 shows, the Nano gold flexible sensor is mainly composed of sensitive elements, conversion elements, a conversion circuit, an auxiliary power supply, a display, a recorder and data processing instruments. The sensitive elements are arranged at key parts of the human body; they capture the relevant signals, which are then converted and transformed, stored in the recorder, and read out as EMG signals through the display.

    Different from conventional sensors, Nano sensors are mainly used to obtain EMG signals during human motion based on their resistance changes. The resistance change formula is:

    $d = \dfrac{d_1 \times \bar{d} + 2 \times d_1 \times d_2 + d_2 \times \bar{d}}{d_1 + 2 \times \bar{d} + d_2}$ (1)

    where $d$ is the resistance change value, $d_1$ is the island resistance, $d_2$ is the gap resistance and $\bar{d}$ is the resistance of the suspension beam. In application, the muscle contraction intensity of different parts of the human body varies greatly, so the EMG signals differ across the body. To obtain the EMG signal data required for 3D human pose detection, the Nano sensors are pasted at appropriate positions according to the principle of EMG signal generation. The generation process of the EMG signal is shown in Figure 2.

    Figure 2.  EMG signal generation.

    According to Figure 2, all EMG signals originate from the human central nervous system. Under the influence of the delay function and the impulse function, a single EMG signal is formed, and then all the information collected by the Nano sensors is aggregated to generate the final physiological EMG signal. Completing each human pose requires the joint participation of multiple muscle groups. According to the sensing range of the Nano sensors, sensors are placed separately at key joints such as the elbows and knees to ensure that the collection range of the EMG signals covers all parts of the body.

    According to the collection results, the features of the surface EMG signals are extracted. The EMG signal itself is a pure bioelectrical signal; however, the EMG signal collected by the Nano sensor always contains some interference [11]. To reduce the impact of interference on the feature extraction of the surface EMG signal, the concept of blind source separation [12] is adopted in this paper to reconstruct the observed signal:

    $\begin{cases} \lambda = Q \times \theta \\ \bar{\theta} = Q\bar{\lambda} \end{cases}$ (2)

    where $\lambda$ is the muscle power (EMG) signal matrix, $Q$ is the reconstruction matrix, $\theta$ is the surface EMG signal, $\bar{\lambda}$ is the source signal matrix after removing the noise sources and $\bar{\theta}$ is the reconstructed surface EMG signal.

    The blind source separation technology is used to extract the EMG signal sources and noise sources, respectively, and the acquired EMG signal is de-noised through a Principal Component Analysis (PCA) reconstruction strategy [13]. During this operation, the signal-to-noise ratio (SNR) of a source signal can be expressed as

    $\eta = 10\lg\left(\dfrac{\varepsilon_1^2}{\varepsilon_2^2}\right)$ (3)

    where $\eta$ is the SNR, $\lg$ denotes the common logarithm, $\varepsilon_1$ is the plateau-period data and $\varepsilon_2$ is the rest-period data.

    PCA processing is used to compute the SNR of all surface EMG source signals and sort them in descending order. Source signals with low SNR are regarded as noise sources and removed from the data set to obtain the de-noised surface EMG signal, from which the time-domain and frequency-domain features are then extracted [14]. The time-domain features are mainly extracted by statistical analysis of the data. Four kinds of time-domain features are extracted in this paper: the mean absolute value, the slope change, the waveform length and the number of zero crossings.

    $M = \dfrac{1}{\psi}\sum\limits_{i=1}^{\psi}|x_i|$ (4)
    $C = \dfrac{1}{\psi}\sum\limits_{i=1}^{\psi}\rho_i,\quad \rho_i = \begin{cases}1, & (x_i - x_{i-1})(x_i - x_{i+1}) \geq \tau \\ 0, & \text{others}\end{cases}$ (5)
    $L = \dfrac{1}{\psi}\sum\limits_{i=1}^{\psi-1}|x_{i+1} - x_i|$ (6)
    $Z = \sum\limits_{i=1}^{\psi-1}\operatorname{sgn}(-x_{i+1}\times x_i)$ (7)

    where $M$ is the mean absolute value, $\psi$ is the number of sampling points in a segment of EMG signal, $i$ indexes the sampling points, $x_i$ is the original data at sampling point $i$, $C$ is the slope change, $\rho$ is the slope indicator, $\tau$ is the threshold, $L$ is the waveform length, $Z$ is the number of zero crossings and $\operatorname{sgn}$ denotes the step function.

    In addition to the time-domain features, analyzing the EMG signal from the frequency-domain perspective shows that the surface EMG signal contains a large amount of frequency-domain information. In this paper, the spectral analysis method [15] is used to extract the mean power frequency and the median frequency:

    $F = \dfrac{\int_0^{+\infty} f\,\varsigma(f)\,df}{\int_0^{+\infty} \varsigma(f)\,df}$ (8)
    $\int_0^{E} \varsigma(f)\,df = \int_{E}^{+\infty} \varsigma(f)\,df = \dfrac{1}{2}\int_0^{+\infty} \varsigma(f)\,df$ (9)

    where $F$ is the mean power frequency, $E$ is the median frequency, $f$ is the EMG signal frequency and $\varsigma$ is the power spectral density function.

    Through the above operations, the feature extraction of the surface EMG signal is completed. The extracted features serve as the key input to the subsequent 3D human pose detection and are fed into the multi-agent deep reinforcement learning detection model.
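    To make the de-noising criterion and the feature definitions in Eqs (3)–(9) concrete, the following Python sketch computes the SNR of a separated source and the time- and frequency-domain features of one EMG window. The threshold value, the sampling rate and the periodogram estimate of the power spectral density are illustrative assumptions rather than settings reported in the paper.

```python
import numpy as np

def snr_db(plateau, rest):
    """Eq (3): SNR of one separated source from plateau-period and rest-period data
    (using the signal energies of the two periods is an assumption)."""
    return 10.0 * np.log10(np.sum(plateau ** 2) / np.sum(rest ** 2))

def time_domain_features(x, tau=1e-4):
    """Eqs (4)-(7): mean absolute value, slope changes, waveform length and zero
    crossings of one de-noised surface EMG window x (tau is an assumed threshold)."""
    psi = len(x)
    mav = np.sum(np.abs(x)) / psi                                        # Eq (4)
    ssc = np.sum((x[1:-1] - x[:-2]) * (x[1:-1] - x[2:]) >= tau) / psi    # Eq (5)
    wl = np.sum(np.abs(np.diff(x))) / psi                                # Eq (6)
    zc = int(np.sum(x[1:] * x[:-1] < 0))                                 # Eq (7)
    return mav, ssc, wl, zc

def frequency_domain_features(x, fs=1000.0):
    """Eqs (8)-(9): mean power frequency and median frequency from a periodogram
    estimate of the power spectral density (fs is an assumed sampling rate)."""
    psd = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    mnf = np.sum(freqs * psd) / np.sum(psd)                              # Eq (8)
    cumulative = np.cumsum(psd)
    mdf = freqs[np.searchsorted(cumulative, 0.5 * cumulative[-1])]       # Eq (9)
    return mnf, mdf
```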

    After the EMG signal is de-noised by blind source separation, the time-domain and frequency-domain features of the surface EMG signal are extracted, and a multi-agent deep reinforcement learning detection model is constructed on top of these features. The model combines the advantages of deep learning and reinforcement learning and, because it operates in a multi-agent environment, strengthens the interaction between the agents and the environment, which helps to ensure the accuracy and efficiency of motion pose detection.

    In the multi-agent environment, deep reinforcement learning is introduced to build the multi-agent deep reinforcement learning detection model, which further analyzes the feature information and outputs the local 3D human pose detection results. Figure 3 shows the centralized multi-agent deep reinforcement learning structure on which the detection model is built.

    Figure 3.  Centralized multi-agent deep reinforcement learning structure.

    As Figure 3 shows, the EMG feature signal is input into the centralized multi-agent deep reinforcement learning network to obtain data features, and the human pose is determined through the joint behavior and the joint reward. To facilitate the analysis, the multi-agent reinforcement learning process shown in Figure 3 can be regarded as a Markov game. Under the action of the agents' state space, observation space and other joint spaces, the EMG feature data are trained iteratively. Each agent obtains a local pose observation from the EMG feature data and then passes the current observation to the next operation step through the transfer function [16]. When the detection model is learning, a loss arises during the agents' operation, and the following mathematical expression is obtained:

    $y^{j} = r_{t}^{j} + \varpi\, Q\left(O_{t+1}^{j};\, v^{j}\right)$ (10)

    where $j$ denotes an agent, $y^{j}$ is the target output of the agent, $\varpi$ is the discount factor, $t$ is the time step, $r$ is the reward function, $O$ is the local observation space, $Q$ is the value estimate and $v$ is the parameter vector updated through the loss function.
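    A minimal sketch of the per-agent target in Eq (10), assuming that the value estimate $Q(\cdot;\,v^{j})$ is a small feed-forward network over the local observation and that the greedy action value is used; the layer sizes, the discount of 0.9 and the PyTorch framing are illustrative assumptions, not details given in the paper.

```python
import torch
import torch.nn as nn

class LocalQNet(nn.Module):
    """Per-agent value estimate over the local EMG-feature observation
    (layer sizes and the number of pose actions are assumptions)."""
    def __init__(self, obs_dim=20, n_actions=5):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions))

    def forward(self, obs):
        return self.net(obs)

def td_target(reward, next_obs, target_net, discount=0.9):
    """Eq (10): y_t^j = r_t^j + discount * value of the next local observation O_{t+1}^j,
    taking the greedy value over actions (an assumed choice)."""
    with torch.no_grad():
        next_value = target_net(next_obs).max(dim=-1).values
    return reward + discount * next_value
```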

    When conducting deep reinforcement learning detection in a multi-agent environment, the agents' interaction trajectories and local states can be summarized into a local experience trajectory, which makes it convenient to extract each agent's intention and to guide subsequent agents in interacting with the environment [17]. In 3D human pose detection, the potential intention of an agent affects the subsequent detection results. The maximum-likelihood objective is adopted to complete the training of the encoder and decoder:

    $\upsilon = \sum\limits_{g=2}^{G}\sum\limits_{t=1}^{T} \log U_{\varphi,\gamma}\!\left(O_{g,t+1}^{j},\, r_{g,t}^{j} \,\middle|\, O_{g,t}^{j},\, a_{g,t}^{j},\, b_{g,t-1}^{j}\right)$ (11)

    where $\upsilon$ is the likelihood objective, $\varphi$ is the encoder, $\gamma$ is the decoder, $g$ indexes an interaction between an agent and the environment, $G$ is the number of interactions, $T$ is the execution cycle, $U$ is the transfer function, $\log$ is the logarithmic function, $a$ is the learning rate and $b$ is the interaction trajectory.
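    The maximum-likelihood training of the encoder and decoder in Eq (11) can be sketched as accumulating a log-likelihood over the recorded interaction episodes. The diagonal-Gaussian output head, the episode and step field names, and the PyTorch framing below are all assumptions made only to illustrate the objective.

```python
import torch

def gaussian_nll(mean, logvar, target):
    """Negative log-likelihood of one predicted (O_{t+1}, r_t) tuple under an
    assumed diagonal-Gaussian output head."""
    return 0.5 * (logvar + (target - mean) ** 2 / logvar.exp()).sum(dim=-1)

def intention_objective(encoder, decoder, episodes):
    """Eq (11): accumulate log U(O_{g,t+1}, r_{g,t} | O_{g,t}, a_{g,t}, b_{g,t-1})
    over interactions g = 2..G and steps t = 1..T (field names are assumptions;
    step fields are assumed 1-D tensors, reward a 1-element tensor)."""
    nll = torch.zeros(())
    for episode in episodes[1:]:                      # g runs from 2, as in Eq (11)
        for step in episode:
            context = encoder(step.obs, step.action, step.prev_trajectory)
            mean, logvar = decoder(context)
            target = torch.cat([step.next_obs, step.reward], dim=-1)
            nll = nll + gaussian_nll(mean, logvar, target)
    return -nll                                       # maximise the likelihood objective
```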

    After the encoder and decoder are updated, a generator with the re-projection network as the core [18] is designed and added to the back of the decoder to form a complete multi-agent deep reinforcement learning detection model. Using the optimized detection model, the 3D pose matching the two-dimensional observation results is obtained.

    Since each Nano sensor covers only part of the human body, the detection result obtained from the EMG signal of a single sensor is a local pose. To obtain the overall 3D human pose detection result, data fusion and pose solution are required. The resulting 3D human pose detection process is shown in Figure 4.

    Figure 4.  3D Human Pose Detection Process.

    As Figure 4 shows, Nano sensors are placed at key parts of the human body to collect EMG signals. After the EMG signals are de-noised by blind source separation, the time-domain and frequency-domain features of the surface EMG signals are extracted. In the multi-agent environment, the multi-agent deep reinforcement learning pose detection model is built and used to judge whether a human motion pose is detected. If so, the multi-sensor pose detection results are fused and the pose solution is computed to obtain the 3D human pose detection result; if not, the EMG signals are re-collected until a detection result is obtained.

    This paper combines Kalman filtering and complementary filtering to design a new data fusion method that fuses the 3D human pose information detected by multiple Nano sensors [19]. With the complementary filtering algorithm, the linear complementary fusion result can be written as

    $\bar{\phi} = e_1\phi_1 + e_2\phi_2 + \cdots + e_m\phi_m$ (12)

    where $\bar{\phi}$ is the pose fusion result, $e$ is the weight coefficient, $\phi$ is the pose result detected by a single Nano sensor and $m$ is the number of Nano sensors.

    Considering that there may be nonlinear state data in the 3D human pose detection results, PI control technology [20] is used to design a nonlinear complementary filter, giving a new pose fusion result

    $\bar{\phi} = \dfrac{1}{I}\left[e_1 + \left(K_1 + \dfrac{K_2}{I}\right)e_2 + \cdots + \left(K_1 + \dfrac{K_2}{I}\right)e_m\right]$ (13)

    where $K_1$ is the proportional (scale) parameter, $K_2$ is the integral parameter and $I$ is the frequency-domain complex variable of the Laplace transform.

    The Kalman filter is introduced to perform autoregressive analysis on the output of the nonlinear complementary filter. Recursive processing is carried out from the perspectives of the state vector and the observation vector [21], generating the state update equation and the state observation equation

    $\begin{cases} X_t = \xi_{t,t-1} X_{t-1} + \Gamma_{t-1}\Phi_{t-1} \\ Z_t = J_t X_t + V_t \end{cases}$ (14)

    where $X$ is the updated state vector, $Z$ is the observation vector, $\xi$ is the state transition matrix, $\Gamma$ is the noise input matrix, $\Phi$ is the process noise, $J$ is the observation matrix and $V$ is the observation noise.
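    As an illustration of the fusion chain, the sketch below applies the linear complementary fusion of Eq (12) and one Kalman predict/update cycle consistent with the state and observation equations of Eq (14); the nonlinear PI-controlled filter of Eq (13) is omitted, and the noise covariances and matrix shapes are assumptions.

```python
import numpy as np

def complementary_fusion(poses, weights):
    """Eq (12): linear complementary fusion of the pose estimates of m Nano sensors."""
    return sum(w * p for w, p in zip(weights, poses))

def kalman_step(x, P, z, xi, gamma, q_cov, J, r_cov):
    """Eq (14): one predict/update cycle on the fused pose state; the process- and
    observation-noise covariances q_cov and r_cov are assumed tuning parameters."""
    # Predict with the state-transition matrix xi and the noise-input matrix gamma
    x_pred = xi @ x
    P_pred = xi @ P @ xi.T + gamma @ q_cov @ gamma.T
    # Update with the observation matrix J
    S = J @ P_pred @ J.T + r_cov
    K = P_pred @ J.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (z - J @ x_pred)
    P_new = (np.eye(len(x)) - K @ J) @ P_pred
    return x_new, P_new
```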

    After the data fusion is completed through the above calculations, a new pattern classifier is built using support vector machine technology, and the fused data are input into the classifier to generate the 3D human pose detection results.
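    A minimal sketch of this final support-vector-machine classification stage, assuming a standard scikit-learn pipeline over the fused per-window features; the RBF kernel, the regularisation constant and the feature standardisation are assumptions, while the seven pose labels follow the experimental setup described later.

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Seven pose classes used in the experiments; kernel and regularisation are assumed choices.
POSE_LABELS = ["straight sliding", "inclined sliding", "lateral sliding",
               "plough sliding", "plough turning", "lying down", "standing up"]

def build_pose_classifier():
    """Support vector machine classifier over the fused per-window pose features."""
    return make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma="scale"))

# Typical use: clf = build_pose_classifier(); clf.fit(X_train, y_train); clf.predict(X_test)
```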

    After the design of the 3D human pose detection method based on Nano sensors and multi-agent deep reinforcement learning is completed, experimental tests are performed to verify its effectiveness.

    Two public data sets are selected to support the experiments and are used for model training and testing, respectively. Data set 1: the NinaPro data set (www.ninapro.hevs.cn) is selected for model training before the experiment. It consists of eight databases, each containing many surface EMG signals. The data in databases DB1–DB7 are original EMG signals; database DB8 is special in that motion direction, acceleration and other contents are added. Data set 2: the UCI repository (www.ics.uci.edu/~mlearn/MLRepository.html) is a database for machine learning that currently holds 559 data sets, a number that is still growing, and is a commonly used standard test resource. It includes the UCI Human Activity Recognition data set, which is based on smartphone sensor data collected for activity recognition, mainly from volunteers between the ages of 19 and 48.

    Considering that the detection method proposed in this paper relies on EMG signals, the NinaPro data set was selected for model training before the experiments to ensure that the multi-agent deep reinforcement learning detection model has good detection performance. To reduce the computation time, the more comprehensive DB8 EMG database in the NinaPro data set is used to train the detection model built above and obtain the optimal parameters, after which 3D human pose detection is carried out. To demonstrate the performance of the proposed method for 3D human pose detection, some experimental data were selected from the UCI repository to form data set 2, which contains alpine skiing descent poses and open human poses. A group of 20 professionals wearing Nano sensors demonstrated the different sliding and descending poses, and the data of the human body in the different movement poses were recorded.

    To obtain the EMG signals of all parts of the moving human body, placement points are set at the various joints, and Nano sensors are pasted at the corresponding parts, as shown in Figure 5.

    Figure 5.  Placement point of Nano sensor.

    A sliding window of fixed size is set to collect the sensing signals required for the experiments on data set 2. The five types of human sliding pose included in data set 2 are straight sliding, inclined sliding, lateral sliding, plough sliding and plough turning; the two types of open pose are lying down and standing up, as shown in Figure 6. A sketch of this windowing follows the figure.

    Figure 6.  Pose categories of the slide down action contained in dataset 2.
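    A short sketch of the fixed-size sliding window used to segment the sensor streams of data set 2; the paper states only that the window size is fixed, so the window length (taken from the sequence length in Table 1) and the overlap are assumptions.

```python
import numpy as np

def sliding_windows(channel, window=128, step=64):
    """Segment one EMG channel into fixed-size windows; the length of 128 matches the
    sequence length in Table 1, and the 50% overlap is an assumption."""
    starts = range(0, len(channel) - window + 1, step)
    return np.stack([channel[s:s + window] for s in starts])
```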

    A total of 5000 images were used in this experiment, including 3000 images from data set 1 and 2000 images from data set 2. Four thousand images were randomly selected for training, and the remaining 1000 images were used for testing.

    After the data sets are preprocessed, they are input into the multi-agent deep reinforcement learning detection model described above for multiple iterations of training, and the model parameters in the optimal state are recorded, as shown in Table 1.

    Table 1.  Model Parameters.
    Parameter name Parameter value
    Input layer node 20
    Output layer node 5
    Sequence length 128
    Number of hidden layers 5
    Hidden layer node 20
    Learning rate 0.001
    Batch size 1000
    Sub agent field of view 9 × 9
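    The optimal parameters of Table 1 can be collected into a single configuration object, as in the sketch below; the field names are illustrative and not taken from the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DetectorConfig:
    """Optimal model parameters reported in Table 1."""
    input_nodes: int = 20
    output_nodes: int = 5
    sequence_length: int = 128
    hidden_layers: int = 5
    hidden_nodes: int = 20
    learning_rate: float = 0.001
    batch_size: int = 1000
    agent_field_of_view: tuple = (9, 9)
```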


    In addition to the proposed method, this experiment also applies the methods of HPDM [5], HPDFLML [6], HPDMF [7], HPDDS [8] and HPDMK [9] to perform 3D human pose detection experiments.

    (1) 3D human pose detection results: The proposed method and the comparison detection methods are applied to data set 1 to obtain the 3D human pose detection results of each method. According to the detection results, the corresponding confusion matrices are generated to intuitively describe the recognition effect of the different detection methods on the human sliding poses.

    (2) Accuracy: In data set 2, the straight sliding motion is selected as the research object. Using the comparison methods and the proposed method, all straight sliding poses are extracted. The pose detection results of each method are compared with the actual poses, and the accuracy of the detection results of the different methods is calculated to describe the advantages of the proposed method:

    $A = \dfrac{\alpha + \beta}{\beta + \chi + \alpha + \delta}$ (15)

    where $A$ is the accuracy, $\beta$ is the number of true positives, $\alpha$ the number of true negatives, $\chi$ the number of false negatives and $\delta$ the number of false positives.

    (3) Precision: Fifty groups of human pose data are selected for each type of sliding pose in data set 2 to form a test set of 250 groups. The proposed method and the methods of HPDM [5], HPDFLML [6], HPDMF [7], HPDDS [8] and HPDMK [9] are used to detect the inclined sliding pose. According to the detection results of the different methods and the actual pose distribution, the precision of the detection results is calculated:

    $P = \dfrac{\beta}{\beta + \delta}$ (16)

    where $P$ is the precision.

    (4) Recall rate: Three hundred groups of sliding pose data are randomly selected from data set 2 for lateral sliding pose detection. The recognition results of the different detection methods for the lateral sliding pose are summarized, and the recall rate of each 3D human pose detection method is determined:

    $R = \dfrac{\beta}{\beta + \chi}$ (17)

    where $R$ is the recall rate.

    (5) Specificity: Using the proposed method and the comparison detection methods, all plough turning poses are detected from data set 1, and the specificity is calculated according to the detection results:

    $S = \dfrac{\alpha}{\alpha + \delta}$ (18)

    where $S$ is the specificity.
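    The four evaluation measures of Eqs (15)–(18) follow directly from the confusion-matrix counts, as in the short sketch below; treating each pose class against the rest as a binary problem is an assumed framing of the multi-class results.

```python
def evaluation_metrics(tp, tn, fp, fn):
    """Eqs (15)-(18): accuracy, precision, recall and specificity from the counts of
    true positives, true negatives, false positives and false negatives."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)      # Eq (15)
    precision = tp / (tp + fp)                      # Eq (16)
    recall = tp / (tp + fn)                         # Eq (17)
    specificity = tn / (tn + fp)                    # Eq (18)
    return accuracy, precision, recall, specificity
```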

    With the proposed method, feature extraction is carried out on the de-noised EMG signals, which are then input into the multi-agent deep reinforcement learning detection model; fusion processing and the pose solution are performed on the detection results. The 3D human pose detection results are shown in Figure 7.

    Figure 7.  3D human pose detection results of the proposed method.

    According to Figure 7, there is no overlap among the seven types of detection results obtained with the proposed method: straight sliding, inclined sliding, lateral sliding, plough sliding, plough turning, lying down and standing up. Each class is well clustered, which shows that the method can accurately identify each type of pose and that the proposed pose detection is feasible. According to the experimental design, the confusion matrices corresponding to the pose detection results of the different methods are obtained, as shown in Figure 8.

    Figure 8.  Comparison of confusion matrix corresponding to pose detection results of different methods.

    According to the data in Figure 8, the detection accuracy of the proposed method for inclined sliding, lateral sliding, plough sliding, plough turning, lying down and standing up stays above 90%, reaching up to 98%. For the other methods, some 3D human pose detection accuracies fall below 90%, especially for the methods in HPDDS [8] and HPDMK [9]. The maximum detection accuracy of the method in HPDMF [7] is 90%, and the maximum detection accuracy of the methods in HPDM [5] and HPDFLML [6] is 92%. It can be seen that the proposed method has a lower detection error for the 3D human pose than the other methods. This is because this paper constructs a multi-agent deep reinforcement learning detection model, and the combination of deep learning and reinforcement learning effectively reduces the detection error.

    At the same time, the accuracy comparison results of different detection methods are summarized in Table 2.

    Table 2.  Comparison results of accuracy.
    Number of experiments Proposed method HPDM [5] method HPDFLML [6] method HPDMF [7] method HPDDS [8] method HPDMK [9] method
    10 0.96 0.85 0.86 0.88 0.87 0.81
    20 0.97 0.86 0.87 0.81 0.84 0.82
    30 0.96 0.89 0.88 0.84 0.88 0.88
    40 0.96 0.90 0.81 0.90 0.86 0.84
    50 0.94 0.91 0.89 0.89 0.87 0.85
    60 0.96 0.90 0.90 0.87 0.86 0.84


    It can be seen from Table 2 that the maximum accuracy of the proposed method is 0.97, which is 0.06, 0.07, 0.07, 0.09 and 0.12 higher than that of the HPDM [5], HPDFLML [6], HPDMF [7], HPDDS [8] and HPDMK [9] methods, respectively. This indicates that the accuracy of the proposed method is higher and shows that the multi-agent deep reinforcement learning detection model constructed in this paper can achieve accurate 3D human pose detection.

    According to the data in Figure 9, the maximum precision of the proposed method is 0.98, which is 0.06 higher than the method in HPDM [5], 0.07 higher than the method in HPDFLML [6], 0.09 higher than the method in HPDMF [7], 0.12 higher than the method in HPDDS [8] and 0.14 higher than the method in HPDMK [9]. This shows that, compared with the other methods, the precision of the proposed method is higher and accurate 3D human pose detection can be achieved, fully verifying the efficiency of using multi-agent deep reinforcement learning for pose detection.

    Figure 9.  Comparison results of precision.

    The recall comparison results of different detection methods are shown in Table 3.

    Table 3.  Comparison results of recall rate.
    Number of experiments Proposed method HPDM [5] method HPDFLML [6] method HPDMF [7] method HPDDS [8] method HPDMK [9] method
    10 0.95 0.89 0.81 0.86 0.81 0.68
    20 0.94 0.90 0.82 0.85 0.79 0.78
    30 0.91 0.87 0.88 0.82 0.76 0.71
    40 0.94 0.85 0.85 0.88 0.78 0.75
    50 0.93 0.88 0.86 0.81 0.82 0.72
    60 0.93 0.81 0.81 0.86 0.79 0.79


    It can be seen from Table 3 that the maximum recall rate of the proposed method is 0.95, which is 0.05, 0.07, 0.07, 0.13 and 0.18 higher than that of the HPDM [5], HPDFLML [6], HPDMF [7], HPDDS [8] and HPDMK [9] methods, respectively. This indicates that the recall rate of the proposed method is higher and that it can ensure the completeness of the 3D human pose detection results. The reason is that this paper uses Nano sensors to acquire the EMG signals on the human body surface, and the guaranteed quality of the acquired data in turn improves the completeness of the detection results.

    The specificity comparison results of the different detection methods are shown in Figure 10. According to Figure 10, the maximum specificity of the proposed method is 0.98, which is 0.06 higher than the method in HPDM [5], 0.09 higher than the method in HPDFLML [6], 0.11 higher than the method in HPDMF [7], 0.16 higher than the method in HPDDS [8] and 0.18 higher than the method in HPDMK [9]. This shows that the specificity of the proposed method is higher than that of the comparison methods and that accurate 3D human pose detection can be achieved. In summary, the accuracy and precision of the proposed method reach 0.97 and 0.98, respectively, the recall rate is 0.95 and the specificity is 0.98. Compared with the other five 3D human pose detection methods, the pose detection error of the proposed method is significantly reduced.

    Figure 10.  Comparison results of specificity.

    As the application scenarios of 3D human pose detection become more and more complex, this paper combines Nano sensors and multi-agent deep reinforcement learning to establish a pose detection method with good performance that meets the requirements of pose detection. The results show that the detection accuracy of the proposed method exceeds 90% for straight sliding, inclined sliding, lateral sliding, plough sliding, plough turning, lying down and standing up. The accuracy and precision of the proposed method are 0.97 and 0.98, respectively, the recall rate is 0.95 and the specificity is 0.98; the pose detection accuracy is higher and the practical application effect is better. However, the data used in the experiments are not massive enough, and the data sets mostly contain body data of volunteers with relatively good kinematic function, which deviate somewhat from real-life human poses, especially the subtle characteristics of human pose transitions. Therefore, future work needs to conduct experiments with larger and more varied data to address the shortcomings of the proposed method and expand its application scope.

    The authors declare that they have no conflicts of interest.



    [1] L. Chen, S. Li, Human Motion target posture detection algorithm using semi-supervised learning in Internet of Things, IEEE Access, 9 (2021), 90529–90538. https://doi.org/10.1109/ACCESS.2021.3091430 doi: 10.1109/ACCESS.2021.3091430
    [2] M. Iwamoto, D. Kato, Efficient actor-critic reinforcement learning with embodiment of muscle tone for posture stabilization of the human arm, Neural Comput., 33 (2020), 1–28. https://doi.org/10.1162/neco_a_01333 doi: 10.1162/neco_a_01333
    [3] A. Guzman-Pando, M. I. Chacon-Murguia, L. B. Chacon-Diaz, Human-like evaluation method for object motion detection algorithms, IET Computer Vision, 14 (2020), 674–682. https://doi.org/10.1049/iet-cvi.2019.0997 doi: 10.1049/iet-cvi.2019.0997
    [4] M. Wu, D. Du, Y. Li, W. Bai, W. Liu, Multi-cascade perceptual human posture recognition enhancement network, IEEE Access, 9 (2021), 64256–64266. https://doi.org/10.1109/ACCESS.2021.3074541 doi: 10.1109/ACCESS.2021.3074541
    [5] X. Song, L. Fan, Human posture recognition and estimation method based on 3D Multiview basketball sports dataset, Complexity, 25 (2021), 1–10. https://doi.org/10.1155/2021/6697697 doi: 10.1155/2021/6697697
    [6] W. Ren, O. Ma, H. Ji, X. Liu, Human posture recognition using a hybrid of fuzzy logic and machine learning approaches, IEEE Access, 8 (2020), 135628–135639. https://doi.org/10.1109/ACCESS.2020.3011697 doi: 10.1109/ACCESS.2020.3011697
    [7] W. Ding, B. Hu, H. Liu, X. M. Wang, X. S. Huang, Human posture recognition based on multiple features and rule learning, Int. J. Mach. Learn. Cyber, 11 (2020), 2529–2540. https://doi.org/10.1007/s13042-020-01138-y doi: 10.1007/s13042-020-01138-y
    [8] J. Wang, X. H. Liu, Human posture recognition method based on skeleton vector with depth sensor, IOP Conf. Ser. Mater. Sci. Eng., 806 (2020), 012035. https://doi.org/10.1088/1757-899X/806/1/012035 doi: 10.1088/1757-899X/806/1/012035
    [9] D. He, L. Li, A new Kinect-based posture recognition method in physical sports training based on urban data, Wireless Commun. Mobile Comput., 20 (2020), 1–9. https://doi.org/10.1155/2020/8817419 doi: 10.1155/2020/8817419
    [10] S. Liaqat, K. Dashtipour, K. Arshad, K. Assaleh, N. Ramzan, A hybrid posture detection framework: Integrating machine learning and deep neural networks, IEEE Sensors J., 21 (2021), 9515–9522. https://doi.org/10.1109/JSEN.2021.3055898 doi: 10.1109/JSEN.2021.3055898
    [11] Z. Huang, J. Li, J. Huang, J. Ota, Y. Zhang, Motion planning for bandaging task with abnormal posture detection and avoidance, IEEE/ASME Transact. Mechatr., 25 (2020), 2364–2375. https://doi.org/10.1109/TMECH.2020.2973674 doi: 10.1109/TMECH.2020.2973674
    [12] H. Xia, X. Gao, Multi-scale mixed dense graph convolution network for skeleton-based action recognition, IEEE Access, 9 (2021), 36475–36484. https://doi.org/10.1109/ACCESS.2020.3049029 doi: 10.1109/ACCESS.2020.3049029
    [13] R. Xia, Y. Li, W. Luo, LAGA-Net: Local-and-global attention network for skeleton based action recognition, IEEE Transact. Multi., 24 (2022), 2648–2661. https://doi.org/10.1109/TMM.2021.3086758 doi: 10.1109/TMM.2021.3086758
    [14] Y. Kong, Y. Wang, A. Li, Spatiotemporal saliency representation learning for video action recognition, IEEE Transact. Multi., 24 (2022), 1515–1528. https://doi.org/10.1109/TMM.2021.3066775 doi: 10.1109/TMM.2021.3066775
    [15] M. Perez, J. Liu, A. C. Kot, Interaction relational network for mutual action recognition, IEEE Transact. Multi., 24 (2022), 366–376. https://doi.org/10.1109/TMM.2021.3050642 doi: 10.1109/TMM.2021.3050642
    [16] J. Xie, Q. G. Miao, R. Y. Liu, W. T. Xin, L. Tang, S. Zhong, et al., Attention adjacency matrix based graph convolutional networks for skeleton-based action recognition, Neurocomputing, 440 (2021), 230–239. https://doi.org/10.1016/j.neucom.2021.02.001 doi: 10.1016/j.neucom.2021.02.001
    [17] D. Ludl, T. Gulde, C. Curio, Enhancing data-driven algorithms for human pose estimation and action recognition through simulation, IEEE Transact. Intell. Transport. Syst., 21 (2020), 3990–3999. https://doi.org/10.1109/TITS.2020.2988504 doi: 10.1109/TITS.2020.2988504
    [18] X. Ma, X. Li, Dynamic gesture contour feature extraction method using residual network transfer learning, Wireless Commun. Mobile Comput, 2021 (2021). https://doi.org/10.1155/2021/1503325 doi: 10.1155/2021/1503325
    [19] T. Ahmad, L. Jin, L. Lin, G. Z. Tang, Skeleton-based action recognition using sparse spatio-temporal GCN with edge effective resistance, Neurocomputing, 423 (2021), 389–398. https://doi.org/10.1016/j.neucom.2020.10.096 doi: 10.1016/j.neucom.2020.10.096
    [20] D. K. Vishwakarma, A two-fold transformation model for human action recognition using decisive pose, Cognit. Syst. Res., 61 (2020), 1–13. https://doi.org/10.1016/j.cogsys.2019.12.004 doi: 10.1016/j.cogsys.2019.12.004
    [21] Y. Lin, W. Chi, W. Sun, S. Liu, D. Fan, Human action recognition algorithm based on improved resnet and skeletal keypoints in single image, Math. Problems Eng., 12 (2020), 1–12. https://doi.org/10.1155/2020/6954174 doi: 10.1155/2020/6954174
    © 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0).