Reinforcement learning-based optimization of locomotion controller using multiple coupled CPG oscillators for elongated undulating fin propulsion

Van Dong Nguyen; Dinh Quoc Vo; Van Tu Duong; Huy Hung Nguyen; Tan Tien Nguyen

doi:10.3934/mbe.2022033

Mathematical Biosciences and Engineering

2022, Volume 19, Issue 1: 738-758. doi: 10.3934/mbe.2022033




1. Faculty of Mechanical Engineering, Ho Chi Minh City University of Technology (HCMUT), 268 Ly Thuong Kiet, District 10, Ho Chi Minh City, Vietnam
2. Vietnam National University Ho Chi Minh City, Linh Trung Ward, Thu Duc District, Ho Chi Minh City, Vietnam
3. National Key Laboratory of Digital Control and System Engineering (DCSELab), HCMUT, 268 Ly Thuong Kiet, District 10, Ho Chi Minh City, Vietnam
4. Faculty of Electronics and Telecommunication, Saigon University, Vietnam

Received: 10 August 2021 Accepted: 08 November 2021 Published: 19 November 2021

This article proposes a locomotion controller, inspired by the black knifefish, for an undulating elongated fin robot. The proposed controller is built on a modified CPG network of sixteen coupled Hopf oscillators with feedback of the angle of each fin-ray. The convergence rate of the modified CPG network is optimized by a reinforcement learning algorithm. With the proposed controller, the undulating elongated fin robot can transition between swimming patterns naturally. Additionally, the controller allows the swimming pattern parameters, namely the amplitude envelope and the oscillatory frequency, to be configured to perform various swimming patterns. The implementation of the reinforcement learning-based optimization is discussed. The simulation and experimental results show the capability and effectiveness of the proposed controller through several swimming patterns with varying oscillatory frequency and amplitude envelope of each fin-ray.


Citation: Van Dong Nguyen, Dinh Quoc Vo, Van Tu Duong, Huy Hung Nguyen, Tan Tien Nguyen. Reinforcement learning-based optimization of locomotion controller using multiple coupled CPG oscillators for elongated undulating fin propulsion[J]. Mathematical Biosciences and Engineering, 2022, 19(1): 738-758. doi: 10.3934/mbe.2022033



1. Introduction

With the progression of technology, minimally invasive surgical (MIS) robots have incrementally supplanted traditional surgical approaches, demonstrating superior performance in terms of operative clarity, user-friendliness, and operational adaptability ^[1]. Moreover, the use of MIS robots is commonly associated with better patient outcomes, reduced pain, less blood loss and scarring, and shorter recovery times ^[2]. However, because surgeons lack direct access to the surgical site, they tend to exert excessive force, resulting in tissue damage ^[3]. The absence of haptic feedback is widely recognized as a significant challenge in current robot-assisted laparoscopic surgery ^[4,5]. Accurate force feedback is crucial for identifying tissue properties and executing precise surgical procedures.

Force feedback plays a pivotal role in surgical procedures, enabling surgeons to perceive the mechanical characteristics of tissues, assess their anatomical structure, and apply precisely controlled forces for safe tissue manipulation ^[6,7]. To demonstrate the feasibility and efficacy of force feedback in laparoscopic surgery, Wagner and Semere mounted a commercially available force/torque sensor at the distal end of the surgical instrument to measure tool-tissue forces. The measured forces were then fed back to the master robot through a teleoperation system, enabling the effective execution of blunt dissection tasks. This implementation of force feedback resulted in a 50% reduction in applied force and a 66% decrease in tissue damage error ^[8,9].

To date, extensive research has been conducted on the force feedback of minimally invasive surgical robots. The existing methods for estimating forces in surgical robots can be broadly categorized into direct force sensing and sensorless force estimation ^[10,11]. In the direct force sensing method, the force exerted by the surgical tool on the tissue is quantified through an integrated sensor positioned either on or in close proximity to the tool ^[12,13]. Using piezoresistive sensors, Abiri et al. ^[14] developed a device that can be affixed to the tip of a surgical instrument for measuring interaction forces. Furthermore, Reiley et al. ^[15] incorporated strain gauges onto the shafts of surgical tools for measuring interaction forces. However, challenges such as biocompatibility, sterilization, and miniaturization impose limitations on the feasibility of this approach ^[16,17]. The exposure to high temperatures, pressures, and humidity during sterilization procedures may potentially inflict damage upon electronic components ^[18]. Zarrin et al. ^[19] and Carotenuto et al. ^[20] proposed a laparoscopic sensing instrument based on Fiber Bragg grating (FBG) that demonstrates resistance to high-temperature sterilization. However, the durability of the fiber is inadequate and it exhibits mechanical limitations during complex movements.

In contrast, sensorless force estimation methods predict forces using existing resources, obviating the need for additional electronics. Haghighipanah et al. ^[21] and Wang et al. ^[22] employed a cable tension disturbance observer on the surgical robot's end effector to quantify external forces. Tholey et al. ^[23] and Zhao et al. ^[24] used the driving motor's current to estimate the force exerted between the gripper and tissue at the distal end of the surgical robot. However, these motion-based methods require precise models of the surgical tools or robotic manipulators, which are difficult to obtain, and this hinders accurate prediction of interaction forces. In addition, there are vision-based force estimation methods. When force is applied, the deformation of the soft tissue is captured by the camera, and this information is then used to predict the actual force exerted on the object ^[25,26,27]. Noohi et al. ^[28] proposed a virtual template-based approach for estimating tool-soft tissue interactions using monocular camera images. In ^[25], relying exclusively on endoscopic images acquired by a stereo camera, Haouchine employed a dynamic biomechanical model based on organ shape to estimate the contact force at the tool tip. To enhance force estimation accuracy, Mozaffari et al. ^[29] introduced a neural fuzzy inference system for identifying tool-tissue forces and maximum local stress in laparoscopic surgery. Despite extensive research on replacing force sensors with image sensors, occlusion, optical noise, and camera motion can still introduce interference in the image, thereby degrading the accuracy of force estimation. These studies demonstrate that both approaches relying on dynamic motion information and strategies based on advanced visual analysis exhibit a certain level of stability, robustness, and efficacy in estimating applied forces. Although these methods have inherent limitations in predictive accuracy, they have significant potential for wide-ranging applications in force estimation.

Building upon the aforementioned research, this study aims to further advance force estimation technology. We propose a novel approach for clamping force prediction based on the motion parameters of the mechanical clamp blade. In this methodology, a backpropagation (BP) neural network is employed to construct a soft tissue force prediction model by considering the interplay between soft tissue pressure, instrument displacement, motion speed, and indenter contact area. The BP neural network is well suited to nonlinear analysis, generalizes well, and is fault tolerant, which makes it suitable for capturing the intricate mapping between the clamping force on soft tissue and the motion parameters of the clamp blade. However, BP neural networks often converge slowly and are prone to getting trapped in local optima. Therefore, this study integrates the sparrow search algorithm (SSA) to optimize the weights and thresholds of the BP neural network and thereby enhance the model's predictive performance for clamping force in dynamic and nonlinear surgical environments. SSA has been reported to surpass existing algorithms, such as the genetic algorithm (GA), in terms of search accuracy, convergence speed, and stability ^[30,31]. This is crucial for facilitating rapid and precise decision-making in surgical applications. The model's prediction performance was validated through an in vitro compression experiment on pig kidney tissue. This model not only enables real-time clamping force prediction, but also facilitates more accurate estimation of the force between the instrument and tissue during future surgical procedures, thereby enhancing surgical safety and efficacy.

The remaining sections of the paper are organized as follows: Section 2 elaborates on the algorithms employed for predicting soft tissue gripping force, along with their theoretical foundations. Section 3 provides a detailed description of the design and implementation of the soft tissue compression experiments. In Section 4, we present experimental results and conduct an in-depth analysis of the performance of our proposed prediction model. Finally, Section 5 summarizes the research findings and discusses potential avenues for future investigation.

2. Grip force prediction algorithm

Soft tissues exhibit intricate nonlinear and stage-dependent characteristics, posing challenges for traditional modeling approaches to accurately capture their behavior. Therefore, in this study, we propose a mechanical model based on the SSA-optimized BP neural network to precisely simulate the dynamic mechanical response of soft tissues under continuous compression. The proposed approach integrates the SSA's capability in global optimization with the efficiency of the BP neural network for function approximation and simulation of complex system behavior, enabling a comprehensive understanding of soft tissue's mechanical response under compression.

2.1. BP neural network

In our study, we employ BP neural networks to forecast the mechanical properties of soft tissues. BP networks are renowned for their inherent capability to automatically discern intricate relationships between input and output data, a crucial aspect in comprehending the nonlinear mechanical response exhibited by soft tissues. Through the network's self-learning and adaptive capabilities, we can encode the mechanical behavior of soft tissue into its weights, thereby achieving more accurate simulation and prediction of its response.

The BP neural network is comprised of multiple layers, typically including an input layer, one or more hidden layers, and an output layer. In this study, the structure of the soft tissue force prediction model based on the BP neural network is illustrated in Figure 1, enabling the network to effectively capture data patterns ranging from simple to intricate.

Figure 1. The basic structure of the neural network.


The operation of the BP neural network is divided into two stages: forward propagation and backpropagation. During forward propagation, the input signal passes sequentially through each layer of the network. At each node, the incoming signals are multiplied by their corresponding weights, summed with the bias term, and then transformed into an output signal by an activation function. The input $s_{j}^{(l)}$ of neuron j in layer l is a linear combination of the activation values $o_{i}^{(l-1)}$ of the previous layer, the weights $w_{i j}^{(l)}$, and the bias $b_{j}^{(l)}$, and is calculated as follows:

$s_{j}^{(l)} = \sum_{i} w_{i j}^{(l)} o_{i}^{(l-1)}+b_{j}^{(l)}$

(1)

The activation value $o_{j}^{(l)}$ of neuron j corresponds to the output obtained by applying the input signal through the activation function:

$o_{j}^{(l)} = f\left(s_{j}^{(l)}\right)$

(2)

where $f(\cdot)$ is the activation function, and the sigmoid function $f(x) = \frac{1}{1+e^{-x}}$ is usually chosen.
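As a minimal sketch of Eqs. (1) and (2), the forward pass through one layer can be written in plain Python. The function name `forward_layer`, the weight layout `W[i][j]` (from neuron i in the previous layer to neuron j), and the toy numbers are illustrative assumptions, not values from the study.

```python
import math

def sigmoid(x):
    # Sigmoid activation f(x) = 1 / (1 + exp(-x))
    return 1.0 / (1.0 + math.exp(-x))

def forward_layer(o_prev, W, b):
    """Eq. (1): s_j = sum_i W[i][j] * o_prev[i] + b[j]; Eq. (2): o_j = f(s_j)."""
    s = [sum(W[i][j] * o_prev[i] for i in range(len(o_prev))) + b[j]
         for j in range(len(b))]
    o = [sigmoid(v) for v in s]
    return s, o

# Hypothetical toy layer: two inputs, two neurons
o_prev = [1.0, 0.5]
W = [[0.2, -0.4],
     [0.6, 0.1]]
b = [0.0, 0.3]
s, o = forward_layer(o_prev, W, b)
print(s, o)
```

Stacking calls to `forward_layer` from the input layer to the output layer gives the full forward propagation.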

In the backpropagation phase, the network's weight and threshold are updated through error minimization between the network output and target output. Specifically, the network error is defined as the sum of squared differences between the desired output and actual output:

$E = \frac{1}{2} \sum_{k = 1}^{N}\left(y_{k}-o_{k}^{(L)}\right)^{2}$

(3)

where $y_{k}$ is the target output value, $o_{k}^{(L)}$ is the actual output value of the network, L represents the output layer, and k traverses all N output neurons.

The error gradient for neuron k in the output layer L is as follows:

$\delta_{k}^{(L)} = \left(o_{k}^{(L)}-y_{k}\right) \cdot f^{\prime}\left(s_{k}^{(L)}\right)$

(4)

The error gradient for neuron j in hidden layer l (l < L) is as follows:

$\delta_{j}^{(l)} = \left(\sum_{k} w_{j k}^{(l+1)} \delta_{k}^{(l+1)}\right) \cdot f^{\prime}\left(s_{j}^{(l)}\right)$

(5)

The weight and threshold adjustments of the neural network are updated using the gradient descent method:

$w_{i j}^{(l)} = w_{i j}^{(l)}-\eta \cdot o_{i}^{(l-1)} \delta_{j}^{(l)}$

(6)

$b_{j}^{(l)} = b_{j}^{(l)}-\eta \cdot \delta_{j}^{(l)}$

(7)

where $\eta$ is the learning rate, a positive number less than 1.
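The backpropagation step of Eqs. (3) to (7) can be sketched for a network with a single hidden layer. This is an illustrative sketch, not the authors' implementation: the 2-2-1 layout, the initial weights, the single training sample, and the learning rate are all assumed values. It uses the fact that for the sigmoid, $f^{\prime}(s) = o(1-o)$, and it takes the weight gradient as the product of the error term of a neuron and the activation feeding into it from the previous layer.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, W1, b1, W2, b2):
    # Hidden and output activations, Eqs. (1)-(2)
    h = [sigmoid(sum(W1[i][j] * x[i] for i in range(len(x))) + b1[j])
         for j in range(len(b1))]
    out = [sigmoid(sum(W2[j][k] * h[j] for j in range(len(h))) + b2[k])
           for k in range(len(b2))]
    return h, out

def error(y, out):
    # Eq. (3): half the sum of squared output errors
    return 0.5 * sum((yk - ok) ** 2 for yk, ok in zip(y, out))

def train_step(x, y, W1, b1, W2, b2, eta):
    """One gradient-descent update; modifies the weights and biases in place."""
    h, out = forward(x, W1, b1, W2, b2)
    # Eq. (4): output-layer error terms, with f'(s) = o * (1 - o)
    d_out = [(out[k] - y[k]) * out[k] * (1.0 - out[k]) for k in range(len(out))]
    # Eq. (5): hidden-layer error terms, computed before W2 is changed
    d_hid = [sum(W2[j][k] * d_out[k] for k in range(len(out))) * h[j] * (1.0 - h[j])
             for j in range(len(h))]
    # Eqs. (6)-(7): weight and bias updates
    for j in range(len(h)):
        for k in range(len(out)):
            W2[j][k] -= eta * h[j] * d_out[k]
    for k in range(len(out)):
        b2[k] -= eta * d_out[k]
    for i in range(len(x)):
        for j in range(len(h)):
            W1[i][j] -= eta * x[i] * d_hid[j]
    for j in range(len(h)):
        b1[j] -= eta * d_hid[j]

# Hypothetical sample and initial parameters
x, y = [1.0, 0.5], [1.0]
W1 = [[0.2, -0.4], [0.6, 0.1]]
b1 = [0.0, 0.3]
W2 = [[0.3], [-0.2]]
b2 = [0.1]

e_before = error(y, forward(x, W1, b1, W2, b2)[1])
train_step(x, y, W1, b1, W2, b2, eta=0.5)
e_after = error(y, forward(x, W1, b1, W2, b2)[1])
print(e_before, e_after)
```

For this sample, a single update moves the output toward the target, so the error after the step is smaller than before it.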

The formula for determining the number of neurons in the hidden layer is:

$m = \sqrt{n+o}+a$

(8)

where m is the number of nodes in the hidden layer, n is the number of nodes in the input layer, o is the number of nodes in the output layer, and a is a constant between 1 and 10.
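Eq. (8) yields a range of candidate hidden-layer sizes rather than a single value, because a is free between 1 and 10. The short sketch below enumerates that range. The choice of three inputs (displacement, speed, contact area) and one output (force) is an assumption made for illustration; the text does not state the final network dimensions at this point.

```python
import math

def hidden_size_candidates(n_in, n_out, a_min=1, a_max=10):
    """Candidate hidden-layer sizes m = sqrt(n + o) + a for integer a in [a_min, a_max]."""
    base = math.sqrt(n_in + n_out)
    return [round(base + a) for a in range(a_min, a_max + 1)]

# Assumed dimensions: 3 inputs, 1 output
candidates = hidden_size_candidates(3, 1)
print(candidates)
```

In practice one would train a network for each candidate and keep the size with the lowest validation error.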

The BP neural network demonstrates exceptional performance in automatically extracting association rules between input and output data, while effectively retaining the acquired knowledge within its network weights, thereby exhibiting remarkable self-learning and adaptive characteristics. However, when confronted with complex nonlinear data such as soft tissue mechanics, BP networks encounter specific challenges. These include relatively slow convergence rates and high sensitivity to initial weight settings, which may result in the network getting trapped in local minima during the learning process and failing to accurately approximate intricate real-world models. This limitation becomes particularly evident when processing soft tissue data characterized by complex mechanical behavior, necessitating optimization of relevant parameters to overcome these challenges.

2.2. Sparrow search algorithm

The SSA, proposed by Xue and Shen in 2020 ^[32], is based on the behavioral patterns observed in sparrow populations, encompassing the distinct roles of producers, followers, and scouts. By simulating the foraging and predator avoidance strategies of these roles, SSA achieves strong global search capability. We integrate this algorithm with a BP neural network so that SSA's global search optimizes the initialization and adjustment of the BP network parameters. This integration not only enhances the convergence speed of the model, but also improves its adaptability in tackling complex and nonlinear problems, particularly in simulating the mechanical behavior of soft tissues.

A population of n sparrows can be represented as:

$X = \left[\begin{array}{cccc}X_{1, 1} & X_{1, 2} & \cdots & X_{1, d} \\ X_{2, 1} & X_{2, 2} & \cdots & X_{2, d} \\ \vdots & \vdots & \ddots & \vdots \\ X_{n, 1} & X_{n, 2} & \cdots & X_{n, d}\end{array}\right]$

(9)

where d represents the dimensionality of the problem variable to be optimized and n denotes the number of sparrows. In the subsequent optimization problem, d signifies the number of BP parameters that need to be optimized, that is, the total number of weights and biases in the BP neural network. Consequently, all sparrows' fitness values can be expressed as:

$F_{x} = \left[\begin{array}{c} f\left(\left[\begin{array}{llll} X_{1, 1} & X_{1, 2} & \cdots & X_{1, d} \end{array}\right]\right) \\ f\left(\left[\begin{array}{llll} X_{2, 1} & X_{2, 2} & \cdots & X_{2, d} \end{array}\right]\right) \\ \vdots \\ f\left(\left[\begin{array}{llll} X_{n, 1} & X_{n, 2} & \cdots & X_{n, d} \end{array}\right]\right) \end{array}\right]$

(10)

where f is the fitness value. According to the above SSA principle, the optimization objective function can be established as:

$f = \frac{1}{N} \sum\limits_{k = 1}^{N}\left(y_{k}-o_{k}^{(L)}\right)^{2}$

(11)

where N is the total number of training samples, and $y_{k}$ and $o_{k}^{(L)}$ are the true and predicted values of the kth sample, respectively. The fitness function expresses that we ultimately want a network with a small training set error. The producer position is updated during the process of population iteration as follows:

$X_{i, j}^{t+1} = \begin{cases}X_{i, j}^t \cdot \exp \left(\frac{-i}{\alpha \cdot {iter}_{\max }}\right) & R_2 < S T \\ X_{i, j}^t+Q \cdot L & R_2 \geq S T\end{cases}$

(12)

where t is the current iteration, j = 1, 2, ..., d indexes the dimension, $X_{i, j}^{t}$ is the value of the jth dimension of the ith sparrow at iteration t, ${iter}_{\max }$ is the maximum number of iterations, and $\alpha \in(0, 1]$ is a random number. $R_{2} \in[0, 1]$ and $S T \in[0.5, 1.0]$ represent the alarm value and safety threshold, respectively. The variable Q is a random number drawn from a normal distribution, while L is a 1×d matrix in which every element equals 1. If $R_{2} < S T$ , then the sparrow population is safe. Otherwise, some sparrows perceive danger, and all sparrows must quickly move to a safe area.
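The producer update of Eq. (12) can be sketched as a small function. The assumptions here are: the decay factor is read as $\exp(-i / (\alpha \cdot iter_{\max}))$ with a random $\alpha \in (0, 1]$, as in the original SSA formulation; $\alpha$ and Q are each drawn once per sparrow; and the index i is 1-based. The name `update_producer` and the toy position are hypothetical.

```python
import math
import random

def update_producer(x, i, iter_max, ST, rng):
    """Return the new position of producer i (1-based) following Eq. (12)."""
    R2 = rng.random()                      # alarm value in [0, 1)
    if R2 < ST:
        alpha = 1.0 - rng.random()         # random number in (0, 1]
        scale = math.exp(-i / (alpha * iter_max))
        return [v * scale for v in x]      # safe: shrink the position
    Q = rng.gauss(0.0, 1.0)                # normally distributed step
    return [v + Q for v in x]              # alarm: Q * L adds Q to every dimension

rng = random.Random(0)
x = [0.4, -0.2, 0.9]
# ST = 1.0 and ST = 0.0 are used here only to force each branch in turn
safe = update_producer(x, i=1, iter_max=100, ST=1.0, rng=rng)
alarm = update_producer(x, i=1, iter_max=100, ST=0.0, rng=rng)
print(safe, alarm)
```

In the safe branch every coordinate is scaled by the same factor in (0, 1), and in the alarm branch every coordinate is shifted by the same random amount.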

The follower's position is continuously updated, ensuring constant monitoring of the producer. In the event that the follower detects a superior food source, it engages in competitive interactions to acquire it. If successful, the follower obtains access to the food resource; otherwise, it continues its vigilant surveillance.

$X_{i, j}^{t+1} = \left\{\begin{array}{cc} Q \cdot \exp \left(\frac{X_{ {worst }}^t-X_{i, j}^t}{i^2}\right) & i > n / 2 \\ X_p^{t+1}+\left|X_{i, j}^t-X_p^{t+1}\right| \cdot A^{+} \cdot L & { otherwise } \end{array}\right.$

(13)

where $X_{p}^{t+1}$ represents the best position occupied by the producer, and $X_{ {worst }}^{t}$ indicates the current worst position. A represents a 1×d matrix whose elements are randomly assigned values of either 1 or -1, and $A^{+} = A^{T}\left(A A^{T}\right)^{-1}$ . When $i>n / 2$ , the ith follower has a poor fitness value and is most likely starving.
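A minimal sketch of the follower update in Eq. (13) follows. It rests on two labeled assumptions. First, the exponent is divided by $i^2$ (the rank of the sparrow), as in the original SSA formulation. Second, the product $|X - X_p| \cdot A^{+} \cdot L$ is evaluated literally as a matrix product: since A is a 1×d row of ±1 entries, $A A^{T} = d$ and $A^{+} = A^{T} / d$, so the product reduces to one scalar step added to every dimension. Some implementations apply the step per dimension instead.

```python
import math
import random

def update_follower(x, i, n, x_p, x_worst, rng):
    """Return the new position of follower i (1-based rank among n sparrows)."""
    d = len(x)
    if i > n / 2:
        # Poorly ranked follower: move elsewhere to forage
        Q = rng.gauss(0.0, 1.0)
        return [Q * math.exp((x_worst[j] - x[j]) / i ** 2) for j in range(d)]
    # Otherwise move around the best producer position x_p
    A = [rng.choice((-1, 1)) for _ in range(d)]
    step = sum(abs(x[j] - x_p[j]) * A[j] for j in range(d)) / d   # |X - X_p| . A+
    return [x_p[j] + step for j in range(d)]                      # scalar times L

rng = random.Random(1)
x_p = [0.5, 0.5, 0.5]
x_worst = [-1.0, 1.0, 0.0]
x = [0.2, -0.3, 0.8]
near = update_follower(x, i=3, n=10, x_p=x_p, x_worst=x_worst, rng=rng)
hungry = update_follower(x, i=8, n=10, x_p=x_p, x_worst=x_worst, rng=rng)
print(near, hungry)
```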

The sparrow position is updated in the event of danger as follows:

$X_{i, j}^{t+1} = \begin{cases}X_{{best }}^t+\beta \cdot\left|X_{i, j}^t-X_{ {best }}^t\right| & f_i > f_g \\ X_{i, j}^t+K \cdot\left(\frac{X_{i, j}^t-X_{ {worst }}^t}{\left(f_i-f_w\right)+\varepsilon}\right) & f_i = f_g\end{cases}$

(14)

Scouts constitute 10–20 percent of the population, and $X_{{best }}^{t}$ represents the current global optimal position. The variable $\beta$ is a step-size control parameter that follows a standard normal distribution with a mean of 0 and a variance of 1. $K \in[-1, 1]$ is a random number. $f_{i}$ represents the current fitness value of the sparrow, and $f_{g}$ and $f_{w}$ are the current best and worst fitness values. $\varepsilon$ is a small constant that prevents division by zero. The condition $f_{i}>f_{g}$ indicates that the sparrow is positioned at the periphery of the population and is therefore vulnerable to predators. The condition $f_{{i}} = f_{g}$ indicates that a sparrow in the center of the population is aware of the danger and needs to move closer to other sparrows. K represents the direction of movement of the sparrow and also serves as a step-size control coefficient ^[33].
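The danger-awareness update of Eq. (14) can be sketched as below, following the equation as written in this section. The function name `update_scout` and the value `eps=1e-8` are assumptions; the text only says that $\varepsilon$ is small.

```python
import random

def update_scout(x, f_i, f_g, f_w, x_best, x_worst, rng, eps=1e-8):
    """Return the new position of a sparrow that perceives danger, Eq. (14)."""
    d = len(x)
    if f_i > f_g:
        # Sparrow at the edge of the group: jump toward the best position
        beta = rng.gauss(0.0, 1.0)
        return [x_best[j] + beta * abs(x[j] - x_best[j]) for j in range(d)]
    # f_i == f_g: sparrow in the middle of the group moves relative to the worst position
    K = rng.uniform(-1.0, 1.0)
    denom = ((f_i - f_w) + eps) or eps     # guard against an exact zero
    return [x[j] + K * (x[j] - x_worst[j]) / denom for j in range(d)]

rng = random.Random(2)
x_best = [0.1, 0.2]
x_worst = [0.9, -0.7]
# A sparrow already at the best position is not moved by the first branch
at_best = update_scout(list(x_best), f_i=0.5, f_g=0.1, f_w=0.9,
                       x_best=x_best, x_worst=x_worst, rng=rng)
# A sparrow at the worst position is not moved by the second branch
at_worst = update_scout(list(x_worst), f_i=0.1, f_g=0.1, f_w=0.9,
                        x_best=x_best, x_worst=x_worst, rng=rng)
print(at_best, at_worst)
```

The two calls exercise the fixed points of each branch: the step is proportional to the distance from the reference position, so it vanishes when that distance is zero.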

2.3. Establishing the SSA-BP model

The present study introduces a prediction model that combines the SSA algorithm with the BP neural network, aiming to enhance the precision of soft tissue gripping force prediction in minimally invasive surgery. During the training process, we first determine the structural parameters of the input layer, hidden layer, and output layer. SSA is then employed to search for the optimal weights and biases. Initializing the sparrow population includes specifying the number of sparrows, the proportion of producers, the safety threshold, and the maximum iteration count. To guide the optimization process, the fitness function of SSA is defined as the error function of the neural network. Finally, the clamping force on soft tissue is predicted using the optimal weights and biases. The flow chart illustrating the entire optimization process is presented in Figure 2. The key steps of the SSA-BP prediction model are outlined as follows:

1) Construct the initial BP neural network and initialize the weights and biases of the network.

2) Configure the parameters of the SSA algorithm, including sparrow population size, alarm value, safety threshold, and iteration number. Evaluate and rank the fitness of each individual sparrow, identify the locations of the current optimal and worst solutions, and update the sparrows' positions and fitness in subsequent iterations.

3) Employ SSA to optimize the BP prediction model and iterate through the optimization process until reaching either a predetermined error range or maximum number of iterations.

4) Upon completion of iteration, apply the optimized weights and biases obtained by SSA to the BP network.

5) Utilize the optimized BP network model for predicting and analyzing collected soft tissue pressure data.
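The steps above can be sketched end to end on toy data. This is a simplified illustration under stated assumptions, not the authors' implementation: the network has one hidden layer and one output, each sparrow is a flat list holding all weights and biases, the fitness is the mean squared error of Eq. (11), positions are clipped to `[-bound, bound]`, the follower step uses the literal matrix product, and the population settings and the four training samples are invented for the example.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bp_predict(p, x, n_hid):
    """Forward pass; p holds W1 (n_in*n_hid), b1 (n_hid), W2 (n_hid), b2 (1)."""
    n_in = len(x)
    out = p[-1]
    for j in range(n_hid):
        s = p[n_in * n_hid + j] + sum(p[i * n_hid + j] * x[i] for i in range(n_in))
        out += p[n_in * n_hid + n_hid + j] * sigmoid(s)
    return sigmoid(out)

def mse(p, samples, n_hid):
    # Eq. (11): mean squared error over the training samples
    return sum((y - bp_predict(p, x, n_hid)) ** 2 for x, y in samples) / len(samples)

def ssa_minimize(fitness, dim, n=20, iter_max=50, producer_frac=0.2,
                 scout_frac=0.1, ST=0.8, bound=5.0, eps=1e-8, seed=0):
    rng = random.Random(seed)
    clip = lambda v: max(-bound, min(bound, v))
    X = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(n)]
    F = [fitness(x) for x in X]
    best_f = min(F)
    best_x = list(X[F.index(best_f)])
    history = [best_f]
    n_prod = max(1, int(producer_frac * n))
    n_scout = max(1, int(scout_frac * n))
    for _ in range(iter_max):
        order = sorted(range(n), key=lambda k: F[k])      # rank by fitness
        X = [X[k] for k in order]
        F = [F[k] for k in order]
        x_worst, f_g, f_w = list(X[-1]), F[0], F[-1]
        R2 = rng.random()
        for i in range(n_prod):                           # producers, Eq. (12)
            if R2 < ST:
                alpha = 1.0 - rng.random()
                scale = math.exp(-(i + 1) / (alpha * iter_max))
                X[i] = [clip(v * scale) for v in X[i]]
            else:
                Q = rng.gauss(0.0, 1.0)
                X[i] = [clip(v + Q) for v in X[i]]
        x_p = min(X[:n_prod], key=fitness)                # best producer position
        for i in range(n_prod, n):                        # followers, Eq. (13)
            rank = i + 1
            if rank > n / 2:
                Q = rng.gauss(0.0, 1.0)
                X[i] = [clip(Q * math.exp((x_worst[j] - X[i][j]) / rank ** 2))
                        for j in range(dim)]
            else:
                step = sum(abs(X[i][j] - x_p[j]) * rng.choice((-1, 1))
                           for j in range(dim)) / dim
                X[i] = [clip(x_p[j] + step) for j in range(dim)]
        for i in rng.sample(range(n), n_scout):           # scouts, Eq. (14)
            if F[i] > f_g:
                beta = rng.gauss(0.0, 1.0)
                X[i] = [clip(best_x[j] + beta * abs(X[i][j] - best_x[j]))
                        for j in range(dim)]
            else:
                K = rng.uniform(-1.0, 1.0)
                denom = ((F[i] - f_w) + eps) or eps
                X[i] = [clip(X[i][j] + K * (X[i][j] - x_worst[j]) / denom)
                        for j in range(dim)]
        F = [fitness(x) for x in X]
        if min(F) < best_f:                               # keep the best solution found
            best_f = min(F)
            best_x = list(X[F.index(best_f)])
        history.append(best_f)
    return best_x, best_f, history

# Hypothetical toy data: two inputs, one target in (0, 1)
samples = [([0.0, 0.0], 0.1), ([0.0, 1.0], 0.8), ([1.0, 0.0], 0.7), ([1.0, 1.0], 0.3)]
n_hid = 3
dim = 2 * n_hid + n_hid + n_hid + 1       # d = total number of weights and biases
best_x, best_f, history = ssa_minimize(lambda p: mse(p, samples, n_hid), dim)
print(best_f)
```

The returned `best_x` would then be loaded into the BP network as its weights and biases (step 4), optionally followed by ordinary gradient-based training.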

Figure 2. Flow chart of the BP network optimized by SSA.


2.4. Evaluation index

The evaluation indices employed in this study are the root mean square error (RMSE), mean square error (MSE), mean absolute error (MAE), sum of squared errors (SSE), and coefficient of determination (R²). Smaller values of RMSE, MSE, MAE, and SSE indicate a better prediction, while an R² closer to 1 indicates a better fit. The following equations define these indices. In these equations, $o_{k}^{(L)}$ denotes the predicted output value, $y_{k}$ represents the actual value, $\overline{{y}}$ signifies the mean value of $y_{k}$ , and N corresponds to the number of samples.

$R M S E = \sqrt{\frac{1}{N} \sum\limits_{k = 1}^{N}\left|y_{k}-o_{k}^{(L)}\right|^{2}}$

(15)

$M S E = \frac{1}{N} \sum\limits_{k = 1}^{N}\left|y_{k}-o_{k}^{(L)}\right|^{2}$

(16)

$M A E = \frac{1}{N} \sum\limits_{k = 1}^{N}\left|y_{k}-o_{k}^{(L)}\right|$

(17)

$S S E = \sum\limits_{k = 1}^{N}\left(y_{k}-o_{k}^{(L)}\right)^{2}$

(18)

$R^{2} = 1-\frac{\sum\limits_{k = 1}^{N}\left(o_{k}^{(L)}-y_{k}\right)^{2}}{\sum\limits_{k = 1}^{N}\left(y_{k}-\bar{y}\right)^{2}}$

(19)
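Eqs. (15) to (19) can be computed together in a few lines. The function name `regression_metrics` and the three-point example are illustrative only and are not data from the experiments.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return RMSE, MSE, MAE, SSE and R^2 for paired true and predicted values."""
    n = len(y_true)
    residuals = [yt - yp for yt, yp in zip(y_true, y_pred)]
    sse = sum(r * r for r in residuals)                  # Eq. (18)
    mse = sse / n                                        # Eq. (16)
    rmse = math.sqrt(mse)                                # Eq. (15)
    mae = sum(abs(r) for r in residuals) / n             # Eq. (17)
    y_bar = sum(y_true) / n
    sst = sum((yt - y_bar) ** 2 for yt in y_true)
    r2 = 1.0 - sse / sst                                 # Eq. (19)
    return {"RMSE": rmse, "MSE": mse, "MAE": mae, "SSE": sse, "R2": r2}

# Made-up values: one prediction is off by 1
metrics = regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
print(metrics)
```

Note that R² is undefined when all true values are equal, since the denominator in Eq. (19) is then zero.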

3. Experimental design

In this study, a series of in vitro experiments were conducted to simulate surgical procedures and construct a model that accurately reflects the mechanical response of biological soft tissues. The experimental design was rigorous and the data obtained were highly accurate, ensuring the reliability of the developed model.

3.1. Experimental equipment

The primary equipment utilized in the experiment is the SHIMADZU (EZ-LX) universal mechanical testing machine, manufactured by Shimadzu Corporation (Japan). This device features a high-precision force sensor, and its mechanical arm is equipped with a specially designed clamp blade to facilitate swift replacement of experimental instruments without compromising the data reading accuracy of the sensor during experimentation. The experimental setup, as depicted in Figure 3, was utilized as the test platform. Its operation involved controlling the clamp blades with varying contact areas to achieve a uniform compression at low speed until reaching the predetermined compression depth. Subsequently, compression was halted and the force sensor reading was recorded.

Figure 3. Mechanical testing machine.


3.2. Experimental material

Considering the influence of the contact area between the clamp blade and soft tissue during surgical procedures, three indenters with different contact areas were designed to simulate various operating scenarios. The indenters, shown in Figure 4, have contact areas of 45.76, 29.29, and 16.47 mm², respectively. Detailed dimensions of the indenters are provided in Table 1. This variation allows the effect of contact area on pressure to be assessed.

Figure 4. Physical drawing of the indenter.


Table 1. Indenter design size.

Type	Scale	Profile size (mm)	Slot size (mm)	Contact area (mm²)	Gear height (mm)	Gear angle (°)
YT1	1	15.12 × 5.0	12.60 × 2.40	45.76	0.2	90
YT2	0.8	12.16 × 4.0	10.08 × 1.92	29.29
YT3	0.6	9.12 × 3.0	7.56 × 1.44	16.47
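As a quick consistency check on Table 1, the listed contact areas agree with scaling the YT1 area by the square of the scale factor, which is what one would expect if all planar dimensions are scaled uniformly. The sketch below only reproduces this arithmetic from the table values; the variable names are illustrative.

```python
# Contact area of the full-scale indenter YT1, from Table 1 (mm^2)
base_area = 45.76
scales = {"YT1": 1.0, "YT2": 0.8, "YT3": 0.6}
listed = {"YT1": 45.76, "YT2": 29.29, "YT3": 16.47}

# Area scales with the square of the linear scale factor
scaled = {name: base_area * s ** 2 for name, s in scales.items()}
for name in scales:
    print(name, round(scaled[name], 2), listed[name])
```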


3.3. Experimental sample

The pig kidney was selected as the experimental sample to simulate laparoscopic surgery due to its high availability and similar tissue characteristics to human abdominal organs. Specifically, pig kidneys offer an ideal model for simulating urinary system surgery. Following removal from pigs, the samples were cut into standardized chunks (30 ± 1 mm × 15 ± 1 mm, thickness 8 ± 1 mm) and stored in portable coolers to maintain freshness and biomechanical stability. The experiment commenced within 5 hours after removal to ensure preservation of the tissue's biomechanical properties.

3.4. Experimental parameters

The experimental parameters encompass compression depth and loading speed to replicate the clamping procedure in real surgical scenarios.

1) Compression depth: To prevent direct contact between the forceps blade and the support platform beneath the tissue sample, which could overload the sensor, the compression depth was kept strictly below the thickness of the sample. Given that the prepared pig kidney tissue samples were approximately 8 mm thick, a maximum compression depth of 3 mm was set to ensure test safety and data reliability.

2) Loading speed: The study revealed that the average clamping velocity of the laparoscopic clamp on soft tissue during the operation ranged from 1.2 to 2.4 mm/s ^[34,35,36]. To accurately simulate surgical conditions, forceps blades were subjected to four different loading speeds in this experiment: 0.5, 1, 1.5, and 2 mm/s, aiming to investigate the impact of varying speeds on compression efficacy.

3.5. Experimental process

To keep the biological properties of the soft tissue stable throughout the experiment, the temperature was controlled at 24 ± 2 ℃ and the humidity at 45–55%. Before the experiment, the sample was taken out of the cooler and allowed to equilibrate to room temperature. The clamp blade indenter was then attached to the sensor of the mechanical testing machine and aligned with the sample on the test platform. Compression experiments were performed at four different loading speeds, each compression reaching its predetermined maximum depth, while the software integrated into the testing machine monitored and recorded the force in real time. Figure 5 shows pig kidney tissue being compressed by the forceps blade.

Figure 5. Experimental setup for compressing pig kidney tissue using a clamping device with leaf-shaped jaws.


4. Results and algorithm evaluation

4.1. Analysis of compression experiment results

Based on the obtained experimental data on compression, a preliminary analysis was conducted on the mechanical properties of pig kidney tissue. The temporal pressure curve is depicted in Figure 6.

Figure 6. Pressure curve over time.


As illustrated in Figure 6, the response of pig kidney tissue under static load can be divided into two distinct stages. In the first stage, from O to $G_0$, the pressure increases nonlinearly over time until it reaches its peak value at $G_0$. This phase of rapid growth reflects the immediate elastic response of the tissue and is referred to as the stress response phase. In the second stage, from $G_0$ to $G_1$, the pressure gradually decreases over time and eventually approaches a stable state, indicating that the tissue relaxes under sustained compression. This stage reflects the gradual adaptation of the material to the applied load and is characteristic of stress relaxation. Although relaxation can theoretically continue for several hours until the tissue is completely relaxed, this study focuses on its initial phase, i.e., the stress variation over a brief time period. During the experiment, the relaxation stage lasted less than 60 seconds, yet a marked relaxation effect was already observed.

This study focuses on modeling and optimizing the stress segment that exhibits hyperelastic behavior during compression, i.e., the stress response phase. The influence of experimental parameters, such as the compression speed and the contact area of the indenter, on the pressure applied during interaction with soft tissue is investigated and analyzed. A mechanical model built from the collected experimental data is then used to describe and predict the behavior of soft tissue under compressive forces.

4.1.1. Influence analysis of the contact area of indenter

During the compression experiment, precise control of indenter movement speed was achieved using computer software. The compression speeds were set at 0.5, 1, 1.5, and 2 mm/s to simulate the compression of diverse biological tissues with different contact areas. Each indenter continuously compressed pig kidney tissue at its corresponding speed until reaching a preset compression depth, after which the compression was halted to record data. This allowed for obtaining the relationship between pressure exerted by three distinct forceps indenters on pig kidney tissue and compression depth under various loading speeds, as depicted in Figure 7.

Figure 7. Force-compression depth curves for different indenters.


Figure 7 shows the relationship between the pressure exerted by clamp indenters with different contact areas and the compression depth at constant speed. The pressure increases consistently with the contact area of the indenter: larger contact areas result in higher peak pressures applied to the pig kidney tissue. This can be attributed to the larger volume of tissue deformed during compression when the contact area is larger, which leads to greater strain. Given the viscoelastic properties of pig kidney tissue, this translates into a substantially higher pressure for an indenter with a larger contact area at the same compression depth.

4.1.2. Influence analysis of pressing speed

To accurately assess the influence of pressing speed on compression force, we maintained a constant contact area for the indenter while varying the compression speed, as depicted in Figure 8.

Figure 8. Force-compression depth curves at different loading speeds.


The pressure-compression depth curves in Figure 8 show that, under otherwise identical conditions, a higher compression speed strengthens the interaction between the indenter and the pig kidney tissue and consequently raises the peak force. Higher compression speeds increase the impact force exerted by the indenter on the tissue, producing more pronounced deformation within a shorter time and therefore a higher pressure.

Through comparative analysis of the data presented in Figures 7 and 8, it was observed that porcine kidney tissue exhibited similar rates of force change when pressure was applied by indenters with different contact areas at a fixed compression speed. However, altering the compression speed while maintaining a constant contact area resulted in significant differences in the rate of force change. This finding suggests that during the force-response phase of the compression experiment, the impact of compression speed on the force experienced by pig kidney tissue is more critical than that of contact area size.
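One way to quantify the "rate of force change" compared above is the average slope of a force-compression depth curve. The sketch below assumes the rate is taken with respect to depth (the text does not say whether depth or time is meant), and the quadratic curves and their coefficients are made-up placeholders rather than the measured data.

```python
import numpy as np

def mean_force_rate(depth, force):
    """Average slope dF/dd of a force-depth curve (N per mm)."""
    depth = np.asarray(depth, dtype=float)
    force = np.asarray(force, dtype=float)
    return float(np.mean(np.gradient(force, depth)))

# Hypothetical curves keyed by compression speed in mm/s.
# The coefficients are illustrative assumptions, not fitted values.
depth = np.linspace(0.0, 4.0, 41)
curves = {
    0.5: 0.10 * depth**2,
    2.0: 0.25 * depth**2,
}

rates = {speed: mean_force_rate(depth, f) for speed, f in curves.items()}
for speed, rate in rates.items():
    print(f"{speed} mm/s: mean dF/dd = {rate:.3f} N/mm")
```

Applying the same function to curves grouped by contact area instead of speed would give the second half of the comparison.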

4.2. Analysis of model prediction results

In this study, to assess the prediction efficiency of the SSA-BP neural network model, we employed two classical neural network prediction models for comparison: the standard BP neural network and the GA-BP (Genetic Algorithm optimized BP) neural network. All models were evaluated on an identical dataset to ensure impartiality.

As a model capable of mapping multi-dimensional functions, the BP neural network is well suited to analyzing complex relationships. In our setup, the input layer comprises three neurons corresponding to the compression depth, the compression speed, and the contact area of the indenter. The number of neurons in the hidden layers was determined using an empirical formula, resulting in a final network structure of 3-10-10-1. The learning rate is set to 0.0001, the target training error to 10e-8, and the maximum number of iterations to 5000. The tansig function is used as the transfer function between the input layer and the hidden layers, and the purelin function as the transfer function between the hidden layer and the output layer. The network is trained with the traingdx algorithm.
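The 3-10-10-1 structure can be sketched as a plain NumPy forward pass. This is a minimal illustration under stated assumptions: the weights are random and untrained, input normalization and the training loop are omitted, and the names `init_params` and `forward` are hypothetical. MATLAB's tansig is mathematically equivalent to tanh, and purelin is the identity.

```python
import numpy as np

LAYER_SIZES = [3, 10, 10, 1]  # depth, speed, contact area -> force

def init_params(rng, sizes=LAYER_SIZES):
    """Random (untrained) weights and zero biases for each layer."""
    return [
        (rng.normal(0.0, 0.5, size=(n_out, n_in)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]

def forward(params, x):
    """Forward pass: tanh (tansig) hidden layers, linear (purelin) output."""
    a = np.atleast_2d(x).astype(float)   # shape (n_samples, 3)
    for W, b in params[:-1]:
        a = np.tanh(a @ W.T + b)         # hidden layers
    W, b = params[-1]
    return (a @ W.T + b).ravel()         # one predicted force per sample

rng = np.random.default_rng(0)
params = init_params(rng)

# Two hypothetical samples: [depth (mm), speed (mm/s), contact area (mm^2)].
x = np.array([[1.0, 0.5, 20.0],
              [2.0, 2.0, 40.0]])
print(forward(params, x).shape)  # (2,)
```

With these layer sizes the network has 161 trainable parameters, which is the vector an optimizer such as GA or SSA would search over when it initializes the weights and thresholds.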

The GA-BP neural network models employ genetic algorithms to optimize the initial weights and thresholds of the network. Key parameters encompass a population size of 45 individuals, a maximum number of genetic iterations set at 55, a generation gap value of 0.85, a crossover probability of 0.6, and a mutation probability amounting to 0.022.

The SSA-BP neural network model employs the SSA for optimizing the network parameters. The parameter configuration of the model includes a population size of 20, a maximum iteration count of 30, a discoverer's share in the population set at 15%, and a maximum safety threshold limited to 0.7.

The evolution of the root mean square error (RMSE) over the training iterations was compared for the three models, with the results presented in Figures 9–11. The error curve of the SSA-BP model (Figure 11) shows that it achieves the lowest RMSE on both the validation and test sets at the 36th iteration, indicating better convergence than the GA-BP and standard BP models. The error curve of the GA-BP model, in contrast, tends to plateau after 15 iterations, suggesting convergence towards a local optimum. These results indicate that SSA optimization speeds up the training of the BP neural network and converges rapidly to lower error levels.

Figure 9. Error curve based on the BP model.


Figure 10. Error curve based on the GA-BP model.


Figure 11. Error curve based on the SSA-BP model.


To evaluate the training efficacy of the prediction models, we compare the values predicted for the experimental samples with the measured data. Figure 12 shows the test results obtained with the SSA-BP, GA-BP, and BP neural network models. Optimizing the BP neural network with SSA significantly improves its predictive performance, giving a higher correlation between predictions and measured data than the GA-BP model. The absolute error (AE) of the SSA-BP, GA-BP, and BP neural networks is compared in Figure 13. The absolute error of the unoptimized BP network is below 0.15 N, that of the GA-optimized BP network is below 0.1 N, and that of the SSA-optimized BP network is below 0.05 N. These results demonstrate that the SSA-BP neural network model has excellent predictive performance and meets the desired target.

Figure 12. Comparison of pressure prediction results.


Figure 13. Comparison of pressure prediction error.


To ensure the reliability of the simulation results, the three prediction models were compared on four error indexes: RMSE, MSE, MAE, and SSE. Lower values of these indexes indicate better predictive performance and higher model accuracy. As shown in Figure 14, the GA-BP and SSA-BP neural network models improve significantly on the standard BP model in all error indexes, with the SSA-BP model showing the largest reduction in error. The coefficient of determination (R²) measures how well a model fits a given dataset. The SSA-BP neural network has the highest R² value, indicating the best fit to the data. Consequently, the SSA-BP model has better predictive capability than the other two models.
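For reference, the five indexes named above can be computed from measured and predicted forces as in the following sketch. It uses the standard textbook definitions; the function name `regression_metrics` and the sample values are illustrative assumptions, not data from the experiments.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return SSE, MSE, RMSE, MAE and R^2 for paired observations."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    sse = float(np.sum(err**2))                         # sum of squared errors
    mse = sse / err.size                                # mean squared error
    rmse = mse**0.5                                     # root mean squared error
    mae = float(np.mean(np.abs(err)))                   # mean absolute error
    sst = float(np.sum((y_true - y_true.mean()) ** 2))  # total sum of squares
    r2 = 1.0 - sse / sst                                # coefficient of determination
    return {"SSE": sse, "MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r2}

# Made-up forces in newtons, for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
metrics = regression_metrics(measured, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that SSE grows with the number of samples while MSE, RMSE, and MAE do not, so SSE values are only comparable between models evaluated on the same test set.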

Figure 14. Comparison of evaluation results.


5. Conclusions

The present study aims to investigate the force feedback in minimally invasive surgery, with a specific focus on the intricate interaction between surgical instruments and delicate biological soft tissues. In order to address the issue of inadequate clamping force feedback mechanisms in existing minimally invasive surgical robot systems, this study proposes a clamping force detection scheme based on clamp blade parameters. Through a series of compression tests conducted on isolated pig kidney tissue using a universal test machine, the relationship between pressure and compression depth, compression speed, and contact area of the indenter is thoroughly analyzed, leading to the establishment of a mechanical model. This model utilizes an SSA-optimized BP neural network for accurate estimation and prediction of clamping force during surgery. A comparative analysis with traditional BP models and GA-BP models reveals that the SSA-BP model outperforms both in terms of prediction accuracy and convergence speed. Notably, when considering key performance indicators such as the RMSE, MSE, MAE, and SSE, the SSA-BP model demonstrates superior predictive ability and higher degree of fit.

In summary, the main conclusions of this study are as follows:

1) The key factors influencing the clamping force of surgical instruments were determined through rigorous experimentation, providing a solid empirical foundation for the development of mechanical models.

2) The proposed SSA-BP neural network model predicts the clamping force more accurately, and converges faster, than the standard BP and GA-BP models.

3) Simulation results validate the potential of the SSA-BP model in significantly improving the precision of clamping force prediction within minimally invasive surgical robot systems.

Future work will primarily focus on enhancing the model's generalization capabilities and validating it across a broader range of biological soft tissues. Furthermore, there will be an exploration of integrating this predictive model into existing minimally invasive surgical robot systems to enhance their force-feedback mechanisms and improve overall surgical safety and efficiency.

Use of AI tools declaration

The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

Acknowledgments

This work was supported by the National Natural Science Foundation of China under grant No. 52305066.

Conflict of interest

The authors declare there is no conflict of interest.

References

[1]	J. Yuh, Design and Control of Autonomous Underwater Robots: A Survey, Auton. Robot., 8 (2000), 7–24. doi: 10.1023/A:1008984701078. doi: 10.1023/A:1008984701078
[2]	K. H. Low, Maneuvering of biomimetic fish by integrating a bouyancy body with modular undulating fins, Int. J. Humanoid Robot., 4 (2007), 671–695. doi: 10.1142/S0219843607001217. doi: 10.1142/S0219843607001217
[3]	C. Ren, X. Zhi, Y. Pu, F. Zhang, A multi-scale UAV image matching method applied to large-scale landslide reconstruction, Math. Biosci. Eng., 18 (2021), 2274–2287. doi: 10.3934/MBE.2021115. doi: 10.3934/MBE.2021115
[4]	C. I. Sprague, O. Ozkahraman, A. Munafo, R. Marlow, A. Phillips, P. Ogren, Improving the Modularity of AUV Control Systems using Behaviour Trees, AUV 2018 - 2018 IEEE/OES Auton. Underw. Veh. Work. Proc., Nov. 2018, doi: 10.1109/AUV.2018.8729810.
[5]	G. Ferri, A. Munafo, K. D. LePage, An Autonomous Underwater Vehicle Data-Driven Control Strategy for Target Tracking, IEEE J. Ocean. Eng., 43 (2018), 323–343. doi: 10.1109/JOE.2018.2797558. doi: 10.1109/JOE.2018.2797558
[6]	G. Salavasidis, A. Munafò, C. A. Harris, T. Prampart, R. Templeton, M. Smart, et al., Terrain-aided navigation for long-endurance and deep-rated autonomous underwater vehicles, J. F. Robot., 36 (2019), 447–474. doi: 10.1002/ROB.21832. doi: 10.1002/ROB.21832
[7]	W. Zhao, Y. Hu, L. Wang, Construction and Central Pattern Generator-Based Control of a Flipper-Actuated Turtle-Like Underwater Robot, Adv. Robot., 23 (2009), 19–43. doi: 10.1163/156855308X392663. doi: 10.1163/156855308X392663
[8]	C. Zhou, K. H. Low, Kinematic modeling framework for biomimetic undulatory fin motion based on coupled nonlinear oscillators, in 2010 IEEE/RSJ Int. Conf. Intel. Robots Syst., 2010,934–939. doi: 10.1109/IROS.2010.5651162.
[9]	J. Yu, K. Wang, M. Tan, J. Zhang, Design and control of an embedded vision guided robotic fish with multiple control surfaces, Sci. World J., 2014 (2014), 631296. doi: 10.1155/2014/631296. doi: 10.1155/2014/631296
[10]	A. J. Ijspeert, A. Crespi, Online trajectory generation in an amphibious snake robot using a lamprey-like central pattern generator model, Proc. - IEEE Int. Conf. Robot. Autom., (2007), 262–268. doi: 10.1109/ROBOT.2007.363797. doi: 10.1109/ROBOT.2007.363797
[11]	D. Korkmaz, G. Ozmen Koca, G. Li, C. Bal, M. Ay, Z. H. Akpolat, Locomotion control of a biomimetic robotic fish based on closed loop sensory feedback CPG model, J. Mar. Eng. Technol., 20 (2021), 125–137. doi: 10.1080/20464177.2019.1638703. doi: 10.1080/20464177.2019.1638703
[12]	J.-K. Ryu, N. Chong, B.-J. You, H. Christensen, Locomotion of snake-like robots using adaptive neural oscillators, Intell. Serv. Robot., 3 (2009), 1–10. doi: 10.1007/s11370-009-0049-4. doi: 10.1007/s11370-009-0049-4
[13]	M. Ikeda, K. Watanabe, I. Nagai, Propulsion movement control using CPG for a Manta robot, in The 6th Int. Conf. Soft Comput. Intel. Syst., and The 13th Int. Sympo. on Adv. Intel. Syst., 2012,755–758. doi: 10.1109/SCIS-ISIS.2012.6505174.
[14]	L. Shang, S. Wang, M. Tan, Fuzzy Logic PID Based Control Design for a Biomimetic Underwater Vehicle with Two Undulating Long-fins, in India Conf. (INDICON) 2015 Annual IEEE, 2015, 1–6.
[15]	J. Zhang, Multimodal swimming control of a robotic fish with pectoral fins using a CPG network, Chinese Sci. Bull., 57 (2012), 1209–1216.
[16]	K. Inoue, S. Ma, C. Jin, Neural oscillator network-based controller for meandering locomotion of snake-like robots, in IEEE Int. Conf. Robot. Autom., 2004. Proc.. ICRA '04. 2004, 5 (2004), 5064–5069. doi: 10.1109/ROBOT.2004.1302520.
[17]	C. Zhou, Modeling and control of swimming gaits for fish-like robots using coupled nonlinear oscillators, Nanyang Technological University, 2012.
[18]	V. D. Nguyen, D. K. Phan, C. A. T. Pham, D. H. Kim, V. T. Dinh, T. T. Nguyen, Study on Determining the Number of Fin-Rays of a Gymnotiform Undulating Fin Robot, Lect. Notes Electr. Eng., 465 (2018), 745–752. doi: 10.1007/978-3-319-69814-4_72. doi: 10.1007/978-3-319-69814-4_72
[19]	X. Dong, S. Wang, Z. Cao, M. Tan, CPG Based Motion Control for an Underwater Thruster with Undulating Long-Fin, IFAC Proc. Vol., 41 (2008), 5433–5438. doi: 10.3182/20080706-5-KR-1001.00916. doi: 10.3182/20080706-5-KR-1001.00916
[20]	A. Crespi, D. Lachat, A. Pasquier, A. J. Ijspeert, Controlling swimming and crawling in a fish robot using a central pattern generator, Auton. Robots, 25 (2008), 3–13. doi: 10.1007/s10514-007-9071-6. doi: 10.1007/s10514-007-9071-6
[21]	M. Sfakiotakis, A. Manolis, N. Spyridakis, J. Fasoulas, M. Arapis, Development and Experimental Evaluation of an Undulatory Fin Prototype, in Proceedings of the RAAD 2013 22nd Int. Workshop on Robot. Alpe-Adria-Danube Region, 2013, no. May 2014, 1–8.
[22]	M. Sfakiotakis, R. Gliva, M. Mountoufaris, Steering-plane motion control for an underwater robot with a pair of undulatory fin propulsors, in 2016 24th Mediterranean Conf. Control Autom. (MED), 2016,496–503, doi: 10.1109/MED.2016.7535989.
[23]	V. H. Nguyen, V. D. Nguyen, V. T. Duong, H. H. Nguyen, T. T. Nguyen, Experimental Study on Kinematic Parameter and Undulating Pattern Influencing Thrust Performance of Biomimetic Underwater Undulating Driven Propulsor, Int. J. Mech. Mechatronics Eng., 20 (2020), 7.
[24]	W. Zhao, J. Yu, Y. Fang, L. Wang, Development of Multi-mode Biomimetic Robotic Fish Based on Central Pattern Generator, 2006 IEEE/RSJ Int. Conf. Intell. Robot. Syst., 2006, doi: 10.1109/IROS.2006.281800. doi: 10.1109/IROS.2006.281800
[25]	X. Wu, S. Ma, CPG-based control of serpentine locomotion of a snake-like robot, Mechatronics, 20 (2010), 326–334. doi: 10.1016/j.mechatronics.2010.01.006. doi: 10.1016/j.mechatronics.2010.01.006
[26]	R. Gliva, M. Mountoufaris, N. Spyridakis, M. Sfakiotakis, Development of a Bio-Inspired Underwater Robot Prototype with Undulatory Fin Propulsion, in 9th Int. Conf. on New Horiz. Ind. Bus. Edu. (NHIBE'15), 2015, 1–6.
[27]	Z. Lu, S. Ma, B. Li, Y. Wang, 3D Locomotion of a Snake-like Robot Controlled by Cyclic Inhibitory CPG Model, 2006 IEEE/RSJ Int. Conf. Intell. Robot. Syst., 2006, doi: 10.1109/IROS.2006.281801. doi: 10.1109/IROS.2006.281801
[28]	M. Wang, J. Yu, M. Tan, G. Zhang, A CPG-based sensory feedback control method for robotic fish locomotion, in Proceedings of the 30th Chinese Control Conf., 2011, 4115–4120.
[29]	C. Zhou, K. H. Low, On-line Optimization of Biomimetic Undulatory Swimming by an Experiment-based Approach, J. Bionic. Eng., 11 (2014), 213–225. doi: 10.1016/S1672-6529(14)60042-1. doi: 10.1016/S1672-6529(14)60042-1
[30]	M. Sfakiotakis, J. Fasoulas, R. Gliva, A. Yannakoudakis, Model-based fin ray joint tracking control for undulatory fin mechanisms, Int. Congr. Ultra Mod. Telecommun. Control Syst. Work., 2016 (2016), 158–165. doi: 10.1109/ICUMT.2015.7382421. doi: 10.1109/ICUMT.2015.7382421
[31]	C. Zhou and K. H. Low, Design and locomotion control of a biomimetic underwater vehicle with fin propulsion, IEEE/ASME Trans. Mechatronics, 17 (2012), 25–35. doi: 10.1109/TMECH.2011.2175004. doi: 10.1109/TMECH.2011.2175004
[32]	M. Sfakiotakis, J. Fasoulas, M. M. Kavoussanos, M. Arapis, Experimental investigation and propulsion control for a bio-inspired robotic undulatory fin, Robotica, 33 (2015), 1062–1084. doi: 10.1017/S0263574714002926. doi: 10.1017/S0263574714002926
[33]	P. M. Özturan, A. Bozanta, B. Basarir-Ozel, E. Akar, M. Coşkun, A roadmap for an integrated university information system based on connectivity issues: Case of Turkey, Int. J. Manag. Sci. Inf. Technol., 17 (2015), 1–23. doi: 10.14313/JAMRIS. doi: 10.14313/JAMRIS
[34]	K. H. Low, A. Willy, Biomimetic motion planning of an undulating robotic fish fin, JVC/Journal Vib. Control, 12 (2006), 1337–1359. doi: 10.1177/1077546306070597. doi: 10.1177/1077546306070597
[35]	R. Ruiz-Torres, O. M. Curet, G. V. Lauder, M. A. Maciver, Erratum: Kinematics of the ribbon fin in hovering and swimming of the electric ghost knifefish (Journal of Experimental Biology 216, (823-834)), J. Exp. Biol., 217 (2014), 3765–3766. doi: 10.1242/jeb.113670. doi: 10.1242/jeb.113670
[36]	K. H. Low, Modelling and parametric study of modular undulating fin rays for fish robots, Mech. Mach. Theory, 44 (2009), 615–632. doi: 10.1016/j.mechmachtheory.2008.11.009. doi: 10.1016/j.mechmachtheory.2008.11.009
[37]	I. English, H. Liu, O. M. Curet, Robotic device shows lack of momentum enhancement for gymnotiform swimmers, Bioinspir. Biomim., 14 (2019), 024001. doi: 10.1088/1748-3190/aaf983. doi: 10.1088/1748-3190/aaf983
[38]	I. D. Neveln, R. Bale, A. P. S. Bhalla, O. M. Curet, N. A. Patankar, M. A. MacIver, Undulating fins produce off-axis thrust and flow structures, J. Exp. Biol., 217 (2014), 201–213. doi: 10.1242/jeb.091520. doi: 10.1242/jeb.091520
[39]	M. Ikeda, S. Hikasa, K. Watanabe, I. Nagai, A CPG design of considering the attitude for the propulsion control of a Manta robot, in IECON 2013 - 39th Ann. Conf. IEEE Ind. Electron. Soc., 2013, 6354–6358. doi: 10.1109/IECON.2013.6700181.
[40]	C. Liu, Q. Chen, D. Wang, CPG-inspired workspace trajectory generation and adaptive locomotion control for quadruped robots, IEEE Trans. Syst. man, Cybern. Part B, Cybern. a Publ. IEEE Syst. Man, Cybern. Soc., 41 (2011), 867–880. doi: 10.1109/TSMCB.2010.2097589. doi: 10.1109/TSMCB.2010.2097589
[41]	C. M. A. Pinto, D. Rocha, C. P. Santos, Hexapod robots: New CPG model for generation of trajectories, J. Numer. Anal. Ind. Appl. Math., 7 (2012), 15–26.
[42]	T. Wang, W. Guo, M. Li, F. Zha, L. Sun, CPG Control for Biped Hopping Robot in Unpredictable Environment, J. Bionic Eng., 9 (2012), 29–38. doi: 10.1016/S1672-6529(11)60094-2. doi: 10.1016/S1672-6529(11)60094-2
[43]	S. Inagaki, H. Yuasa, T. Arai, CPG model for autonomous decentralized multi-legged robot system—generation and transition of oscillation patterns and dynamics of oscillators, Rob. Auton. Syst., 44 (2003), 171–179. doi: 10.1016/S0921-8890(03)00067-8. doi: 10.1016/S0921-8890(03)00067-8
[44]	M. Mokhtari, M. Taghizadeh, M. Mazare, Hybrid Adaptive Robust Control Based on CPG and ZMP for a Lower Limb Exoskeleton, Robotica, 39 (2021), 181–199. doi: 10.1017/S0263574720000260. doi: 10.1017/S0263574720000260
[45]	X. Wu, L. Teng, W. Chen, G. Ren, Y. Jin, H. Li, CPGs with continuous adjustment of phase difference for locomotion control, Int. J. Adv. Robot. Syst., 10 (2013), 1–13. doi: 10.5772/56490. doi: 10.5772/56490
[46]	Y. Cao, Y. Lu, Y. Cai, S. Bi, G. Pan, CPG-fuzzy-based control of a cownose-ray-like fish robot, Ind. Robot Int. J. Robot. Res. Appl., 46 (2019), 779–791. doi: 10.1108/IR-02-2019-0029. doi: 10.1108/IR-02-2019-0029
[47]	I. B. Jeong, C. S. Park, K. I. Na, S. Han, J. H. Kim, Particle swarm optimization-based central patter generator for robotic fish locomotion, 2011 IEEE Congr. Evol. Comput. CEC 2011, (2011), 152–157, doi: 10.1109/CEC.2011.5949612. doi: 10.1109/CEC.2011.5949612
[48]	M. C. Chen Wang, G. Xie, L. Wang, CPG-based locomotion control of a robotic fish: Using linear oscillators and reducing control parameters via PSO, Int. J. Innov. Comput. Inf. Control, 7 (2011), 4237–4249.
[49]	J. Yu, Z. Wu, M. Wang, M. Tan, CPG Network Optimization for a Biomimetic Robotic Fish via PSO, IEEE Trans. Neural Networks Learn. Syst., 27 (2016), 1962–1968. doi: 10.1109/TNNLS.2015.2459913. doi: 10.1109/TNNLS.2015.2459913
[50]	J. Lee, S. Lee, S. Chang, B.-H. Ahn, A Comparison of GA and PSO for Excess Return Evaluation in Stock Markets, Lect. Notes Comput. Sci., 3562 (2005), 221–230. doi: 10.1007/11499305_23. doi: 10.1007/11499305_23
[51]	C. Niehaus, T. Röfer, T. Laue, Gait Optimization on a Humanoid Robot using Particle Swarm Optimization, 2007.
[52]	Y. Zou, T. Liu, D. Liu, F. Sun, Reinforcement learning-based real-time energy management for a hybrid tracked vehicle, Appl. Energy, 171 (2016), 372–382. doi: 10.1016/j.apenergy.2016.03.082. doi: 10.1016/j.apenergy.2016.03.082
[53]	T. Liu, Y. Zou, D. Liu, F. Sun, Reinforcement learning-based energy management strategy for a hybrid electric tracked vehicle, Energies, 8 (2015), 7243–7260. doi: 10.3390/en8077243. doi: 10.3390/en8077243
[54]	R. C. Hsu, C. T. Liu, D. Y. Chan, A reinforcement-learning-based assisted power management with QoR provisioning for human-electric hybrid bicycle, IEEE Trans. Ind. Electron., 59 (2012), 3350–3359. doi: 10.1109/TIE.2011.2141092. doi: 10.1109/TIE.2011.2141092
[55]	H. Lee, C. Kang, Y. Il Park, N. Kim, S. W. Cha, Online data-driven energy management of a hybrid electric vehicle using model-based Q-learning, IEEE Access, 8 (2020), 84444–84454. doi: 10.1109/ACCESS.2020.2992062. doi: 10.1109/ACCESS.2020.2992062
[56]	T. Liu, X. H, S. E. Li, D. Cao, Reinforcement Learning Optimized Look-Ahead Energy Management of a Parallel Hybrid Electric Vehicle, IEEE/ASME Trans. Mechatronics, 22 (2017), 1497–1507. doi: 10.1109/TMECH.2017.2707338. doi: 10.1109/TMECH.2017.2707338
[57]	Y. Lu, R. He, X. Chen, B. Lin, C. Yu, Energy-efficient depth-based opportunistic routing with q-learning for underwater wireless sensor networks, Sensors (Switzerland), 20 (2020), 1–25. doi: 10.3390/s20041025. doi: 10.3390/s20041025
[58]	R. Plate, C. Wakayama, Utilizing kinematics and selective sweeping in reinforcement learning-based routing algorithms for underwater networks, Ad Hoc Networks, 34 (2015), 105–120. doi: 10.1016/j.adhoc.2014.09.012. doi: 10.1016/j.adhoc.2014.09.012
[59]	Y. He, L. Xing, Y. Chen, W. Pedrycz, L. Wang, G. Wu, A Generic Markov Decision Process Model and Reinforcement Learning Method for Scheduling Agile Earth Observation Satellites, IEEE Trans. Syst. Man. Cybern. Syst., 1–12, 2020. doi: 10.1109/tsmc.2020.3020732. doi: 10.1109/tsmc.2020.3020732
[60]	Z. Jin, Y. Ma, Y. Su, S. Li, X. Fu, A Q-learning-based delay-aware routing algorithm to extend the lifetime of underwater sensor networks, Sensors (Switzerland), 17 (2017), 1–15. doi: 10.3390/s17071660. doi: 10.3390/s17071660
[61]	D. Zhang, Z. H. Ye, P. C. Chen, Q. G. Wang, Intelligent event-based output feedback control with Q-learning for unmanned marine vehicle systems, Control Eng. Pract., 105 (2020), 104616. doi: 10.1016/j.conengprac.2020.104616. doi: 10.1016/j.conengprac.2020.104616
[62]	Z. Chen, B. Qin, M. Sun, Q. Sun, Q-Learning-based parameters adaptive algorithm for active disturbance rejection control and its application to ship course control, Neurocomputing, 408 (2020), 51–63. doi: 10.1016/j.neucom.2019.10.060. doi: 10.1016/j.neucom.2019.10.060
[63]	Y. Nakamura, T. Mori, S. Ishii, Natural Policy Gradient Reinforcement Learning for a CPG Control of a Biped Robot, Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), 3242 (2004), 972–981. doi: 10.1007/978-3-540-30217-9_98. doi: 10.1007/978-3-540-30217-9_98
[64]	T. Mori, Y. Nakamura, M. A. Sato, S. Ishii, Reinforcement learning for a CPG-driven biped robot, Proc. Natl. Conf. Artif. Intell., (2004), 623–630.

© 2022 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)