Reinforcement learning-based optimization of locomotion controller using multiple coupled CPG oscillators for elongated undulating fin propulsion

Van Dong Nguyen; Dinh Quoc Vo; Van Tu Duong; Huy Hung Nguyen; Tan Tien Nguyen

doi:10.3934/mbe.2022033

Mathematical Biosciences and Engineering

2022, Volume 19, Issue 1: 738-758. doi: 10.3934/mbe.2022033


Research article

1. Faculty of Mechanical Engineering, Ho Chi Minh City University of Technology (HCMUT), 268 Ly Thuong Kiet, District 10, Ho Chi Minh City, Vietnam
2. Vietnam National University Ho Chi Minh City, Linh Trung Ward, Thu Duc District, Ho Chi Minh City, Vietnam
3. National Key Laboratory of Digital Control and System Engineering (DCSELab), HCMUT, 268 Ly Thuong Kiet, District 10, Ho Chi Minh City, Vietnam
4. Faculty of Electronics and Telecommunication, Saigon University, Vietnam

Received: 10 August 2021; Accepted: 08 November 2021; Published: 19 November 2021

Abstract: This article proposes a locomotion controller, inspired by the black knifefish, for an elongated undulating fin robot. The controller is built on a modified central pattern generator (CPG) network of sixteen coupled Hopf oscillators that uses the angle of each fin-ray as feedback. The convergence rate of the modified CPG network is optimized by a reinforcement learning algorithm. With the proposed controller, the robot can switch between swimming patterns naturally. The controller also allows the swimming pattern parameters, namely the amplitude envelope and the oscillatory frequency, to be configured to produce various swimming patterns. The implementation of the reinforcement learning-based optimization is discussed. Simulation and experimental results demonstrate the capability and effectiveness of the controller across several swimming patterns with varying oscillatory frequency and amplitude envelope of each fin-ray.
Keywords:
- reinforcement learning
- undulating fin
- biomimetic robot
- Hopf oscillator
Citation: Van Dong Nguyen, Dinh Quoc Vo, Van Tu Duong, Huy Hung Nguyen, Tan Tien Nguyen. Reinforcement learning-based optimization of locomotion controller using multiple coupled CPG oscillators for elongated undulating fin propulsion[J]. Mathematical Biosciences and Engineering, 2022, 19(1): 738-758. doi: 10.3934/mbe.2022033
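
The abstract describes a CPG network of sixteen coupled Hopf oscillators whose amplitude envelope and oscillatory frequency can be configured. The following is a minimal Python sketch of that general idea only, not the article's modified network: it omits the fin-ray angle feedback and the reinforcement learning step. The chain topology, the diffusive coupling rule, the gains `k` and `eps`, the linear amplitude envelope, and the phase lag are all hypothetical choices made for illustration.

```python
import cmath
import math

N = 16  # number of oscillators, one per fin-ray (count taken from the abstract)


def derivative(z, amps, omega, k, eps, lag):
    """Time derivative of the complex states z[i] = x_i + 1j * y_i."""
    dz = []
    for i, zi in enumerate(z):
        # Hopf term: limit cycle of radius amps[i] traversed at angular speed omega.
        d = (k * (amps[i] ** 2 - abs(zi) ** 2) + 1j * omega) * zi
        if i > 0:
            # Assumed coupling: pull oscillator i toward a rescaled copy of its
            # predecessor delayed by `lag`; the term vanishes once they are locked.
            target = (amps[i] / amps[i - 1]) * cmath.exp(-1j * lag) * z[i - 1]
            d += eps * (target - zi)
        dz.append(d)
    return dz


def rk4_step(z, dt, *params):
    """One classical fourth-order Runge-Kutta step for the whole chain."""
    k1 = derivative(z, *params)
    k2 = derivative([a + 0.5 * dt * b for a, b in zip(z, k1)], *params)
    k3 = derivative([a + 0.5 * dt * b for a, b in zip(z, k2)], *params)
    k4 = derivative([a + dt * b for a, b in zip(z, k3)], *params)
    return [a + dt / 6.0 * (b + 2 * c + 2 * d + e)
            for a, b, c, d, e in zip(z, k1, k2, k3, k4)]


def simulate(amps, freq_hz, lag, t_end=20.0, dt=0.002, k=10.0, eps=5.0):
    """Integrate the chain from a small nonzero state and return the final states."""
    z = [0.05 + 0j] * len(amps)  # nonzero start: the origin is an equilibrium
    params = (amps, 2 * math.pi * freq_hz, k, eps, lag)
    for _ in range(int(round(t_end / dt))):
        z = rk4_step(z, dt, *params)
    return z


# Hypothetical settings: amplitudes rising linearly from 0.2 to 0.5 rad along
# the fin, 1 Hz oscillation, and one full wavelength spread over the 16 rays.
amps = [0.2 + 0.3 * i / (N - 1) for i in range(N)]
lag = 2 * math.pi / N
z = simulate(amps, freq_hz=1.0, lag=lag)
fin_ray_angles = [zi.real for zi in z]  # real part used as the commanded angle
```

In this sketch, changing `amps` or `freq_hz` reshapes the travelling wave without touching the oscillator equations, which is the kind of configurability the abstract refers to. The gains `k` and `eps` govern how quickly the chain settles onto the new pattern; the article tunes the convergence rate with reinforcement learning, which is not reproduced here.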



© 2022 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)