Published in: International Journal of Computer Assisted Radiology and Surgery 7/2020

Open Access 01-07-2020 | Original Article

Investigating exploration for deep reinforcement learning of concentric tube robot control

Authors: Keshav Iyengar, George Dwyer, Danail Stoyanov


Abstract

Purpose

Concentric tube robots are composed of multiple concentric, pre-curved, super-elastic, telescopic tubes. Their compliance and small diameter make them suitable for interventions that must be minimally invasive, such as fetal surgery. Combinations of tube rotation and extension alter the robot’s shape, but the inverse kinematics are difficult to model because of friction, other tube interactions and manufacturing imperfections. We propose a model-free reinforcement learning approach that forms the inverse kinematics solution and directly yields a control policy.

Method

Three exploration strategies for deep deterministic policy gradient with hindsight experience replay are evaluated for concentric tube robots in simulation environments. The aim is to overcome the joint-to-Cartesian sampling bias and to scale with the number of tubes. To compare strategies, the trained policy network is evaluated on selected Cartesian goals and the associated errors are analyzed. The learned control policy is demonstrated on trajectory-following tasks.
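One of the strategies compared is Gaussian action noise with the extension and rotation joints perturbed separately, since the two joint types operate on very different scales (millimetres vs. radians). The sketch below illustrates that idea; the function name, action layout and sigma values are illustrative assumptions, not the paper's implementation or settings.

```python
import numpy as np

def gaussian_exploration(action, n_tubes, sigma_ext=0.001, sigma_rot=0.05):
    """Add zero-mean Gaussian exploration noise to a joint-space action.

    Assumed action layout: [ext_1..ext_n, rot_1..rot_n].
    Extension and rotation joints get separate noise scales, reflecting
    the separation of joint types described in the Method section.
    """
    noisy = np.array(action, dtype=float)
    # Perturb extensions (metres) and rotations (radians) independently.
    noisy[:n_tubes] += np.random.normal(0.0, sigma_ext, n_tubes)
    noisy[n_tubes:] += np.random.normal(0.0, sigma_rot, n_tubes)
    return noisy
```

A single shared sigma would either swamp the extension joints or barely move the rotation joints, which is one way the sampling bias between joint and Cartesian space arises.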

Results

Separating the extension and rotation joints for Gaussian exploration is required to overcome the Cartesian sampling bias. Parameter noise and Ornstein–Uhlenbeck noise were found to be the best strategies, with less than 1 mm error in all simulation environments. With the policy learned under the optimal exploration strategy, various trajectories can be followed at high joint extension values. In evaluation, our inverse kinematics solver achieves 0.44 mm extension and \(0.3^{\circ }\) rotation error.

Conclusion

We demonstrate the feasibility of effective model-free control for concentric tube robots. Using the control policy directly, arbitrary trajectories can be followed. This is an important step towards overcoming the challenge of concentric tube robot control for clinical use in minimally invasive interventions.
Metadata
Title
Investigating exploration for deep reinforcement learning of concentric tube robot control
Authors
Keshav Iyengar
George Dwyer
Danail Stoyanov
Publication date
01-07-2020
Publisher
Springer International Publishing
Published in
International Journal of Computer Assisted Radiology and Surgery / Issue 7/2020
Print ISSN: 1861-6410
Electronic ISSN: 1861-6429
DOI
https://doi.org/10.1007/s11548-020-02194-z
