Published in: International Journal of Computer Assisted Radiology and Surgery 7/2020

Open Access 01-07-2020 | Original Article

Investigating exploration for deep reinforcement learning of concentric tube robot control

Authors: Keshav Iyengar, George Dwyer, Danail Stoyanov


Abstract

Purpose

Concentric tube robots are composed of multiple concentric, pre-curved, super-elastic, telescopic tubes. Their compliance and small diameter make them suitable for interventions that must be minimally invasive, such as fetal surgery. Combinations of tube rotation and extension alter the robot’s shape, but the inverse kinematics are difficult to model because of friction, other tube interactions and manufacturing imperfections. We propose a model-free reinforcement learning approach that forms the inverse kinematics solution and directly yields a control policy.

Method

Three exploration strategies for deep deterministic policy gradient with hindsight experience replay are evaluated for concentric tube robots in simulation environments. The aim is to overcome the joint-to-Cartesian sampling bias and to scale with the number of tubes. To compare strategies, the trained policy network is evaluated on selected Cartesian goals and the associated errors are analyzed. The learned control policy is demonstrated on trajectory-following tasks.
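One of the strategies compared is Gaussian action noise with the extension and rotation joints perturbed separately, since the two joint types operate on very different scales (millimetres vs. radians). The sketch below illustrates that idea; the function name, action layout and sigma values are illustrative assumptions, not the paper's implementation or settings.

```python
import numpy as np

def gaussian_exploration(action, n_tubes, sigma_ext=0.001, sigma_rot=0.05):
    """Add zero-mean Gaussian exploration noise to a joint-space action.

    Assumed action layout: [ext_1..ext_n, rot_1..rot_n].
    Extension and rotation joints get separate noise scales, reflecting
    the separation of joint types described in the Method section.
    """
    noisy = np.array(action, dtype=float)
    # Perturb extensions (metres) and rotations (radians) independently.
    noisy[:n_tubes] += np.random.normal(0.0, sigma_ext, n_tubes)
    noisy[n_tubes:] += np.random.normal(0.0, sigma_rot, n_tubes)
    return noisy
```

A single shared sigma would either swamp the extension joints or barely move the rotation joints, which is one way the sampling bias between joint and Cartesian space arises.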

Results

Separating the extension and rotation joints for Gaussian exploration is required to overcome the Cartesian sampling bias. Parameter noise and Ornstein–Uhlenbeck noise were found to be the best strategies, with less than 1 mm error in all simulation environments. With the policy learned under the optimal exploration strategy, various trajectories can be followed at high joint extension values. In evaluation, our inverse kinematics solver achieves 0.44 mm extension and \(0.3^{\circ }\) rotation error.

Conclusion

We demonstrate the feasibility of effective model-free control for concentric tube robots. Using the control policy directly, arbitrary trajectories can be followed. This is an important step towards overcoming the challenge of concentric tube robot control for clinical use in minimally invasive interventions.
Metadata
Title
Investigating exploration for deep reinforcement learning of concentric tube robot control
Authors
Keshav Iyengar
George Dwyer
Danail Stoyanov
Publication date
01-07-2020
Publisher
Springer International Publishing
Published in
International Journal of Computer Assisted Radiology and Surgery / Issue 7/2020
Print ISSN: 1861-6410
Electronic ISSN: 1861-6429
DOI
https://doi.org/10.1007/s11548-020-02194-z
