BronchoCopilot: Towards Autonomous Robotic Bronchoscopy via Multimodal Reinforcement Learning
CoRR(2024)
Abstract
Bronchoscopy plays a significant role in the early diagnosis and treatment of
lung diseases. The procedure requires physicians to maneuver a flexible
endoscope to reach distal lesions, and examining the airways of the upper lung
lobe in particular demands substantial expertise. With the
development of artificial intelligence and robotics, reinforcement learning
(RL) methods have been applied to the manipulation of interventional surgical
robots. However, unlike human physicians who utilize multimodal information,
most of the current RL methods rely on a single modality, limiting their
performance. In this paper, we propose BronchoCopilot, a multimodal RL agent
designed to acquire manipulation skills for autonomous bronchoscopy.
BronchoCopilot specifically integrates images from the bronchoscope camera and
estimated robot poses, aiming for a higher success rate in challenging
airway environments. We employ auxiliary reconstruction tasks to compress
multimodal data and utilize attention mechanisms to achieve an efficient latent
representation of this data, serving as input for the RL module. This framework
adopts a stepwise training and fine-tuning approach to mitigate training
difficulty. Our evaluation in a realistic simulation environment
reveals that BronchoCopilot, by effectively harnessing multimodal information,
attains a success rate of approximately 90% in fifth-generation airways with
consistent movements. Additionally, it demonstrates a robust capacity to adapt
to diverse cases.
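
The attention-based fusion of image and pose latents mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the architecture, so the single-query scaled dot-product attention, the function name `fuse_latents`, the latent dimension of 4, and all numeric values below are assumptions chosen only for illustration. In a real system the latents would presumably come from encoders trained with the auxiliary reconstruction tasks, and the query would be learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_latents(query, tokens):
    """Fuse modality latents with single-query scaled dot-product attention.

    query  -- hypothetical query vector of length d (learned in practice)
    tokens -- list of modality latents, each of length d
    Returns the fused latent (length d) and the attention weights.
    """
    d = len(query)
    # One score per modality token: (query . token) / sqrt(d)
    scores = [
        sum(q * t for q, t in zip(query, tok)) / math.sqrt(d)
        for tok in tokens
    ]
    weights = softmax(scores)
    # Fused latent is the attention-weighted sum of the tokens
    fused = [
        sum(w * tok[i] for w, tok in zip(weights, tokens))
        for i in range(d)
    ]
    return fused, weights


# Illustrative values only; real latents would come from trained encoders.
image_latent = [0.5, -1.0, 0.25, 2.0]  # assumed output of an image encoder
pose_latent = [1.0, 0.0, -0.5, 0.5]    # assumed output of a pose encoder
query = [0.1, 0.2, 0.3, 0.4]           # assumed (normally learned) query

fused, weights = fuse_latents(query, [image_latent, pose_latent])
```

Under these assumptions, `fused` is the kind of compact multimodal representation that would serve as the observation passed to the RL policy, and `weights` indicates how strongly each modality contributes at a given step.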