Self-Supervised Learning for Interventional Image Analytics: Towards Robust Device Trackers
arXiv (2024)
Abstract
Accurate detection and tracking of devices such as guiding catheters in
live X-ray image acquisitions is an essential prerequisite for endovascular
cardiac interventions. This information is leveraged for procedural guidance,
e.g., directing stent placements. To ensure procedural safety and efficacy,
there is a need for high robustness, i.e., no failures during tracking. To achieve
that, one needs to efficiently tackle challenges such as device obscuration
by contrast agent or other external devices or wires, changes in field-of-view
or acquisition angle, as well as the continuous movement due to cardiac and
respiratory motion. To overcome the aforementioned challenges, we propose a
novel approach to learn spatio-temporal features from a very large data cohort
of over 16 million interventional X-ray frames using self-supervision for image
sequence data. Our approach is based on a masked image modeling technique that
leverages frame-interpolation-based reconstruction to learn fine inter-frame
temporal correspondences. The features encoded in the resulting model are
fine-tuned downstream. Our approach achieves state-of-the-art performance and,
in particular, robustness compared to highly optimized reference solutions (that
use multi-stage feature fusion, multi-task learning, and flow regularization). The
experiments show that our method achieves 66.31% [...]
error against reference solutions (23.20% [...]),
achieving a success score of 97.95% [...]
frames-per-second (on GPU). The results encourage the use of our approach in
various other tasks within interventional image analytics that require
effective understanding of spatio-temporal semantics.