The effects of different levels of realism on the training of CNNs with only synthetic images for the semantic segmentation of robotic instruments in a head phantom

INTERNATIONAL JOURNAL OF COMPUTER ASSISTED RADIOLOGY AND SURGERY (2020)

Abstract
Purpose
The manual generation of training data for the semantic segmentation of medical images using deep neural networks is a time-consuming and error-prone task. In this paper, we investigate the effect of different levels of realism on the training of deep neural networks for semantic segmentation of robotic instruments. An interactive virtual-reality environment was developed to generate synthetic images for robot-aided endoscopic surgery. In contrast with earlier works, we use physically based rendering for increased realism.

Methods
Using a virtual-reality simulator that replicates our robotic setup, three synthetic image databases with an increasing level of realism were generated: flat, basic, and realistic (using physically based rendering). Each of those databases was used to train 20 instances of a UNet-based semantic-segmentation deep-learning model. The networks, trained with only synthetic images, were evaluated on the segmentation of 160 endoscopic images of a phantom. The networks were compared using the Dwass–Steel–Critchlow–Fligner nonparametric test.

Results
Our results show that increasing the level of realism increased the mean intersection-over-union (mIoU) of the networks on endoscopic images of a phantom (p < 0.01). The median mIoU values were 0.235 for the flat dataset, 0.458 for the basic, and 0.729 for the realistic. All the networks trained with synthetic images outperformed naive classifiers. Moreover, in an ablation study, we show that the mIoU obtained with physically based rendering is superior (p < 0.01) to that obtained with texture mapping of the instrument (0.606), the background (0.685), and the background and instruments combined (0.672).

Conclusions
Using physically based rendering to generate synthetic images is an effective approach to improve the training of neural networks for the semantic segmentation of surgical instruments in endoscopic images. Our results show that this strategy can be an essential step in the broad applicability of deep neural networks in semantic segmentation tasks and help bridge the domain gap in machine learning.
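
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a mean intersection-over-union can be computed for one predicted mask against one ground-truth mask. It is an illustration only: the abstract does not state how the authors averaged over classes or images, so the choice to average over the classes present in either mask, the function name `mean_iou`, and the toy masks are all assumptions.

```python
def mean_iou(pred, target, num_classes):
    """Average the per-class IoU over classes that occur in pred or target.

    pred and target are 2D lists of integer class labels with the same shape.
    Assumption: classes absent from both masks are skipped, not counted as 0 or 1.
    """
    pred_flat = [p for row in pred for p in row]
    target_flat = [t for row in target for t in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("pred and target must have the same number of pixels")

    ious = []
    for c in range(num_classes):
        pairs = list(zip(pred_flat, target_flat))
        intersection = sum(1 for p, t in pairs if p == c and t == c)
        union = sum(1 for p, t in pairs if p == c or t == c)
        if union == 0:
            continue  # class c appears in neither mask
        ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Hypothetical toy masks: 0 = background, 1 = instrument.
target = [[0, 0, 1],
          [0, 1, 1]]
pred = [[0, 1, 1],
        [0, 1, 1]]
# A naive baseline that labels every pixel as background.
naive = [[0, 0, 0],
         [0, 0, 0]]

score_pred = mean_iou(pred, target, num_classes=2)
score_naive = mean_iou(naive, target, num_classes=2)
```

In this toy case the naive all-background mask scores lower than the prediction because its instrument-class IoU is zero, which is the kind of comparison the abstract refers to when it says the trained networks outperformed naive classifiers.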
Keywords
Deep learning, Semantic segmentation, Photorealistic rendering