CDText: Scene text detector based on context-aware deformable transformer.

Pattern Recognition Letters (2023)

Abstract
Scene text detection aims to precisely locate text regions in natural scenes. However, existing methods still struggle to detect arbitrarily shaped text because of their limited feature representation capability. To alleviate this problem, we propose CDText, a scene text detector built on a context-aware deformable transformer. Specifically, CDText first extracts features with several convolution kernel designs, whose receptive fields of different sizes support multi-scale feature perception and fusion. Meanwhile, a multi-head self-attention mechanism strengthens CDText's global reasoning ability, enriching the feature maps with context information by extracting the implicit relationships between multi-scale text features. Moreover, CDText includes a segmentation head that segments text instances of arbitrary shape from rectangular detection boxes. Experiments show that CDText outperforms the compared methods in detection accuracy, achieving F-scores of 92.7, 81.9, and 82.9 on the ICDAR2013, Total-Text, and CTW-1500 datasets, respectively.
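As a rough illustration of the global reasoning step described in the abstract, the sketch below applies standard multi-head self-attention to tokens flattened from several feature scales. It is not the authors' implementation: it uses plain (non-deformable) attention, and the function names, the three scale sizes, the model width of 8, and the random projection weights are all assumptions made for the example.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(tokens, w_q, w_k, w_v, w_o, num_heads):
    """Standard scaled dot-product self-attention over a token sequence.

    tokens: (n, d) array; w_q, w_k, w_v, w_o: (d, d) projection matrices.
    Returns the attended tokens (n, d) and the attention weights (heads, n, n).
    """
    n, d = tokens.shape
    head_dim = d // num_heads

    def split_heads(x):
        # (n, d) -> (heads, n, head_dim)
        return x.reshape(n, num_heads, head_dim).transpose(1, 0, 2)

    q = split_heads(tokens @ w_q)
    k = split_heads(tokens @ w_k)
    v = split_heads(tokens @ w_v)

    # Every token attends to every other token, across all scales.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)
    weights = softmax(scores, axis=-1)
    context = weights @ v  # (heads, n, head_dim)

    # Merge the heads back into a single (n, d) representation.
    merged = context.transpose(1, 0, 2).reshape(n, d)
    return merged @ w_o, weights


rng = np.random.default_rng(0)
d_model, num_heads = 8, 2  # hypothetical sizes, chosen only for the demo

# Hypothetical multi-scale features: 16, 4, and 1 spatial positions,
# each already flattened to (positions, d_model).
scales = [rng.standard_normal((positions, d_model)) for positions in (16, 4, 1)]
tokens = np.concatenate(scales, axis=0)  # (21, 8)

w_q, w_k, w_v, w_o = (
    rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(4)
)
out, attn = multi_head_self_attention(tokens, w_q, w_k, w_v, w_o, num_heads)
print(out.shape, attn.shape)
```

Concatenating the scales into one token sequence lets each position attend to positions at every other scale, which is one simple way to model relationships between multi-scale features. A deformable variant would instead attend to a small set of sampled locations per query.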
Keywords
Disentangled representation learning, Group-disentangled feature representation, Thoracic pathologic prediction