CDText: Scene text detector based on context-aware deformable transformer.

Yirui Wu, Qiran Kong, Yong Lai, Fabio Narducci, Shaohua Wan

Pattern Recognition Letters (2023)

Abstract

Scene text detection aims to precisely locate text regions in natural scenes. However, existing methods still struggle to detect arbitrary-shaped text because of their limited feature representation capability. To alleviate this problem, we propose CDText, a scene text detector built on a context-aware deformable transformer. Specifically, CDText first adopts different convolution kernel designs for feature extraction, providing receptive fields of different sizes for multi-scale feature perception and fusion. Meanwhile, a multi-head self-attention mechanism strengthens the global reasoning ability of CDText, enriching feature maps with context information by extracting implicit relationships between multi-scale text features. Moreover, CDText uses a segmentation head to segment text instances of arbitrary shape from rectangular detection boxes. Experiments show that CDText outperforms the compared methods in detection accuracy, achieving F-scores of 92.7, 81.9, and 82.9 on the ICDAR2013, Total-Text, and CTW-1500 datasets, respectively.
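The multi-head self-attention step mentioned in the abstract can be illustrated with a minimal sketch. This is not the CDText implementation: it assumes identity query/key/value projections (no learned weights), treats each input row as one feature token (for example, one token per scale), and uses the hypothetical helper names `attention_head` and `multi_head_self_attention`. It only shows how each token is re-expressed as a context-weighted mixture of all tokens, head by head.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(tokens):
    """Scaled dot-product self-attention for one head.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections), so no learned parameters are involved.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        # Output is the attention-weighted average of all value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out


def multi_head_self_attention(tokens, num_heads):
    """Split channels into heads, attend per head, concatenate the results."""
    d = len(tokens[0])
    if d % num_heads != 0:
        raise ValueError("channel count must be divisible by num_heads")
    head_dim = d // num_heads
    head_outputs = []
    for h in range(num_heads):
        # Each head sees only its own slice of the channels.
        sliced = [t[h * head_dim:(h + 1) * head_dim] for t in tokens]
        head_outputs.append(attention_head(sliced))
    # Concatenate the per-head outputs back to the original width.
    return [
        [x for h in range(num_heads) for x in head_outputs[h][i]]
        for i in range(len(tokens))
    ]
```

Calling `multi_head_self_attention(tokens, 2)` on three 4-channel tokens returns three 4-channel tokens, each a convex combination of the inputs within its head. A real detector would add learned projections, positional information, and (for a deformable transformer) sparse sampling of keys, none of which are modeled here.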

Keywords

Disentangled representation learning, Group-disentangled feature representation, Thoracic pathologic prediction
