DQ-DETR: Dynamic Queries Enhanced Detection Transformer for Arbitrary Shape Text Detection.

ICDAR (2) (2023)

Abstract
We propose a new Transformer-based text detection model, Dynamic Queries enhanced DEtection TRansformer (DQ-DETR), which detects arbitrary-shape text instances in images with high localization accuracy. Previous Transformer-based methods take all control points on the boundaries or center-lines of all text instances as the queries of every Transformer decoder layer. In contrast, DQ-DETR extends the query set gradually from one decoder layer to the next, achieving higher localization accuracy by detecting the control points of each text instance progressively. Specifically, after refining the positions of the existing control points from the preceding decoder layer, each decoder layer appends a new point on each side of each center-line segment; these new points are input to the next decoder layer as additional queries for detecting new control points. Because the offsets from the new control points to the added reference points are small, their positions can be predicted more precisely, leading to higher center-line detection accuracy. As a result, DQ-DETR achieves state-of-the-art performance on five public text detection benchmarks: MLT2017, Total-Text, CTW1500, ArT and DAST1500.
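The progressive growth of the query set described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes one possible reading of "a new point on each side of each center-line segment", namely that one reference point is added beyond each end of the current center-line, and it places that point by linear extrapolation, which is an illustrative choice the abstract does not specify. The function names `extend_center_line` and `grow_queries` are hypothetical, and the refinement performed by a real decoder layer is omitted.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def extend_center_line(points: List[Point]) -> List[Point]:
    """Add one reference point beyond each end of a center-line.

    Assumption: the new reference points are placed by linear
    extrapolation from the two outermost control points at each end.
    """
    if len(points) < 2:
        raise ValueError("need at least two control points")
    (x0, y0), (x1, y1) = points[0], points[1]
    (xm, ym), (xn, yn) = points[-2], points[-1]
    head = (2 * x0 - x1, 2 * y0 - y1)  # extrapolated before the first point
    tail = (2 * xn - xm, 2 * yn - ym)  # extrapolated after the last point
    return [head] + list(points) + [tail]


def grow_queries(initial: List[Point], num_layers: int) -> Tuple[List[Point], List[int]]:
    """Track how the number of point queries grows across decoder layers.

    A real decoder layer would first refine every point in `pts`
    (predicting a small offset per query); that step is omitted here.
    """
    pts = list(initial)
    sizes = [len(pts)]
    for _ in range(num_layers):
        pts = extend_center_line(pts)
        sizes.append(len(pts))
    return pts, sizes
```

Under these assumptions, each text instance gains two point queries per decoder layer, and every new query starts close to the point it is expected to predict, which matches the abstract's argument that small offsets can be regressed more precisely.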
Keywords
detection, text, shape, DQ-DETR