HisDoc R-CNN: Robust Chinese Historical Document Text Line Detection with Dynamic Rotational Proposal Network and Iterative Attention Head.

ICDAR (1)(2023)

Abstract
Text line detection is an essential task in historical document analysis systems. Although many existing text detection methods achieve remarkable performance on scene text datasets, they perform poorly on complex historical documents, where text lines are dense, multi-scale, and multi-oriented. Effective text line detection for historical documents is therefore both important and challenging. In this paper, we propose a Dynamic Rotational Proposal Network (DRPN) and an Iterative Attention Head (IAH), which are incorporated into Mask R-CNN to detect text lines in historical documents. The DRPN dynamically generates horizontal or rotational proposals, which makes the model more robust to multi-oriented text lines and alleviates the multi-scale problem in historical documents. The IAH integrates a multi-dimensional attention mechanism that better learns the features of dense historical document text lines, while its iterative mechanism improves detection accuracy and reduces the number of model parameters. Our HisDoc R-CNN achieves state-of-the-art performance on several historical document benchmarks, including CHDAC (the IACC competition dataset, http://iacc.pazhoulab-huangpu.com/shows/108/1.html ), MTHv2, and ICDAR 2019 HDRC CHINESE, demonstrating the robustness of our method. Furthermore, we present tricks tailored to historical document scenarios, which may provide useful insights for practical applications.
Keywords
text, line, attention, r-cnn