A Benchmark of Parsing Vietnamese Publications

IEEE Access (2022)

Abstract
In recent decades, digital transformation has received growing attention worldwide, accompanied by an explosion of digitized document data. In this paper, we address the problem of parsing publications, in particular Vietnamese publications. Vietnamese publications are known for their highly varied, diverse layouts, and some characters are visually ambiguous because of accent marks and derivative characters, which poses many challenges. To this end, we collect the UIT-DODV-Ext dataset: a challenging Vietnamese document image dataset of scientific papers and textbooks with 5,000 fully annotated images. We introduce a general framework for parsing Vietnamese publications that contains two components: page object detection and caption recognition. We further conduct an extensive benchmark of various state-of-the-art object detection and text recognition methods. Finally, we present a hybrid parser that achieves the top place in the benchmark. Extensive experiments on the UIT-DODV-Ext dataset provide a comprehensive evaluation and insightful analysis.
Keywords
Object detection, Text recognition, Transformers, Benchmark testing, Task analysis, Image recognition, Portable document format, Page object detection, caption recognition, UIT-DODV-Ext