SCAN: Sequence-Character Aware Network for Text Recognition

VISAPP: Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Vol. 5: VISAPP (2021)

Abstract
Text recognition remains a challenging problem in the context of reading text in natural scenes. Given the sequential nature of text, the problem is usually posed as sequence prediction from a whole-word image. Alternatively, it can be posed as character prediction, an approach that is typically more robust to challenging word shapes. To attain the best of both approaches, we propose the Sequence-Character Aware Network (SCAN). SCAN first locates and recognizes the characters, and then generates the word using a sequence-based approach. It comprises two modules: a semantic-segmentation-based character prediction module and an encoder-decoder network for word generation. Training proceeds in two stages. In the first stage, we adopt a multi-task training technique with both character-level and word-level losses and trainable loss weighting. In the second stage, the character-level loss is removed, enabling the use of data with only word-level annotations. Experiments on several datasets covering both regular and irregular text show state-of-the-art performance for the proposed approach. They also show that the approach is robust to noisy word detection.
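The two-stage loss schedule described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the loss weights are parameterized, so this sketch assumes a learned log-variance per task (a common uncertainty-style weighting); the names `stage_loss`, `update_log_var`, `log_var_char`, and `log_var_word` are hypothetical and not taken from the paper. It also assumes that the second stage uses the plain, unweighted word-level loss.

```python
import math


def stage_loss(char_loss, word_loss, log_var_char, log_var_word, stage):
    """Combine character-level and word-level losses for one training stage.

    Assumption: each task weight is exp(-log_var), with log_var added as a
    regularizer so the weights cannot collapse to zero.
    """
    if stage == 1:
        # Stage 1: multi-task objective with trainable weights.
        return (math.exp(-log_var_char) * char_loss + log_var_char
                + math.exp(-log_var_word) * word_loss + log_var_word)
    if stage == 2:
        # Stage 2: character-level loss removed, so word-only data can be used.
        # Assumption: the remaining word-level loss is left unweighted.
        return word_loss
    raise ValueError("stage must be 1 or 2")


def update_log_var(loss, log_var, lr=0.1):
    """One gradient-descent step on a single log-variance parameter.

    The derivative of exp(-s) * loss + s with respect to s is
    1 - exp(-s) * loss.
    """
    grad = 1.0 - math.exp(-log_var) * loss
    return log_var - lr * grad
```

Under this assumed parameterization, repeatedly calling `update_log_var` with a fixed task loss moves the log-variance toward the logarithm of that loss, so tasks with larger losses receive smaller weights. In a real training loop the log-variances would be updated jointly with the network parameters by the optimizer.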
Keywords
Text Recognition, Multi-task Learning