Open-Vocabulary Scene Text Recognition via Pseudo-Image Labeling and Margin Loss
CoRR (2024)
Abstract
Scene text recognition is an important and challenging task in computer
vision. However, most prior works focus on recognizing pre-defined words, while
there are various out-of-vocabulary (OOV) words in real-world applications.
In this paper, we propose a novel open-vocabulary text recognition framework,
Pseudo-OCR, to recognize OOV words. The key challenge in this task is the lack
of OOV training data. To solve this problem, we first propose a pseudo label
generation module that leverages character detection and image inpainting to
produce substantial pseudo OOV training data from real-world images. Unlike
previous synthetic data, our pseudo OOV data contains real characters and
backgrounds to simulate real-world applications. Secondly, to reduce noise in
pseudo data, we present a semantic checking mechanism to filter semantically
meaningful data. Thirdly, we introduce a quality-aware margin loss to boost the
training with pseudo data. Our loss includes a margin-based part to enhance the
classification ability, and a quality-aware part to penalize low-quality
samples in both real and pseudo data.
Extensive experiments demonstrate that our approach outperforms the
state-of-the-art on eight datasets and ranks first in the ICDAR2022
challenge.
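
The abstract does not give the loss formula, so the following is only a minimal sketch of one way a margin-based term and a quality-aware term could be combined. Everything in it is an assumption made for illustration: the additive margin on the target-class logit, the per-sample quality scores in [0, 1], the reading of "penalize low-quality samples" as down-weighting them, and the function names `margin_cross_entropy` and `quality_weighted_margin_loss`. It is not the formulation used in Pseudo-OCR.

```python
import math


def margin_cross_entropy(logits, target, margin=0.3):
    """Cross-entropy after subtracting `margin` from the target-class logit.

    Assumed additive-margin form: the target class must beat the other
    classes by at least `margin` before the loss becomes small.
    """
    adjusted = [z - margin if i == target else z for i, z in enumerate(logits)]
    peak = max(adjusted)  # subtract the max for a numerically stable log-sum-exp
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in adjusted))
    return log_norm - adjusted[target]


def quality_weighted_margin_loss(batch_logits, targets, qualities, margin=0.3):
    """Quality-weighted mean of per-sample margin losses.

    `qualities` holds hypothetical per-sample scores in [0, 1]; a low score
    shrinks that sample's contribution, whether it is real or pseudo data.
    """
    total = 0.0
    weight_sum = 0.0
    for logits, target, quality in zip(batch_logits, targets, qualities):
        total += quality * margin_cross_entropy(logits, target, margin)
        weight_sum += quality
    # Return 0.0 for an empty batch or when every sample has zero weight.
    return total / weight_sum if weight_sum > 0 else 0.0
```

With `margin=0.0` and all quality scores equal, this reduces to the ordinary mean cross-entropy, which makes it easy to compare against a baseline before enabling either term.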