A Light Transformer-Based Architecture for Handwritten Text Recognition

Document Analysis Systems, DAS 2022 (2022)

Abstract
Transformer models have been showing ground-breaking results in the domain of natural language processing. More recently, they have started to gain interest in many other fields, such as computer vision. Traditional Transformer models typically require a significant amount of training data to achieve satisfactory results. However, in the domain of handwritten text recognition, annotated data acquisition remains costly, resulting in small datasets compared to those commonly used to train Transformer-based models. Hence, training Transformer models able to transcribe handwritten text from images remains challenging. We propose a light encoder-decoder Transformer-based architecture for handwritten text recognition, containing a small number of parameters compared to traditional Transformer architectures. We train our architecture using a hybrid loss, combining the well-known connectionist temporal classification (CTC) loss with the cross-entropy loss. Experiments are conducted on the well-known IAM dataset, with and without the use of additional synthetic data. We show that our network reaches state-of-the-art results in both cases, compared with other, larger Transformer-based models.
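The abstract describes a hybrid training objective that weights a CTC loss against a cross-entropy loss. The following is a minimal PyTorch sketch of such a combination; the class name, the weighting factor `lam`, and the wiring (CTC on an encoder head, cross-entropy on the decoder output) are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class HybridCTCCrossEntropyLoss(nn.Module):
    """Sketch of a hybrid loss: a weighted sum of a CTC loss on the
    encoder output and a cross-entropy loss on the decoder output.
    The weight `lam` is a hypothetical hyperparameter."""

    def __init__(self, blank_id: int, pad_id: int, lam: float = 0.5):
        super().__init__()
        self.ctc = nn.CTCLoss(blank=blank_id, zero_infinity=True)
        self.ce = nn.CrossEntropyLoss(ignore_index=pad_id)
        self.lam = lam

    def forward(self, ctc_log_probs, dec_logits, targets,
                input_lengths, target_lengths):
        # ctc_log_probs: (T, N, C) log-probabilities from the encoder head
        # dec_logits:    (N, L, C) logits from the autoregressive decoder
        # targets:       (N, L) padded target character indices
        ctc_loss = self.ctc(ctc_log_probs, targets,
                            input_lengths, target_lengths)
        ce_loss = self.ce(dec_logits.reshape(-1, dec_logits.size(-1)),
                          targets.reshape(-1))
        # Convex combination of the two objectives.
        return self.lam * ctc_loss + (1.0 - self.lam) * ce_loss
```

In setups like this, the CTC term typically supervises the visual encoder with alignment-free frame-level targets, while the cross-entropy term trains the autoregressive decoder; how the two are balanced in the paper itself is not specified in the abstract.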
Keywords
Light network, Hybrid loss, Transformer, Handwritten text recognition, Neural networks