HWNet v3: a joint embedding framework for recognition and retrieval of handwritten text

INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION(2023)

引用 0|浏览9
暂无评分
摘要
Learning an efficient label embedding framework for word images enables effective word spotting of handwritten documents. In this work, we propose different schemes of label embedding for word images using deep neural architectures and their representations. We refer to our first scheme as the two-stage label embedding technique which projects both word images and their corresponding textual strings into a common subspace. We further introduce an end-to-end label embedding scheme using deep neural architecture which simplifies the embedding process and reports state-of-the-art performance for the task of word spotting and recognition. We also validate the role of synthetic data as a complementary modality to further enhance the embedding process. On the challenging IAM handwritten dataset, we report an mAP of 0.9753 for query-by-string-based word spotting, while under lexicon-based word recognition, our proposed method reports 1.67 and 3.62 character and word error rates, respectively. We also present the detailed ablation study on various variants of our end-to-end embedding architecture and perform analysis under varying embedding sizes. We further validate the embedding scheme on degraded printed document datasets from both Latin and Indic scripts.
更多
查看译文
关键词
Word image embedding,Word spotting,Word recognition,Label embedding
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要