Exploiting Existing Modern Transcripts for Historical Handwritten Text Recognition

2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR), 2016

Abstract
Existing transcripts of historical manuscripts are a valuable resource for training models for automatic recognition, assisted transcription, and/or indexing of the untranscribed parts of these collections. However, these transcripts generally exhibit two main problems that limit their usefulness: a) the text of the transcripts is seldom aligned with the manuscript lines, and b) the text often deviates significantly from what can be seen in the manuscript, because the writing style has been modernized, abbreviations have been expanded, or both. This work analyzes these problems and discusses possible solutions for minimizing the human effort needed to adapt existing transcripts so that they become usable. The empirical results presented show the large performance gain that can be obtained by adequately adapting the transcripts, thus motivating future development of the proposed solutions.
Keywords
Handwritten Text Recognition, Historical Manuscripts, Modernized Transcripts, Transcript-image Alignment, Diplomatization