Multimodal interactive transcription of text images

Alejandro H. Toselli, Verónica Romero, Moisés Pastor, Enrique Vidal

Pattern Recognition (2010)

Abstract

To date, automatic handwriting recognition systems are far from perfect, and heavy human intervention is often required to check and correct their results. This "post-editing" process is both inefficient and uncomfortable for the user. An example is the transcription of historic documents: state-of-the-art handwritten text recognition technology is not suitable for performing this task automatically, and expensive work by paleography experts is needed to achieve correct transcriptions. As an alternative to fully manual transcription and post-editing, a multimodal interactive approach is proposed here, in which user feedback is provided by means of touchscreen pen strokes and/or more traditional keyboard and mouse operation. User feedback directly improves system accuracy, while multimodality increases system ergonomics and user acceptability. Multimodal interaction is approached in such a way that both the main and the feedback data streams help each other to optimize overall performance and usability. Empirical tests on three cursive handwritten tasks suggest that, using this approach, considerable amounts of user effort can be saved with respect to both pure manual work and non-interactive post-editing processing.

Keywords

user acceptability, text image, manual transcription, expensive paleography expert work, post-editing processing, cursive handwritten task, multimodal interactive transcription, correct transcription, user effort, user feedback, automatic handwriting recognition system, feedback data stream, multimodal interaction, handwriting recognition, pattern recognition
