Localization of Pointed-At Word in Printed Documents via a Single Neural Network

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS(2022)

引用 0|浏览1
暂无评分
摘要
Most existing object detection methods and text detection methods are mainly designed to detect either text or objects. In some scenarios where the task is to find the target word pointed-at by an object, results of existing methods are far from satisfying. However, such scenarios happen often in human-computer interaction, when the computer needs to figure out which word the user is pointing at. Comparing with object detection, pointed-at word localization (PAWL) requires higher accuracy, especially in dense text scenarios. Moreover, in printed document, characters are much smaller than those in scene text detection datasets such as ICDAR-2013, ICDAR-2015 and ICPR-2018 etc. To address these problems, the authors propose a novel target word localization network (TWLN) to detect the pointed-at word in printed documents. In this work, a single deep neural network is trained to extract the features of markers and text sequentially. For each image, the location of the marker is predicted firstly, according to the predicted location, a smaller image is cropped from the original image and put into the same network, then the location of pointed-at word is predicted. To train and test the networks, an efficient approach is proposed to generate the dataset from PDF format documents by inserting markers pointing at the words in the documents, which avoids laborious labeling work. Experiments on the proposed dataset demonstrate that TWLN outperforms the compared object detection method and optical character recognition method on every category of targets, especially when the target is a single character that only occupies several pixels in the image. TWLN is also tested with real photographs, and the accuracy shows no significant differences, which proves the validity of the generating method to construct the dataset.
更多
查看译文
关键词
character localization, deep learning, object detection
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要