Tree-Structured Named Entity Recognition on OCR Data: Analysis, Processing and Results

LREC 2012 - Eighth International Conference on Language Resources and Evaluation, pp. 1266-1272, 2012.

Abstract:

In this paper we deal with named entity detection on data acquired via an OCR process from documents dating from 1890. The resulting corpus is very noisy. We perform an analysis to find possible strategies to overcome the errors introduced by the OCR process. We propose a three-step preprocessing procedure to clean the data and correct, at least i...
