Optical character recognition of Arabic printed text

Research and Development (2012)

Abstract
Optical character recognition (OCR) systems improve human-machine interaction and are widely used in areas such as editing and storing previously printed or handwritten documents. Much research has been done on the identification of Latin, Japanese, and Chinese characters, but comparatively little on Arabic recognition. The likely reasons are the limited IT activity in Arabic-speaking countries and the greater difficulty and complexity of identifying Arabic characters compared with other scripts; the cursive nature of Arabic text introduces further difficulties. In this paper, a technique is employed to segment printed Arabic text into separate characters and then to extract powerful features from each character for recognition. To recognize a character, its features are compared with the entries of a pre-prepared database. Although the database was prepared from characters written in the Times New Roman font, experimental results show relatively high accuracy when the method is tested on several sizes of several fonts besides Times New Roman.
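The final stage described above, comparing extracted features with a pre-prepared database, can be illustrated with a minimal sketch. The abstract does not say which features or which comparison rule the paper uses, so everything here is an assumption made for illustration: the zone-density features, the Euclidean nearest-match rule, and the names `zone_densities`, `recognize`, and the two template labels are all hypothetical. Segmentation of cursive text is not covered; the sketch assumes each character has already been isolated as a small binary image.

```python
from math import dist


def zone_densities(glyph, rows=2, cols=2):
    """Return the fraction of ink pixels in each cell of a rows x cols grid.

    `glyph` is a list of equal-length lists of 0/1 values (1 = ink).
    This feature choice is an assumption, not the paper's feature set.
    """
    height, width = len(glyph), len(glyph[0])
    features = []
    for r in range(rows):
        for c in range(cols):
            top, bottom = r * height // rows, (r + 1) * height // rows
            left, right = c * width // cols, (c + 1) * width // cols
            cell = [glyph[y][x]
                    for y in range(top, bottom)
                    for x in range(left, right)]
            features.append(sum(cell) / len(cell) if cell else 0.0)
    return features


def recognize(glyph, database):
    """Return the label whose stored feature vector is nearest (Euclidean)."""
    query = zone_densities(glyph)
    return min(database, key=lambda label: dist(query, database[label]))


# Hypothetical templates standing in for database entries.
vertical = [[1, 0, 0, 0],
            [1, 0, 0, 0],
            [1, 0, 0, 0],
            [1, 0, 0, 0]]
horizontal = [[0, 0, 0, 0],
              [0, 0, 0, 0],
              [0, 0, 0, 0],
              [1, 1, 1, 1]]
database = {
    "vertical_stroke": zone_densities(vertical),
    "horizontal_stroke": zone_densities(horizontal),
}

# A slightly noisy vertical stroke should still match its template.
noisy = [[1, 0, 0, 0],
         [1, 0, 0, 0],
         [1, 1, 0, 0],
         [1, 0, 0, 0]]
print(recognize(noisy, database))
```

Because the features are ink fractions rather than raw pixel counts, they do not depend directly on glyph size, which is one plausible reason a database built from a single font could still match other sizes; whether the paper relies on this property is not stated.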
Keywords
human computer interaction, natural language processing, optical character recognition, text analysis, Arabic character identification, Arabic character recognition, Arabic speaking countries, OCR system, Times New Roman font, human machine interaction improvement, printed Arabic text segmentation, Arabic characters, feature extraction, OCR, optical character recognition (OCR), recognition, segmentation