Classification of oriental and European scripts by using characteristic features

ICDAR-1(1997)

Abstract
Language differentiation usually relies on one of two types of technique: token matching or statistical analysis. In this paper we present a method that combines the analysis of several discriminating statistical features to differentiate between European and oriental language scripts. Applied to more than 23 languages, it has proved effective in classifying documents printed in these scripts.
Keywords
language differentiation, classifying document, image matching, different script, european scripts, characteristic features, statistical analysis, statistical feature, combined analysis, ocr, document classification, statistical features, feature extraction, image classification, token matching, optical character recognition, european script classification, document image processing, oriental language script, character sets, oriental script classification, indexing, natural languages, machine intelligence, testing, pattern recognition, image segmentation